Mail on 404

This may be pretty basic for some visitors but it's a cool trick nonetheless. We all know that websites are living, changing collections of documents. That means that every so often a page will be moved or deleted.

A problem with this is that the web is just that: a web of interlinked pages. If you move a page, you probably have control over all your site's internal links and can update them to reflect the change. However, there may be lots of external pages linking to the old location, and tracking them down can be a real pain.

The little snippet of PHP code below will send you an email every time a document is requested that is no longer there. This lets you track where the outdated links are, so you can take action. Just make a custom error page in normal HTML and put the PHP snippet at the top.

The code

<?php
// The Referer and User-Agent headers are optional, so fall back to an empty
// string when the client did not send them (avoids undefined index notices).
$HTTP_REFERER = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : "";
$HTTP_USER_AGENT = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : "";
$REMOTE_ADDR = $_SERVER['REMOTE_ADDR'];
$REQUEST_URI = $_SERVER['REQUEST_URI'];
$SERVER_NAME = $_SERVER['SERVER_NAME'];

$count = 0;
// don't send mail if we don't know the referring page
// it's probably a search engine or an infected Windows system requesting known NT server exploits
if ($HTTP_REFERER == "")
{
    $count++;
}
// debug comment, you can remove this line if you want:
echo "<!-- count: $count -->\n";
if ($count == 0)
{
    // see table 1 on http://www.php.net/manual/en/function.date.php for the meaning of these codes:
    $today = date("j F Y, G:i:s");

    $message = "Date and Time: $today\nRequest URL: http://$SERVER_NAME$REQUEST_URI\nReferring page: $HTTP_REFERER\n\nClient: $HTTP_USER_AGENT\nRemote IP: $REMOTE_ADDR\n\n";
    $message .= "This is an automated message, to reply is futile. This message is sent every time a non-existent document is requested and we know where the visitor came from.\n\nHave a nice day.";
    // additional mail headers are separated by CRLF
    mail("webmaster@$SERVER_NAME", "Error 404", $message, "From: webmaster@$SERVER_NAME\r\nReply-To: webmaster@$SERVER_NAME");
}
?>

Download the code: .tar.gzip .zip .sit

Internals

There's a little twist to this script. In testing an early version I found that there are a lot of search engine spiders out there that will continue to request removed pages, leading to loads of mail every time they visit. We have no direct control over them, so this e-mail isn't really useful. There are also a lot of infected Windows machines out there that regularly request strange files, as the following snippet from my error log shows.

xxx.xxx.xxx.xxx - - [05/Aug/2002:03:54:56 +0200] "GET /scripts/root.exe?/c+dir HTTP/1.0" 404 620 "-" "-"
xxx.xxx.xxx.xxx - - [05/Aug/2002:03:54:56 +0200] "GET /MSADC/root.exe?/c+dir HTTP/1.0" 404 620 "-" "-"
xxx.xxx.xxx.xxx - - [05/Aug/2002:03:54:56 +0200] "GET /c/winnt/system32/cmd.exe?/c+dir HTTP/1.0" 404 620 "-" "-"
xxx.xxx.xxx.xxx - - [05/Aug/2002:03:54:56 +0200] "GET /d/winnt/system32/cmd.exe?/c+dir HTTP/1.0" 404 620 "-" "-"
xxx.xxx.xxx.xxx - - [05/Aug/2002:03:54:56 +0200] "GET /scripts/..%255c../winnt/system32/cmd.exe?/c+dir HTTP/1.0" 404 620 "-" "-"

If we were to receive an e-mail every time some infected Windows box requested known exploits for a certain badly coded server platform out there <cough>microsoft</cough> we'd quickly be out of our minds.
Luckily, requests from spiders and from infected boxes have one thing in common (at the time of writing anyway): they don't include a referring page. So we can filter them out by checking whether the request includes a referring page. If it doesn't, we increment a counter and don't send the e-mail.
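
The referrer check described above can also be sketched as a small stand-alone function, which makes it easy to try without sending any mail. This is only an illustration: the name should_report_404 is hypothetical and not part of the script, and the function assumes it is handed an array shaped like PHP's $_SERVER.

```php
<?php
// Hypothetical helper (not part of the original script): returns true when
// a 404 is worth reporting by mail. $server is assumed to look like $_SERVER.
function should_report_404(array $server)
{
    // Clients that send no Referer header leave this key unset.
    $referer = isset($server['HTTP_REFERER']) ? trim($server['HTTP_REFERER']) : '';

    // Without a referring page there is no broken link to track down.
    return $referer !== '';
}

// A visitor following a stale link from another page: report it.
var_dump(should_report_404(array('HTTP_REFERER' => 'http://example.com/links.html')));

// A spider or worm probe that sends no Referer header: stay quiet.
var_dump(should_report_404(array('REQUEST_URI' => '/scripts/root.exe?/c+dir')));
```

In the error page itself you would call it as should_report_404($_SERVER) and only build and send the message when it returns true.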

How to redirect to your custom error page

Now this code would be quite useless unless we are actually able to call it. Luckily, most server platforms can redirect users to a friendly error page of our own making. More information about redirecting to custom error pages can be found on the 404 Research Lab site.

Just make sure the directive points to a PHP file and not a plain HTML file. If you can't figure out where and how to make your own custom error page come up, contact your ISP or hosting company; I really am busy enough.
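
As a rough sketch of what such a directive can look like, here is an example for Apache. It assumes your host lets you override settings in a .htaccess file and that the error page is saved as /404.php in the document root; that file name is only an example, and other servers use different mechanisms.

```apache
# .htaccess in the site's document root
# (assumes the server's AllowOverride setting permits FileInfo directives)
ErrorDocument 404 /404.php
```

Use a local path like this rather than a full http:// URL. With a full URL Apache sends the browser a redirect instead, so the script no longer sees the originally requested address.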
