Welcome! You can probably guess what three things this entry is about. When you already know about computers, the Internet and blogs, this is of course the first thing you’d feel a desperate need to Google for! Err…wait, Magic 2^16 Ball is telling me you just found this phrase in your server logs and got curious just WTF it is. And why all these people are searching for it (in fact, so many that a nonzero number of them are clicking through to your blog, which goes under a completely different name and may be only tangentially related to any of these topics, if at all), and yet this hotly sought-after site has somehow managed to fly under the radar of every major search engine.
Rude bot business cards
Computers internet blog
in my server logs
Many Web stats programs (including Analog, which I use) have this feature where they parse the referrer field for hits from major search engines, and list the top search queries people used to find your site. I’ve been seeing the phrase pop up in my blog’s stats semi-regularly lately, sometimes upwards of 25 hits a day. So tonight I got curious, pulled the logs and had a closer look.
64.22.110.34 - - [10/Oct/2007:05:34:16 -0700] "GET /?p=365 HTTP/1.1" 200 26307 "http://www.google.com/search?q=computers+internet+blog" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2)"
64.22.110.34 - - [10/Oct/2007:05:34:17 -0700] "POST /wp-comments-post.php HTTP/1.1" 200 84 "https://tim.cexx.org/?p=365" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2)"
[...]
75.126.132.23 - - [10/Oct/2007:13:54:26 -0700] "GET /?page_id=342 HTTP/1.1" 200 36704 "http://www.google.com/search?q=computers+internet+blog" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2)"
75.126.132.23 - - [10/Oct/2007:13:54:27 -0700] "POST /wp-comments-post.php HTTP/1.1" 200 84 "https://tim.cexx.org/?page_id=342" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2)"
[...]
Notice a pattern? These blog-seekers must not be happy that this isn’t the page they were looking for, because they sure seem eager to post a comment. In fact, they dive immediately for wp-comments-post.php within a second of page load, a superhuman feat if you ask me. (As I write this, WordPress cheerfully reports that it took 0.855 wall seconds on my shared server to generate one of those pages.)
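If you want to check your own logs for the same pattern, here is a minimal Python sketch. It assumes your server writes Apache's combined log format (like the lines above); the function name `find_instant_commenters` and the 2-second threshold are my own inventions for illustration, not anything Analog or WordPress provides.

```python
import re
from datetime import datetime, timedelta

# Combined log format: ip, identd, user, [timestamp], "request",
# status, bytes, "referrer", "user agent".
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "(?P<referrer>[^"]*)" "[^"]*"'
)


def find_instant_commenters(lines, max_gap=timedelta(seconds=2)):
    """Return (ip, page, referrer, gap_seconds) for each comment POST that
    follows a GET from the same IP within max_gap (hypothetical threshold)."""
    last_get = {}  # ip -> (timestamp, path, referrer) of its latest GET
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip "[...]" and anything else that doesn't parse
        ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
        ip = m["ip"]
        if m["method"] == "GET":
            last_get[ip] = (ts, m["path"], m["referrer"])
        elif m["method"] == "POST" and m["path"].startswith("/wp-comments-post.php"):
            if ip in last_get:
                get_ts, path, referrer = last_get[ip]
                gap = ts - get_ts
                if gap <= max_gap:
                    hits.append((ip, path, referrer, gap.total_seconds()))
    return hits


if __name__ == "__main__":
    # Two of the log lines quoted in this post.
    sample = [
        '64.22.110.34 - - [10/Oct/2007:05:34:16 -0700] "GET /?p=365 HTTP/1.1" 200 26307 "http://www.google.com/search?q=computers+internet+blog" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2)"',
        '64.22.110.34 - - [10/Oct/2007:05:34:17 -0700] "POST /wp-comments-post.php HTTP/1.1" 200 84 "https://tim.cexx.org/?p=365" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2)"',
    ]
    for hit in find_instant_commenters(sample):
        print(hit)
```

To run it over a real log, pass an open file instead of the sample list. A fast human with a saved form could in theory trip the threshold too, so treat the output as a list of suspects rather than a verdict.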
Okay, so it’s some stupid comment-spam script leaving its calling card. What to do, what to do? Well, mod_rewrite is installed by default on many servers, and can be used to give different users different pages, depending on such things as their HTTP_REFERER. I recommend the output of /dev/urandom, or better yet, a redirect to a hefty Microsoft Service Pack download (assuming the bots follow redirects).
(Obviously, I can’t implement it here now, because I’ve just posted about the ever-mysterious “internet blog” and drawn legitimate search traffic.)
mod_rewrite examples: Just add this to your .htaccess file (create if necessary)
# Plonk this stupid bot with a 403 Forbidden:
RewriteEngine on
RewriteCond %{HTTP_REFERER} search\?q\=computers\+internet\+blog$
RewriteRule .* - [F]
Or, if you prefer dickhead A to go and waste the bandwidth of dickhead B…
RewriteEngine on
RewriteCond %{HTTP_REFERER} search\?q\=computers\+internet\+blog$
RewriteRule (.*)$ http://www.example.com/rubbish.iso [R,L]
In both cases, the RewriteCond line checks for the fairly malformed Google URL used by the spammer script (a real Google search query will have other junk before the “q=…”, so we let them slide), and the RewriteRule sends it packing.
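If you'd like to sanity-check that pattern before trusting it with live traffic, here is a rough Python sketch. It assumes Python's `re` treats this particular regex the same way Apache's engine does (it should, for something this simple), and the "real" Google referrer below is a made-up example of the usual shape, not one pulled from my logs. The helper name `is_bot_referrer` is likewise hypothetical.

```python
import re

# Same pattern as the RewriteCond line. Like mod_rewrite, we search for it
# anywhere in the string; the trailing $ anchors it to the end of the referrer.
BOT_REFERRER = re.compile(r"search\?q\=computers\+internet\+blog$")


def is_bot_referrer(referrer):
    """True if the referrer looks like the spam script's bare Google URL."""
    return BOT_REFERRER.search(referrer) is not None


if __name__ == "__main__":
    # Referrer copied from the spammer's log lines above.
    bot = "http://www.google.com/search?q=computers+internet+blog"
    # Hypothetical referrer from a real search, with extra parameters around q=.
    human = "http://www.google.com/search?hl=en&q=computers+internet+blog&btnG=Search"
    print(is_bot_referrer(bot))
    print(is_bot_referrer(human))
```

The pattern only fires when `q=` comes straight after `search?` and nothing follows the query, which is exactly what the script sends and what a real browser generally doesn't.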
Have fun!