Advanced server/spam bot blocking


As promised in an earlier article about blocking server spam, here are some advanced tips on shutting the door to these resource leeches:

#1: Non-existent urls getting hammered:
This can be a major problem. I believe it has been at least somewhat cured in Drupal 6, but in Drupal 5 and below a request to a non-existent page such as http://yoururl.com/node/vote/ does not trigger a 404 page as you might expect. Instead the entire front page loads up. That is annoying enough on its own, but when combined with a confused/malicious bot that continually hammers the non-existent url, the resource load can be enough to weigh heavily even on a dedicated server, let alone a shared-hosting account. [note: there is an update in the comments below with more specific information about the versions of Drupal which are affected by this problem]

What to do about it:
Certainly putting any paths you see that are getting hit this way in your robots.txt file is a good idea, but that does not always solve the problem, since a confused or malicious bot may simply ignore robots.txt. Sometimes more drastic measures are needed. Below is a snippet from an actual .htaccess file that has on several occasions cured malbot instances that were causing significant server slowdowns. Feel free to use it and append your own paths as appropriate (if you copy it, be sure there is no line break before any [OR], and be sure the last RewriteCond line does not end with an [OR]):

### Forbid access to bot-beaten non-pages
RewriteCond %{REQUEST_URI} ^/node/forward($|/.*$) [OR]
RewriteCond %{REQUEST_URI} ^/blog/comment($|/.*$) [OR]
RewriteCond %{REQUEST_URI} ^/blog/node/forward($|/.*$) [OR]
RewriteCond %{REQUEST_URI} ^/blog/blog($|/.*$) [OR]
RewriteCond %{REQUEST_URI} ^/storylink/forward($|/.*$) [OR]
RewriteCond %{REQUEST_URI} ^/node/blog($|/.*$) [OR]
RewriteCond %{REQUEST_URI} ^/_vti_bin/404\.html($|/.*$) [OR]
RewriteCond %{REQUEST_URI} ^/categories/node($|/.*$) [OR]
RewriteCond %{REQUEST_URI} ^/node/accessories($|/.*$)
RewriteRule .* - [F,L]
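
If you want to sanity-check which request paths a pattern of this shape catches before putting it on a live site, the matching can be approximated outside of Apache. The following is only a sketch: it uses Python's re module rather than mod_rewrite's own regex engine, the function name is_forbidden is made up for illustration, and only three of the paths above are included.

```python
import re

# A few of the path prefixes from the RewriteCond lines above (subset for brevity).
BLOCKED_PREFIXES = ["/node/forward", "/blog/comment", "/node/accessories"]

# Same shape as the .htaccess patterns: the prefix, then either
# end-of-string or a slash followed by anything.
PATTERNS = [re.compile("^" + re.escape(p) + r"($|/.*$)") for p in BLOCKED_PREFIXES]


def is_forbidden(request_uri):
    """Return True if any pattern matches, mirroring the [OR]-chained conditions."""
    return any(p.search(request_uri) for p in PATTERNS)


if __name__ == "__main__":
    for uri in ["/node/forward", "/node/forward/123", "/node/forwarding", "/node/12"]:
        print(uri, "->", "forbidden" if is_forbidden(uri) else "passes through")
```

Note that the ($|/.*$) ending means /node/forward and anything beneath it is refused, while a longer path that merely starts with the same letters (such as /node/forwarding) is left alone.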

#2: Have an IP? Awesome. Now keep bots from even reaching apache
If there is anything good that can be said about server malbots, as compared to their comment spamming cousins, it's that typically a server-spamming bot will have a static ip address instead of a (dreaded) dynamic one. This makes banning it much easier, of course.
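
Of course, you need to know which ip is doing the hammering first. One rough way to find out is to tally requests per client address in your access log. The sketch below assumes the common/combined log format, where the client address is the first whitespace-separated field; the function name top_clients and the sample lines are hypothetical, so feed it lines from your own log instead.

```python
from collections import Counter


def top_clients(log_lines, limit=5):
    """Count requests per client address (the first field of each log line)."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts.most_common(limit)


if __name__ == "__main__":
    # Hypothetical sample data using documentation-only addresses.
    sample = [
        '192.0.2.10 - - [16/Oct/2007:10:00:00 +0000] "GET /node/forward HTTP/1.1" 200 512',
        '192.0.2.10 - - [16/Oct/2007:10:00:01 +0000] "GET /node/forward HTTP/1.1" 200 512',
        '198.51.100.7 - - [16/Oct/2007:10:00:02 +0000] "GET / HTTP/1.1" 200 2048',
    ]
    for address, hits in top_clients(sample):
        print(address, hits)
```

An address that shows up far more often than any other, especially against non-existent paths, is a good candidate for a ban.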

Naturally one can just head to the .htaccess file and ban an ip from a site there, but sometimes this is not the best option. For instance, when a server full of sites is getting hit from the same ip, updating all of the .htaccess files is a pain. Additionally, banning from .htaccess means that apache is still involved with the request, even if minimally. For a truly strong surge of server spam (e.g., a denial of service attack) even that 'little bit' of bootstrapping apache can still be quite a load. Better to keep the traffic from those ips off the entire server, before apache gets involved. :-)

What to do about it:
Enter APF (Advanced Policy Firewall):

"APF is a policy based iptables firewall system designed for ease of use and configuration. It employs a subset of features to satisfy the veteran Linux user and the novice alike. Packaged in tar.gz format and RPM formats, make APF ideal for deployment in many server environments based on Linux."

If you are leasing your server from a hosting company they may have already installed it for you (the path to apf on my version of CentOS is /etc/apf). Assuming you have apf up and running, here's all you need to do to keep those bad bots away from apache and all the sites on your server:

1. Open the file deny_hosts.rules
2. Add a line that looks somewhat similar to this (be sure there is a line break between the comment line and the line that the ip address is on):

# added xxx.xx.xxx.xx on 10/16/07 because it's a crummy MALBOT
xxx.xx.xxx.xx

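You can even add a whole range of ip's using CIDR notation. A /16 entry like the one below covers every address that shares its first two octets (217.212.x.x), so use wide ranges with care: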

# added xxx.xx.xxx.xx on 10/16/07 because it's a crummy MALBOT
217.212.224.0/16

3. Restart apf: /etc/init.d/apf restart

4. Enjoy
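
Before adding a range to deny_hosts.rules it is worth double-checking how many addresses the entry really covers, since a wide mask can lock out legitimate visitors. The sketch below uses Python's standard ipaddress module purely to do the arithmetic; it says nothing about how APF itself parses the file, and the function name describe_entry is made up for illustration.

```python
import ipaddress


def describe_entry(entry):
    """Return (normalized network, first address, last address, address count)."""
    # strict=False tolerates entries whose host bits are set,
    # such as 217.212.224.0/16, and normalizes them to the network address.
    net = ipaddress.ip_network(entry, strict=False)
    return str(net), str(net[0]), str(net[-1]), net.num_addresses


if __name__ == "__main__":
    print(describe_entry("217.212.224.0/16"))
    print(describe_entry("192.0.2.10"))  # a single documentation-only address
```

For the /16 example above this reports the network 217.212.0.0/16, running from 217.212.0.0 to 217.212.255.255, which is 65536 addresses. A bare address with no mask counts as exactly one.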

16 October, 2007

Comments

you do not want to stay on 5.1 nor do you want to advertise that your site runs 5.1. 5.2 fixes security problems in 5.1. it isn't just frills. you are at risk of getting hacked.

I got kind of stuck into having to disclose for the sake of the discussion/post...

Will have to upgrade soon. In the meantime, thank goodness for consistent and frequent backups. :-)

[Update: Have upgraded the site to 5.2 -- yay, or at least 'yay' until 5.3 comes out and I'm concerned about security updates all over again ;-) ]

[Update #2, less than 24 hours later: Yarrrrgh!!! Have now updated to the just-released 5.3....talk about Murphy's law. Grr.]

fwbuilder, to be found at fwbuilder.org is an awesome firewall script generator. will build for iptables, pf, cisco pix, and lots of others. Nice GUI, very nice featureset.

I've found this to be an endless and futile process. What generally happens is the admin gets so frustrated that he goes overboard and ends up blocking valid users.

There's no simple solution that I'm aware of.

Certainly hoping to block ALL spam coming to a server is an impossible task, but getting the biggest offenders has been necessary several times -- the server spam was actually causing the server to be unavailable and/or extremely slow. In fact, I helped diagnose such an issue for Drupal.org one day. The site was being bombarded with requests from this one bot and it was bringing everything down. Bot banned, site back up. :-)

I think that it is impossible to guarantee 100% spam protection. There is no such tool.

Router based security for DOS attacks is the only workable solution, but at a huge cost of course.