Prepare your Drupal site to be Slashdotted, Dugg, and Farked

Slashdotted, Dugg, Farked. These are all terms that site operators, bloggers, and web developers are very familiar with. They imply having a site 'front paged' at a website that drives a LOT of traffic to your own site.

Over the past week, one of the sites we host ended up on the front page of Fark.com and Foobies.com at exactly the same time. It added up to some very busy days for a site hosted in a shared environment (meaning that it has to share the resources of a server with other sites), as well as some useful knowledge concerning:

  • what kind of load a Drupal powered site can handle when in a shared environment
  • how to optimize Drupal's capability to handle a large number of visitors

To begin, it needs to be understood that overall optimization for site traffic depends on a gazillion different factors. If you don't have a reliable server stack that is already optimized, this article will only do you so much good. Apache, MySQL, and PHP need to be running reliably and be well tuned.

Assuming you have a well tuned server, then how much traffic your Drupal powered site can handle will depend on:

The amount of resources it has available (CPU and memory particularly)
If your site is on a fully dedicated server with 4 GB of RAM and 4 CPUs, it will obviously handle far more than a site in a shared environment that gets only a fraction of those resources. This is common sense, of course. Eventually, if your server stack and your Drupal installation are both fully optimized and the site still can't handle the load, then mo' better hardware is your only long-term choice.

How many features are enabled on the site, and which ones

One of the rather fun aspects of watching the site receive so much traffic was having a chance to test real world cause and effect with a number of Drupal/site features. Some of them make a very big difference in how much work needs to be done to generate a page view, and thereby how many people the site and server can reliably and consistently handle.

We became good friends with the throttle.module, which in all honesty is the single thing that made it possible to serve 18,000+ unique users in about an 18 hour time frame. In the initial influx of traffic the site actually went down, but we quickly turned on the throttle module and the "who's online" block so that we could monitor the number of people online at the same time (at its height there were around 450). With a watchful eye we were finally able to narrow down which blocks and modules needed to be throttled the most and which ones we could leave alone, as well as exactly when the throttle module needed to kick in (expressed as the number of users on the site). Now if this happens again, we won't have to worry about it. The throttle module will automatically kick in at a comfortable level and the site won't go down again.
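The throttling idea described above can be sketched in a few lines. This is an illustration in Python, not Drupal's throttle.module API: the `Throttle` class, the `visible_blocks` helper, the block names, and the threshold of 300 users are all assumptions made up for the example. It only shows the general shape: once the number of users online crosses a threshold, blocks marked as throttleable are skipped.

```python
from dataclasses import dataclass


@dataclass
class Throttle:
    """Hypothetical stand-in for a throttle setting; not Drupal's API."""

    user_threshold: int  # users online at which throttling kicks in (assumed value)

    def is_active(self, users_online: int) -> bool:
        return users_online >= self.user_threshold


def visible_blocks(blocks, throttle, users_online):
    """Return the names of the blocks to render at the current load.

    `blocks` maps a block name to True if the block is throttleable
    (expensive and optional) or False if it must always be shown.
    """
    if not throttle.is_active(users_online):
        return list(blocks)
    return [name for name, throttleable in blocks.items() if not throttleable]


# Hypothetical block configuration for illustration.
blocks = {
    "navigation": False,
    "recent_comments": True,
    "who_is_online": False,
    "tag_cloud": True,
}
throttle = Throttle(user_threshold=300)

print(visible_blocks(blocks, throttle, users_online=120))  # all four blocks
print(visible_blocks(blocks, throttle, users_online=450))  # only the essential ones
```

The threshold is the knob that takes watching: set it too high and the site falls over before throttling starts, too low and visitors lose features they did not need to lose.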

Cache
You need to have your cache on at all times, except when editing CSS files. Period. And go get the blockcache module, too. A non-caching Drupal site with a lot of features doesn't stand a chance in the face of heavy, heavy traffic and will use many more resources to do the same amount of work as a cached site.
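To see why caching matters so much, here is a minimal sketch of a page cache, written in Python rather than taken from Drupal. The `PageCache` class, the `render_page` function, and the 60 second lifetime are assumptions for the example. The point it illustrates is that a thousand requests for the same page trigger the expensive page build only once.

```python
import time


class PageCache:
    """Tiny in-memory page cache with a time-to-live, for illustration only."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # path -> (expires_at, html)

    def get(self, path, render):
        """Return cached HTML for `path`, calling `render` only on a miss."""
        now = self.clock()
        hit = self.store.get(path)
        if hit is not None and hit[0] > now:
            return hit[1]
        html = render(path)
        self.store[path] = (now + self.ttl, html)
        return html


render_calls = 0


def render_page(path):
    """Hypothetical expensive page build (queries, theming, blocks)."""
    global render_calls
    render_calls += 1
    return f"<html><body>Page for {path}</body></html>"


cache = PageCache(ttl_seconds=60)
for _ in range(1000):
    cache.get("/node/1", render_page)

print(render_calls)  # 1: the page was built once and served from cache after that
```

The trade-off is staleness: a cached page can be up to one lifetime out of date, which is also why the cache gets in the way while you are editing CSS.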

Error handling
Be sure that your 404 and 403 errors are handled sparingly. Having to load a large page for these errors not only leaves you open to denial of service attacks, but also takes away resources that could be used for serving pages that actually exist. Drupal 5 has optimized error handling quite a bit, so if you are running it then you probably have no worries. Sites on earlier versions of Drupal may want to take steps, though.

Number of anonymous users vs. authenticated users

This is one variable you may not have a lot of control over. Your site likely either does or does not have a high percentage of logged-in users hanging around. But it does matter, a lot, so you need to be aware of this variable or you could end up doing a bit of head scratching. Given that Drupal can serve some 50+ anonymous requests for every one it can handle for a logged-in user, it's easy to see that 5,000 logged-in users are a whole lot more load for the server than even 20,000 anonymous users.
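The comparison above is simple arithmetic, worked through here in Python. The only figure taken from the article is the roughly 50 to 1 ratio; the function name is made up for the example, and it assumes every user makes about the same number of requests, which real traffic will not match exactly.

```python
# From the article: roughly 50 anonymous requests can be served for every
# one request from a logged-in user.
ANON_PER_AUTH = 50


def anonymous_equivalents(anonymous, authenticated, ratio=ANON_PER_AUTH):
    """Express a mix of users as the equivalent number of anonymous users.

    Assumes every user issues about the same number of requests.
    """
    return anonymous + authenticated * ratio


print(anonymous_equivalents(20_000, 0))  # 20000
print(anonymous_equivalents(0, 5_000))   # 250000
```

Under that assumption, 5,000 logged-in users cost about as much as 250,000 anonymous ones, or more than twelve times the load of the 20,000 anonymous visitors.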

[Update: There is now a part two to this story]

More info:

High traffic: Serving 100,000+ unique visitors in 24 hours with Drupal

Drupal webserver configurations compared

Drupal 5: performance

10 February, 2007

Comments

Could you please point me to similar content on WordPress (if there is any)?

Sorry, we're Drupal specialists, but I'm sure there's some info out there if you search for things like 'Wordpress digg fark' or something like that.

Cheers

Great articles, guys. I will jump on the feed. Thanks a lot.

Coming from a WordPress-only base of experience, how would you rate the learning curve to Drupal? Are we talking blank sheet, clean slate, starting over... or are there some skills, knowledge, or experience that I can use?

Thanks!

Coming from WordPress, and assuming you've done some custom stuff with it, your experience should be a pretty large help in some ways. That said, there's a fairly substantial learning curve to Drupal. One way to cut that down a lot is to get the Pro Drupal Development book right away, so that you have some good documentation by your side.

Regards