Prepare your Drupal site to be Slashdotted, Dugg, and Farked
Slashdotted, Dugg, Farked. These are all terms that site operators, bloggers, and web developers are very familiar with. They all mean having your site 'front paged' on a website that drives a LOT of traffic to your own site.
Over the past week, one of the sites we host ended up on the front page of Fark.com and Foobies.com at exactly the same time. That added up to some very busy days for a site hosted in a shared environment (meaning that it has to share a server's resources with other sites), and it gave us some useful knowledge concerning:
- what kind of load a Drupal-powered site can handle in a shared environment
- how to optimize Drupal's capability to handle a large number of visitors
To begin, it needs to be understood that overall optimization for site traffic depends on a gazillion different factors. If you don't have a reliable server stack that is already optimized, this article will only do you so much good. Apache, MySQL, and PHP need to be running reliably and be well tuned.
Assuming you have a well-tuned server, how much traffic your Drupal-powered site can handle will depend on:
The amount of resources it has available (CPU and memory particularly)
If your site is on a fully dedicated server with 4 GB of RAM and 4 CPUs, it will obviously handle far more than a site that lives in a shared environment and only gets a fraction of those resources. This is common sense, of course. Eventually, if your server stack is fully optimized, your Drupal installation is fully optimized, and your site still can't handle the load, then mo' better hardware is your only long-term choice.
How many features are enabled on the site, and which ones
One of the rather fun aspects of watching the site receive so much traffic was the chance to test real-world cause and effect with a number of Drupal and site features. Some of them make a very big difference in how much work it takes to generate a page view, and therefore in how many people the site and server can reliably and consistently handle.
We became good friends with the throttle.module, which in all honesty is the single thing that made it possible to serve 18,000+ unique users in about an 18-hour time frame. In the initial influx of traffic the site actually went down, but we quickly turned on the throttle module and the "who's online" block so that we could monitor the number of people online at the same time (at its height there were around 450). With a watchful eye we were finally able to narrow down which blocks and modules needed to be throttled the most and which ones we could leave alone, as well as exactly when the throttle module needed to kick in (expressed as the number of users on the site). Now if this happens again, we won't have to worry about it. The throttle module will automatically kick in at a comfortable level and the site won't go down again.
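The decision the throttle makes can be pictured as a simple threshold check. The sketch below is written in Python rather than Drupal's PHP, and it is not the throttle module's actual code: the function names, the block names, and the example threshold of 300 are all assumptions made for illustration. Only the peak of roughly 450 simultaneous users comes from the traffic described above.

```python
def should_throttle(users_online, threshold):
    """Return True once the number of simultaneous users reaches the threshold."""
    if threshold <= 0:
        # Assumption: a non-positive threshold means throttling is disabled.
        return False
    return users_online >= threshold


def blocks_to_render(blocks, throttled):
    """Drop the blocks marked as throttleable while the throttle is active.

    `blocks` maps a block name to whether it may be switched off under load.
    """
    return [name for name, throttleable in blocks.items()
            if not (throttled and throttleable)]


if __name__ == "__main__":
    # Hypothetical block names and threshold, for illustration only.
    blocks = {"navigation": False, "who's online": False, "recent comments": True}
    throttled = should_throttle(users_online=450, threshold=300)
    print(blocks_to_render(blocks, throttled))
```

The point of watching the site under load was to find a sensible value for the threshold and to decide which blocks deserve the "throttleable" flag; the check itself is trivial once those two things are known.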
You need to have your cache on at all times, except when editing CSS files. Period. And go get the blockcache module, too. A non-caching Drupal site with a lot of features doesn't stand a chance in the face of heavy, heavy traffic, and it will use many more resources to do the same amount of work as a cached site.
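Why caching matters so much can be shown with a toy example. This is a minimal Python sketch of the general idea, not how Drupal's cache or the blockcache module is implemented; the `PageCache` class and `render_page` function are hypothetical names invented here.

```python
class PageCache:
    """Toy page cache: render each path once, then reuse the stored result."""

    def __init__(self, render):
        self._render = render      # the expensive page builder
        self._pages = {}           # path -> rendered page
        self.render_calls = 0      # how many times the builder actually ran

    def get(self, path):
        if path not in self._pages:
            self.render_calls += 1
            self._pages[path] = self._render(path)
        return self._pages[path]

    def clear(self):
        """Empty the cache, for example after content or theme changes."""
        self._pages.clear()


def render_page(path):
    # Stand-in for the database queries and theming a real page needs.
    return "<html>page for " + path + "</html>"


if __name__ == "__main__":
    cache = PageCache(render_page)
    for _ in range(1000):
        cache.get("/node/1")
    print(cache.render_calls)  # prints 1: the builder ran once for 1000 requests
```

Without the cache, the expensive builder would run on every one of those 1,000 requests, which is the extra work an uncached site does under a traffic spike.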
Be sure that your 404 and 403 errors are handled sparingly. Having to load a large page for these errors not only leaves you open to denial-of-service attacks, but also takes away resources that could be used for serving pages that do exist. Drupal 5 has optimized error handling quite a bit, so if you are running it you probably have no worries. Sites on earlier versions of Drupal may want to take steps, though.
Number of anonymous users vs. authenticated users
This is one variable you may not have a lot of control over. Your site likely either does or does not have a high percentage of logged-in users hanging around. But it does matter, a lot, so you need to be aware of this variable or you could end up doing a bit of head scratching. Given that Drupal can serve some 50+ anonymous requests for every one that it can handle for a logged-in user, it's easy to see that a site full of 5,000 logged-in users puts a whole lot more load on the server than even 20,000 anonymous users.
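The arithmetic behind that comparison can be worked through in a few lines. This is a rough Python sketch under two assumptions that are not in the original: every user makes about the same number of requests, and the "50+" figure is rounded down to exactly 50. The function name is made up for illustration.

```python
def anonymous_equivalent_load(anonymous_users, authenticated_users, cost_ratio=50):
    """Express a mix of users as the equivalent number of anonymous users.

    cost_ratio is how many anonymous requests cost about the same as one
    authenticated request; 50 is the rough figure quoted in the text.
    """
    return anonymous_users + authenticated_users * cost_ratio


if __name__ == "__main__":
    logged_in_site = anonymous_equivalent_load(0, 5000)
    anonymous_site = anonymous_equivalent_load(20000, 0)
    print(logged_in_site, anonymous_site)  # prints: 250000 20000
```

By this estimate, 5,000 logged-in users weigh as much as 250,000 anonymous ones, more than twelve times the load of 20,000 anonymous visitors.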
[Update: There is now a part two to this story]
High traffic: Serving 100,000+ unique visitors in 24 hours with Drupal