Prepare your Drupal site to be Slashdotted, Dugg, and Farked - Part II

It's been a couple of weeks since we posted part one of our look at optimizing a Drupal site to withstand large amounts of traffic, and since then it has happened again: a site we host got "Farked" (an inbound link from Fark.com), even bigger than last time. In the 8 short hours since the link to the client's site went up, the site has received 27,000+ unique visitors as I write this. When I logged in to the site, there were actually 1,850 users online at the same time.

We just about fell out of our chair when we saw that...

...after all, this is a site that's on a shared server, not a dedicated one. And those kinds of numbers would give even some dedicated servers a thorough workout. In the meantime, it was operation 911. Forget the long-term issue of finding a larger server for this site, which is clearly outgrowing its environment; that could be handled afterwards. Right now, we had to get the server and site back pronto.

So we quickly posted a "We'll be back soon" message to the deluged site so that the rest of the sites on the server would work again, and then set out to get the Farked site back up. Now, 27,000 visitors in 8 hours would mean over 75,000 in 24 (about 81,000 if the pace held). There's no shared hosting environment we know of that can handle a dynamic, PHP-driven website with that kind of load...so what to do...what to do.

Well, in short order we made an HTML file out of the page that Fark.com linked to and put a 301 redirect in the site's .htaccess file, so that anyone visiting that page would get redirected to the much lighter HTML file and thereby bypass Drupal and all of its bootstrapping and database overhead. The rest of the pages on the site would function just as they always do, of course. This allowed us to take the site live again, serve wayyyy more people than should be possible on a $20 a month hosting plan, and keep everyone else's sites happy and screaming along.
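
For anyone who wants to try the same trick, here is a rough Python sketch of the two steps: saving a static copy of the page and producing the redirect line for .htaccess. It is not the exact rule we used. The path node/123, the example.com URLs, and both function names are made up for illustration, and the rule assumes clean URLs are enabled and that it is pasted after "RewriteEngine on" but before Drupal's own rewrite rules.

```python
import re
from pathlib import Path
from urllib.request import urlopen


def save_static_copy(page_url: str, dest: Path) -> None:
    """Fetch the rendered page once and write it to disk as plain HTML."""
    with urlopen(page_url) as response:
        dest.write_bytes(response.read())


def redirect_rule(drupal_path: str, static_url: str) -> str:
    """Build a mod_rewrite line that 301-redirects one path to the static copy.

    In .htaccess context the pattern is matched without the leading slash,
    so it is stripped here before escaping.
    """
    pattern = re.escape(drupal_path.strip("/"))
    return f"RewriteRule ^{pattern}/?$ {static_url} [R=301,L]"


if __name__ == "__main__":
    # Hypothetical path and host; swap in the page that is being hammered.
    # save_static_copy("http://www.example.com/node/123", Path("farked.html"))
    print(redirect_rule("/node/123", "http://www.example.com/farked.html"))
```

The static file has to live at a URL that the pattern does not match, otherwise the redirect would loop. Once the rush is over, removing the line from .htaccess hands the page back to Drupal.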

Two other things we did since the time we wrote the first article helped a lot too:

Condensed and compressed the CSS files for the site. Simply put, we cut the number of files requested on each page visit by 6 by aggregating all the CSS into one file. That's a lot of requests when you multiply it by thousands.

Disabled the statistics.module. This is a no-brainer. Apache is logging everything already, and there are more robust tools for processing and interpreting the logs than what comes with Drupal (no offense, Drupal), so this is a LOT of overhead that the site/server doesn't need to deal with. Besides the input/output processing, it also speeds up the database to not have such a large access log hanging around.
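
To show how little is lost by turning the statistics module off, here is a minimal Python sketch that pulls a rough visitor count out of Apache log lines. It assumes the Common or Combined Log Format, where the client address is the first field. The sample lines and the function name are invented for illustration, and counting distinct addresses only approximates unique visitors. A real log analyzer does far more than this.

```python
from collections import Counter
from typing import Iterable


def count_hits_by_client(log_lines: Iterable[str]) -> Counter:
    """Count requests per client address (the first whitespace-separated field)."""
    hits = Counter()
    for line in log_lines:
        fields = line.split(maxsplit=1)
        if fields:  # skip blank lines
            hits[fields[0]] += 1
    return hits


# Made-up sample lines in Common Log Format, using documentation-only addresses.
SAMPLE = [
    '192.0.2.10 - - [15/Feb/2007:10:00:01 -0500] "GET /node/123 HTTP/1.1" 200 5120',
    '192.0.2.10 - - [15/Feb/2007:10:00:02 -0500] "GET /style.css HTTP/1.1" 200 812',
    '198.51.100.7 - - [15/Feb/2007:10:00:03 -0500] "GET /node/123 HTTP/1.1" 200 5120',
]

if __name__ == "__main__":
    hits = count_hits_by_client(SAMPLE)
    print(f"{len(hits)} distinct clients, {sum(hits.values())} requests")
```

To run it against a real log, pass an open file object instead of SAMPLE. The file is read line by line, so even a big log does not need to fit in memory.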

So there you go - how to make lemonade out of more hits than you should be able to handle. 

Related article:
High traffic: Serving 100,000+ unique visitors in 24 hours with Drupal

15 February, 2007

Comments

1850 users online at the same time?

I host my domain with bluehost.com, and my site is based on Drupal. If there are over 100 visitors online at the same time, I cannot open (connect to) my site. I ask myself: what happens if there are 1800? :(

This article is great. Only other thing I would have liked to know is what modules you throttled and what modules you have installed. Thanks for the info.

Including vote_up_down, subscriptions, the forward modules, views, and a few others I'm probably not remembering right now. Basically, when it was time to throttle, we just went for the modules/blocks that were not mission-critical (e.g., the statistics module, forward, subscriptions, etc.).

Good idea to create a static HTML page for the high-traffic page. I guess in the long term you could use the Boost module to create static HTML pages for your entire site. That is something I have considered, but haven't done yet.

We recently had a site receive a huge amount of traffic. Most of the users were writing to the database, as it was a vote. MySQL kept falling over due to table locking. Nightmare times.

So what did you have to say about the ones you throttled and what modules you have installed? Did I miss it somewhere?

One of my sites was mentioned on a national radio show while I was on a shared host, and it stood up pretty well (~40,000 PVs over just an hour or two). I think it was the caching that saved the day.
Since then, that same site has been on the front page of Digg after I had moved it to a dedi, and it didn't hold up as well.
Redirecting to a static HTML page is a great idea... just a bit too late for me..

I hear the throttle option was left out of the new version. Wonder why?