Drupal/server optimization may matter little if you've got the leeches

During the course of administering a server full of various sites which have been Farked, Dugg, and StumbledUpon'd, I've learned firsthand the value of optimizing a Drupal site/server to handle large amounts of traffic. I've also learned that, sooner or later, the level of optimization of one's Drupal site/server is likely to be rendered mostly irrelevant by frequent, and (mostly) malicious, circumstances.

Malicious and/or misdirected requests
Compared to more popular subjects such as Drupal, Apache, MySQL, and/or PHP optimization, the subject of malicious requests rarely gets a mention. Despite losing the popularity contest to those sexier subjects, rest assured, there is a lot more to running a site than just tuning those items. All of those things can be working really, really well and your site/server can still be hammered into a state of dysfunction - even with very few users coming through the site.

Welcome to the wonderful world of denial of service attacks and/or server spam. Broadly defined, this includes any request to your server that is of malicious (usually the case) and/or misguided origin. Make no mistake, it's a problem that anyone running a medium to large web site will contend with, whether they know it or not.

Server spam/DoS attacks are NOT uncommon problems limited to large or 'unlucky' websites. If you have a server and/or VPS with a frequently trafficked site, or especially one with several frequently trafficked sites, you will be amazed at just how much processor, memory, and bandwidth can be consumed by malicious and/or wrongly directed requests at any given point in time. The exact amount of resources that these leeches suck up varies greatly depending on many factors, but on the high end I can share that several times over the last 6 months alone I've personally witnessed, and fixed, crippling issues stemming from server spam on sites that I administer. Likewise, several months ago I was happy to be able to help resurrect Drupal.org one day when it was suffering from a particularly nasty malbot. (How many people has this happened to who never realized what was going on, and who in turn just blamed their host and/or Drupal?)

So once one realizes that server spam is a real, and not theoretical, problem, how does one confirm the issue(s) on one's own site/server, and what can be done about it?
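One rough way to start confirming it is to tally requests per client address in the web server's access log. The sketch below is only an illustration: it assumes log lines in Apache's common/combined format (client address as the first field), and the function name and sample lines are made up for the example.

```python
from collections import Counter


def top_clients(log_lines, limit=5):
    """Return the busiest client addresses as (address, request_count) pairs."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:
            # In common/combined log format the first field is the client address.
            counts[fields[0]] += 1
    return counts.most_common(limit)


# Hypothetical sample lines; in practice, iterate over the real access log file.
sample = [
    '203.0.113.9 - - [23/Sep/2007:10:00:01 -0700] "GET /node/1 HTTP/1.1" 200 5120',
    '203.0.113.9 - - [23/Sep/2007:10:00:01 -0700] "GET /node/2 HTTP/1.1" 200 4096',
    '198.51.100.4 - - [23/Sep/2007:10:00:02 -0700] "GET / HTTP/1.1" 200 8192',
    '203.0.113.9 - - [23/Sep/2007:10:00:02 -0700] "GET /node/3 HTTP/1.1" 200 4096',
]

for address, hits in top_clients(sample):
    print(address, hits)
```

A single address (or a small handful) responsible for a large share of requests is worth a closer look, though a high count alone doesn't prove malice - it could just as easily be a proxy or a legitimate crawler.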

23 September, 2007

Komodo IDE and Drupal/PHP development - a combo built upon mutual appreciation

After spending 3 days trying to get Eclipse PDT and the Zend debugger working on Mac OS X, my nerves were very frayed indeed. Apparently, there has been an ongoing problem with the Zend debugger not stopping at breakpoints on Intel Macs...something that has plagued Eclipse through 3 different PHP extensions. (Don't even get me started on how crazy it is that Eclipse has seen three completely separate PHP plugins within less than a year.)

In the end all was not lost. On the contrary, after enough scratching around Google I discovered what I needed to know. Komodo has become the semi-de-facto IDE of choice for many Drupal developers. A fact confirmed for me when I saw many familiar names from the Drupal community on the ActiveState website (itself a Drupal site).

Suffice it to say, with Komodo I got local and remote debugging up and running within just a couple of hours, and it's been a total dream to use.

So now I have a proper debugger, integrated SVN, and last but not least, Drupal API code completion/documentation is included. And if that's not enough, it uses 100 MB+ less memory than Eclipse did.

Komodo is going to set me back a few bucks ($295) after my trial runs out, but even at that price it's a no-brainer for someone who makes their living with Drupal and PHP. Kudos to ActiveState on their outreach to the Drupal community.

Some useful links if you're interested in checking out Komodo:
Link 1
Link 2
Link 3
Link 4

13 September, 2007

AutoPilot released - Change management just got a whole lot easier

Perhaps one of the most important contributed Drupal projects ever was released today (at least as far as professional Drupal development goes).

AutoPilot is out, and for anyone who has a Drupal site for which they need to worry about syncing development, staging, and/or production versions - it's magnificent. I've had the privilege of seeing it work in person - and it's unlike anything available to Drupal developers before now.

Though I haven't set up my own local instance of this yet - the version I saw let one simply click a button to push changes from a development environment to a staging/production environment.

Congratulations, WorkHabit, on such a valuable contribution, and for everyone else - grab this at your earliest convenience and let's work to integrate this as a core part of a Drupal development toolbox, asap. (à la the devel.module)

[Please direct any questions about AutoPilot to WorkHabit and/or the AutoPilot project on Drupal.org.]

21 August, 2007

Drupal 6: Benchmarking and Block Cache Performance Revisited

This is a follow-up to an earlier article I posted about Drupal 6 performance, so please bear with my learning curve for a moment. I figured out by 'accident', and a lot of investigation, that the order in which one 'generates content' with the devel module matters very much for benchmarking purposes. My previous tests were done incorrectly - I inadvertently created a bunch of nodes that weren't assigned to any terms or users, and vice versa. With this error corrected, a no-cache-enabled baseline takes much longer to complete than it did when I had things set up incorrectly.

...happily, the point of this article isn't that I'm a total goof.

No, the good news out of this ordeal is that now when block-cache-disabled performance is compared to block-cache-enabled performance the results are MUCH more substantial than previously noted (and thus Drupal 6 is going to be that much faster than its predecessor Drupal 5 for authenticated users):

  2489.69 ms (request time for auth user, no caching of any kind)
-  878.09 ms (request time for auth user, block caching on)
= 1611.60 ms (difference)

1611.60 / 2489.69 = 64.73% improvement w/ block cache on

With the block caching on, the mean processing time is 876 ms with a sd of 91.9 ms while the base install results in 2481 ms mean processing time and sd of 91.9. Even at the upper end of the standard deviation, the block-cached processing time is 967.9 ms, which is far below the low end of the standard deviation (2080.1 ms) for the non-block-cached test. Looks like a clear improvement - 64.7 percent improvement using just the means.
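For anyone who wants to redo the percentage with their own numbers, the arithmetic above can be sketched in a few lines. The function name is just illustrative; the inputs are the request times and means quoted above.

```python
def percent_improvement(baseline_ms, improved_ms):
    """Share of the baseline time that was saved, as a percentage."""
    return (baseline_ms - improved_ms) / baseline_ms * 100


# Request times for an authenticated user, as quoted above.
no_cache_ms = 2489.69
block_cache_ms = 878.09
print(f"{percent_improvement(no_cache_ms, block_cache_ms):.2f}%")  # 64.73%

# Same calculation using the mean processing times.
print(f"{percent_improvement(2481, 876):.1f}%")  # 64.7%
```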

The benchmarks are posted here so that everyone can do their own math. If you'd like to check the validity of my installation/numbers - feel free to download a tarball which includes all the files and a db dump. Username/pass for user 1 = superadmin


Benchmarks using 10,000 nodes, 5000 comments, 15 categories, 250 terms, 2000 users and with the following blocks enabled:

Recent comments 
User login
Active forum topics
New forum topics
Who's online


13 August, 2007

Drupal 6: The Performance and Developer Drupal release?

[The information in this post has been greatly updated - basically, Drupal 6 will enjoy even more of a performance advantage over Drupal 5 than is indicated here]

The more one learns about Drupal 6 the more there is to like. A while ago, right before Drupal 5 was released, I heard someone reference it being a 'developer's release' (think maybe it was Dries). That seems like an apt way to describe Drupal 6 these days. If the anti-CMS, ya-gotta-roll-your-own-or-it-stinks critics have a last breath left, this might be what snuffs it out once and for all. The list of new features, optimizations, and/or technologies introduced in Drupal 6 seems destined to light up the antennae of developers everywhere.

Performance, performance, performance

If there is any doubt about whether there will be performance improvements for Drupal 6, wonder no longer. Benchmarks, made using these standards, show that the current, pre-beta, version of Drupal 6 performs 19.5% faster than Drupal 5.2 does when NO page caching is on. A healthy improvement by any standard, but even more so when one considers the following:

  • Drupal page caching is never active for logged-in users, so any gains to non-cached speeds are pure gains for everyone that is logged in. This is a big deal when you consider that up till now it has taken 10x longer to serve authenticated/logged-in users than it does to serve anonymous site visitors.
  • With the block cache patch which looks destined for core applied, Drupal 6 becomes 32% faster than Drupal 5.2 for authenticated users. Community site system admins rejoice.

Without quibbling over the details of specific numbers and/or benchmarking methods - one thing is very clear.

Speed is coming to Drupal.

  • Download benchmarks for Drupal 6, which includes benchmarks for block caching here.
  • Download benchmarks for Drupal 5.2 here.

10 August, 2007

Prepare your Drupal site to be Slashdotted, Dugg, and Farked - Part II

It's been a couple of weeks since we posted part one of our look at optimizing a Drupal site to withstand large amounts of traffic, and since that time it happened again - a site we host got "Farked" (an inbound link from Fark.com) even bigger than it did last time. In the 8 short hours since the link to the client's site went up, and as I write this - the site has received 27,000+ unique visitors. When I logged in to the site there were actually 1850 users online at the same time.

We just about fell out of our chair when we saw that...

...after all, this is a site that's on a shared server - not a dedicated one. And those kinds of numbers would give even some dedicated servers a thorough workout. In the meantime, it was operation 911. Forget the long-term issue of finding a larger server space for this site, which clearly is outgrowing its environment - that could be handled afterwards. Right now, we had to get the server and site back pronto.

So we quickly posted a "We'll be back soon" message to the deluged site so that the rest of the sites on the server would work again, and then set out to get the Farked site back up. Now 27,000 visitors in 8 hours would mean over 75,000 in 24. There's no shared hosting environment we know of that can handle a dynamic, PHP-driven website with that kind of load...so what to do...what to do.

Well, in short order we made an html file out of the page which Fark.com linked to and put a 301 redirect in the site's .htaccess file so that anyone visiting that page would get redirected to the much lighter html file and thereby bypass Drupal and all of the bootstrapping and database overhead. The rest of the pages on the site would function just as they always do, of course. What this did was allow us to take the site live again, serve wayyyy more people than should be possible on a $20 a month hosting plan, and keep everyone else's site happy and screaming along.
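For illustration only, here is roughly what such a redirect line could look like, built by a small helper. This is a sketch and not the exact rule we used: the node path, host name, and file name are hypothetical, and it assumes Apache's mod_alias is available so that a RedirectMatch line in .htaccess is honored.

```python
import re


def redirect_rule(drupal_path, static_url):
    """Build a mod_alias line that 301-redirects one exact path to a static copy."""
    if not drupal_path.startswith("/"):
        raise ValueError("drupal_path must start with '/'")
    # Anchor the pattern so only this exact path is redirected, not its sub-paths.
    pattern = "^" + re.escape(drupal_path) + "$"
    return f"RedirectMatch 301 {pattern} {static_url}"


# Hypothetical values: the linked page and the static snapshot saved next to index.php.
print(redirect_rule("/node/123", "http://example.com/farked.html"))
```

Because the snapshot is a real file on disk, Apache can serve it directly without ever handing the request to Drupal.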

Two other things we did since the time we wrote the first article helped a lot too:

Condensed the number of, and compressed, the css files for the site. Simply put, we cut the number of files requested on each page visit by 6 by aggregating all the css into one file. That's a lot of requests when you multiply it by thousands.

Disabled the statistics.module. This is a no-brainer. Apache is logging everything already and there are more robust tools to process and interpret the logs with than what comes with Drupal (no offense, Drupal) - so this is a LOT of overhead that the site/server doesn't need to deal with. Besides the input/output processing, it also speeds up the database not to have such a large access log hanging around.
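As a rough sketch of the css step, here is one way several stylesheets could be joined into one and slimmed down. This is an assumption about what "aggregating and compressing" might look like, not the exact process we followed; the function name and sample rules are made up.

```python
import re


def aggregate_css(sheets):
    """Join stylesheets (in order) into one string, dropping comments and extra whitespace."""
    combined = "\n".join(sheets)
    # Remove /* ... */ comments, including ones that span several lines.
    combined = re.sub(r"/\*.*?\*/", "", combined, flags=re.DOTALL)
    # Collapse every run of whitespace into a single space.
    combined = re.sub(r"\s+", " ", combined)
    return combined.strip()


# Hypothetical stylesheet contents; in practice these would be read from the css files.
sheets = [
    "body { margin: 0; } /* reset */",
    "h1 {\n  color: red;\n}",
]
print(aggregate_css(sheets))
```

The order of the sheets matters, since later rules override earlier ones, and any relative url() paths inside the css need to stay valid from wherever the combined file ends up.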

So there you go - how to make lemonade out of more hits than you should be able to handle. 

15 February, 2007

