Blog

Drupal 6: The Performance and Developer Drupal release?

[The information in this post has been greatly updated - basically, Drupal 6 will enjoy even more of a performance advantage over Drupal 5 than is indicated here]

The more one learns about Drupal 6 the more there is to like. A while ago, right before Drupal 5 was released, I heard someone (I think it may have been Dries) refer to it as a 'developer's release'. That seems like an apt way to describe Drupal 6 these days. If the anti-CMS, ya-gotta-roll-your-own-or-it-stinks critics have a last breath left, this might be what snuffs it out once and for all. The list of new features, optimizations, and technologies introduced in Drupal 6 seems destined to light up the antennae of developers everywhere.

Performance, performance, performance

If there is any doubt about whether there will be performance improvements in Drupal 6, wonder no longer. Benchmarks, made using these standards, show that the current pre-beta version of Drupal 6 performs 19.5% faster than Drupal 5.2 does when NO page caching is on. A healthy improvement by any standard, but even more so when one considers the following:

  • Drupal page caching is never active for logged-in users, so any gains in non-cached speed are pure gains for everyone who is logged in. This is a big deal when you consider that up till now it has taken 10x longer to serve authenticated (logged-in) users than it does to serve anonymous site visitors.
  • With the block cache patch applied (it looks destined for core), Drupal 6 becomes 32% faster than Drupal 5.2 for authenticated users. Community site system admins rejoice.

Without quibbling over specific numbers or benchmarking methods, one thing is very clear.

Speed is coming to Drupal.

  • Download benchmarks for Drupal 6, which include benchmarks for block caching, here.
  • Download benchmarks for Drupal 5.2 here.

10 August, 2007

High traffic: Serving 500,000+ unique visitors in 24 hours and over 9,000,000 a month

Recently a client whom we helped get started with Drupal, and whose backend systems we administer, had their first 500,000+ unique-visitor day.

Everything went well. The lessons we've learned over the years of Drupal performance tuning (Part I, Part II), combined with well-planned Apache/MySQL/PHP settings, provided an event-free day (other than watching the hit counter go through the roof!).

Showing that it was no fluke, the site has had numerous 500,000+ days since then and has settled into an average of 150,000-200,000 unique views a day / 6,000,000-9,000,000 visitors a month.

If you have a Drupal site that you need to prepare for high traffic and high availability, give us a shout. We can help you handle it.

12 March, 2007

Prepare your Drupal site to be Slashdotted, Dugg, and Farked - Part II

It's been a couple of weeks since we posted part one of our look at optimizing a Drupal site to withstand large amounts of traffic, and since then it happened again - a site we host got "Farked" (an inbound link from Fark.com), even bigger than it did last time. In the 8 short hours between the link to the client's site going up and my writing this, the site has received 27,000+ unique visitors. When I logged in to the site there were actually 1,850 users online at the same time.

We just about fell out of our chair when we saw that...

...after all, this is a site that's on a shared server - not a dedicated one. And those kinds of numbers would give even some dedicated servers a thorough workout. In the meantime, it was operation 911. Forget the long-term issue of finding a larger server space for this site, which clearly is outgrowing its environment - that could be handled afterwards. Right now, we had to get the server and site back pronto.

So we quickly posted a "We'll be back soon" message on the deluged site so that the rest of the sites on the server would work again, and then set out to get the Farked site back up. Now, 27,000 visitors in 8 hours would mean over 75,000 in 24. There's no shared hosting environment we know of that can handle a dynamic, PHP-driven website with that kind of load...so what to do...what to do.

Well, in short order we made an HTML file out of the page which Fark.com linked to and put a 301 redirect in the site's .htaccess file, so that anyone visiting that page would get redirected to the much lighter HTML file and thereby bypass Drupal and all of its bootstrapping and database overhead. The rest of the pages on the site would function just as they always do, of course. What this did was allow us to take the site live again, serve wayyyy more people than should be possible on a $20-a-month hosting plan, and keep everyone else's site happy and screaming along.
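
A redirect of that kind might look roughly like the sketch below. It assumes clean URLs and mod_rewrite are enabled. The node path and the static file name are hypothetical, since the real ones aren't given here. The rule has to sit above Drupal's own rewrite rules in .htaccess so that it is matched first.

```apache
<IfModule mod_rewrite.c>
  RewriteEngine on

  # Hypothetical paths: send the one heavily linked page to a static copy
  # so the request never reaches index.php or the database.
  RewriteRule ^node/123$ /farked-page.html [R=301,L]

  # ... Drupal's standard rewrite rules follow here ...
</IfModule>
```

Because the static file really exists on disk, Drupal's usual "not an existing file" rewrite condition leaves it alone and Apache serves it directly. Browsers may cache a 301, so R=302 is the safer variant if the redirect is meant to come out again once the rush is over.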

Two other things we did since the time we wrote the first article helped a lot too:

Condensed the number of, and compressed, the CSS files for the site. Simply put, we cut the number of files requested on each page visit by 6 by aggregating all the CSS into one file. That's a lot of requests when you multiply it by thousands.

Disabled the statistics.module. This is a no-brainer. Apache is logging everything already and there are more robust tools to process and interpret the logs with than what comes with Drupal (no offense Drupal) - so this is a LOT of overhead that the site/server doesn't need to deal with. Besides the input/output processing saved, it also speeds up the database to not have such a large access log hanging around.

So there you go - how to make lemonade out of more hits than you should be able to handle. 

15 February, 2007

Blocking referrer spam, mal-bots, and other malicious weasels with htaccess

Anyone who runs a site/server for very long will likely find out about the gruesome underbelly of the online world - spammers. They come in many shapes and sizes (most are bots), each with a different purpose, but they all have this in common - they hurt your site/server and its available resources.

Below are some things to look out for and some methods to take care of one particular type of spam, referrer spam, which can cripple a site/server in no time. With enough referrer spam you'll have what amounts to a denial of service attack (i.e., so many junk requests that the server can't even tend to the real ones).

Example of how serious this can be
Recently one of the sites we host had a big traffic day thanks to being front-paged at Fark.com and Foobies.com: 18,000+ unique visitors in 18 hours. Suffice it to say that put quite a load on the shared environment it was hosted in. Well, guess what - the (unrelated) spam attack the site received a few days later actually created more than twice the load on the server that the huge amount of legitimate traffic did!

Identifying the problem

The first step in fixing a problem is, of course, to know you have one! Referrer spam can be tricky because without knowing where to look you may never realize what is happening in the dark corners of your webserver - you'll just see the symptoms (a slow site, or one that is down completely).

Where to look
If you've got performance issues with your site that you can't tie to an increase in visits then it might be worth a look. The places where you can track referrer spam are a) in your server logs, b) in your site/cpanel statistics pages.

What you'll want to look at is your most recent hits and your most frequently requested pages. If you see something that surprises you (e.g., an invalid URL, or a URL that you don't think should be that busy) then note the IP address(es) and/or domain(s) of whoever is requesting it. If you ever see pages continually requested by only one IP address/domain, or by numerous IPs within the same range, that's not a good sign. Grab the IP address, do a whois lookup on it, and try to find out more. There are certain countries, for instance, where spam often originates.

Block that spammer
Ok, so now you're sure. Your site is being taken apart by a rogue bot. You've identified a fixed IP or a defined range of IPs that it's coming from. Now it's time to block this vermin using a little .htaccess magic:

To block a single IP address
(substituting the real IP for the placeholder x's, of course):

order allow,deny
deny from xxx.xxx.xx.x
allow from all
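
The same pattern stretches to a range of addresses, or to the referrer string itself. What follows is only a sketch, assuming Apache 2.2-style access control with mod_setenvif available. The addresses are reserved documentation ranges and the spam domain is made up, so swap in whatever your own logs turn up.

```apache
# Placeholder values only: 192.0.2.x and 198.51.100.x are documentation
# ranges, and spam-domain.example is a hypothetical referrer.
order allow,deny
allow from all

# Block a whole range, written either as a partial address or in CIDR form
deny from 192.0.2
deny from 198.51.100.0/24

# Flag requests whose Referer header matches a known spam domain, then deny them
SetEnvIfNoCase Referer "spam-domain\.example" bad_referrer
deny from env=bad_referrer
```

With "order allow,deny", any request that matches a deny line is refused even though "allow from all" is present. On Apache 2.4 these directives need mod_access_compat, or have to be rewritten with the newer Require syntax.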

10 February, 2007

Prepare your Drupal site to be Slashdotted, Dugg, and Farked

Slashdotted, Dugg, Farked. These are all terms that site operators, bloggers, and web developers are very familiar with. They imply having a site 'front paged' at a website that drives a LOT of traffic to your own site.

Over the past week one of the sites we host ended up on the front page of Fark.com and Foobies.com at exactly the same time. It added up to some very busy days for a site which is hosted in a shared environment (meaning that it has to share the resources of a server with other sites), as well as some useful knowledge concerning:

  • what kind of load a Drupal powered site can handle when in a shared environment
  • how to optimize Drupal's capability to handle a large number of visitors

To begin, it needs to be understood that overall optimization for site traffic is going to depend on a gazillion different factors. If you don't have a reliable server stack which is already optimized, this article will only do you so much good. Apache, MySQL, and PHP need to be running reliably and be well tuned.

Assuming you have a well tuned server, then how much traffic your Drupal powered site can handle will depend on:

The amount of resources it has available (CPU and memory particularly)
If your site is on a fully dedicated server that has 4 GB of RAM and 4 CPUs, it's obviously going to make a tremendous difference in what the site can handle, in comparison to a site which exists in a shared environment and only gets a fraction of those resources to use. This is common sense, of course. Eventually, if your server stack is fully optimized and your Drupal installation is fully optimized and your site still can't handle the load, then mo' better hardware is your only long-term choice.

How many features are enabled on the site, and which ones

One of the rather fun aspects of watching the site receive so much traffic was having a chance to test real world cause and effect with a number of Drupal/site features. Some of them make a very big difference in how much work needs to be done to generate a page view, and thereby how many people the site and server can reliably and consistently handle.

10 February, 2007

Drupal Intranet - Controlling access by role, per node, per user

Recently we had the pleasure of developing a very cool intranet for a group associated with the United Nations. They wanted an online space within which they could privately share articles, comments, and files with each other.

Our mission was to make a site that would:

  • Not let anonymous users view any content
  • Enable varying levels of viewing, adding, and editing rights across differing authenticated user roles - on a per page/node basis
  • Enable different/custom menu configurations based upon user role
  • Redirect users after a successful login attempt to a front page which is unique to user role

If you have never used Drupal before you may not know that the above functionality is not available out-of-the-box. However, with a little research we found some contributed modules which helped us to achieve a totally customizable intranet:

  • front page
  • login destination
  • menu per role
  • nodeaccess

8 February, 2007
