performance

Example Varnish VCL for a Drupal / Pressflow site

A few months ago I set up Varnish on my MacBook Pro, and I have since deployed it on a production site which serves anonymous and (a lot of) authenticated users. Initially, I spent a couple of months just running it in my local environment, including backporting the Varnish.module to Drupal 5. In retrospect, I'm glad that I spent the time to learn how Varnish and its configuration file work before deploying it. It has paid off in a big way, as our production site now has the equivalent of:

  • ...an in-memory static file server for all users (e.g., the equivalent of hooking up something like nginx or lighttpd as a front end to Apache, or whatever you're using).
  • ...an in-memory boost.module in terms of database-relief for anonymous users.

Contrary to popular belief, the two items above are in no way an automatic benefit of simply installing Varnish. If the configuration file and the Drupal installation are not massaged with care, you definitely won't get the database relief from anonymous page caching, and the benefits of Varnish as a static file server will be far from optimal. Bottom line: Varnish can be a temperamental piece of software. It only gives back what you put into it.

To this end, the settings in the Varnish VCL file can make or break whether you get a substantial benefit from it. Below is an example VCL file, which was formed from a good amount of research and a lot of trial and error:

18 May, 2010

Scaling Drupal: HTTP pipelining and benchmarking revisited

UPDATE: I've updated some of the numbers below to correct a testing error. Let's just say: be sure not to benchmark with any external links in your test pages (if you do, you'll be benchmarking the external server too, which is not what we want in this case). To summarize the effect of these corrections - having lighttpd in front of Apache, and pipelining, actually provide a substantially larger boost in performance than I had indicated before. Other than that, the results are the same.

So things with my first attempt at benchmarking HTTP pipelining did not go exactly as planned. Based on previous testing, it turns out that if two different domains/subdomains you are using for content on your site point to the same IP, browsers (at least Firefox) will not pipeline requests (i.e., create more concurrent requests to your site), because they consider the requests as being from the same origin. For a browser to pipeline requests at all, it seems to require two domains/subdomains which use two separate, unique IPs. If you read the Wikipedia entry for hostnames this all makes sense, as it indicates domains are associated with IPs, and Browserscope's testing of browsers checks for "Connections per Hostname", not "Connections per Domain".

After figuring out how to get requests to pipeline correctly, I re-benchmarked all the configurations from the first article. Everything from that article regarding lighttpd still holds true, so without covering those aspects again, here are the updated benchmarks and notes for browser request pipelining:

  • Once the conditions for request pipelining were set up correctly, there were discernible performance implications, some of which I definitely wasn't expecting. At one end of the spectrum, with browser pipelining working (via string replacement of domains within the rendered HTML) and lighttpd serving the static files, there was an 11% increase in throughput vs not using the pipelining methods. So static file serving ='s good, and static file serving + HTTP pipelining ='s a little better.

    This is not where the story ends with pipelining, however: there was a net performance decrease from enabling pipelining in all configurations which did not use a separate static file server (in my case, lighttpd on the same machine)!
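The "string replacement of domains within the rendered HTML" mentioned above can be illustrated with a minimal Python sketch. This is not the code used for these benchmarks; the hostnames, the extension list, and the function name `rewrite_static_urls` are all assumptions made up for the example. It only touches absolute `src`/`href` URLs that end in a static file extension, so links to regular pages stay on the main host.

```python
import re

# Hypothetical hostnames, used only for illustration.
SITE_HOST = "www.example.com"
STATIC_HOST = "static.example.com"

# File extensions treated as static here; adjust for your own site.
STATIC_EXTENSIONS = ("css", "js", "png", "jpg", "gif")

# Matches src="..." / href="..." attributes whose URL is on SITE_HOST and
# whose path ends in a static extension, optionally followed by a query string.
_STATIC_URL = re.compile(
    r'(?P<attr>(?:src|href)=")http://' + re.escape(SITE_HOST)
    + r'(?P<path>/[^"?]*\.(?:' + "|".join(STATIC_EXTENSIONS) + r'))'
    + r'(?P<rest>(?:\?[^"]*)?")'
)


def rewrite_static_urls(html: str) -> str:
    """Point absolute static-file URLs at STATIC_HOST, leaving page links alone."""
    return _STATIC_URL.sub(
        lambda m: f'{m.group("attr")}http://{STATIC_HOST}{m.group("path")}{m.group("rest")}',
        html,
    )


if __name__ == "__main__":
    page = (
        '<link href="http://www.example.com/misc/drupal.css?v=1" rel="stylesheet">'
        '<a href="http://www.example.com/node/1">A node</a>'
    )
    print(rewrite_static_urls(page))
```

Only the stylesheet URL is moved to the static hostname in this sample; the link to `/node/1` is left untouched because it has no static extension.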

27 January, 2010

Scaling Drupal: Benchmarking static file serving with lighttpd and browser pipelining

I finally had a chance to investigate an optimization which I've been wondering about for a while now - serving static files of a site from somewhere else. As a side, but related, experiment I also tested the claim that serving files from a static file server/separate domain/subdomain will speed things up because it results in browsers opening more concurrent requests than they would from a single domain.

For my tests I used lighttpd (pron. lighty) as a static file server for Apache. The idea is that lighttpd, which is acclaimed as being fast and light on memory, will serve the non-dynamic parts of the site (images, CSS, JavaScript, etc.), which should help relieve Apache of some of its workload. This arrangement involves changing the paths to these static resources, either on the backend or the frontend, so that they no longer get served by Apache.

The pieces
All tests took place on my MacBook Pro and involved two pages on a large Drupal 5 site running Pressflow. For the static file server itself, I installed lighttpd using MacPorts. Two separate pages of the site were tested. The smaller page's number of static files was about average for most sites. The larger of the two pages was very large: 39 CSS files, 23 JavaScript files, and 46 image files.

Methods tested and benchmarked
I implemented and benchmarked the following methods of path modification in order to enable static file serving:

25 January, 2010

Convert your MySQL database from MyISAM to InnoDB, and get ready for Drupal 7 at the same time

If you haven't already heard, Drupal 7 will default to using the InnoDB storage engine instead of MyISAM for MySQL (though a MyISAM database will continue to work just fine in Drupal 7). This is a fairly substantial change within Drupal core, and as the thread in the issue queue I linked to shows, there were a lot of questions and apprehension about it. However...

...we are going to just skip over a lot of that apprehension and get down to the point of this article - there's no good reason not to hop right into using InnoDB today on your Drupal 5 or Drupal 6 site. The rewards are: a possibly significant improvement in performance, a definite improvement in scalability (most highly trafficked Drupal sites have been using InnoDB for some time now because of this), and a head start on getting used to what will be more and more common in your Drupal-life, InnoDB.

My experience
I came to the conclusion about how great InnoDB is after researching the experiences of others, and after converting a large Pressflow-driven Drupal 5 site from MyISAM to InnoDB. This change resulted in a 14% increase in throughput during load tests performed in JMeter. That's a very substantial increase, and while everyone's mileage will vary based on their own site, server, and any number of variables, it's clear enough to me that there's nothing to be afraid of as far as InnoDB goes (quite the contrary).

Converting your database to InnoDB
Before you go any further, back up your database. If you 'splode your database for any reason, you'll need the backup.

Here are the steps:

1. Shut down MySQL

2. Move/copy/change the name of the ib_logfile0 and ib_logfile1 files (find where MySQL exists on your system - locations can vary greatly). MySQL will recreate these files when you start it up again. Note: any time you change the innodb_log_file_size parameter you will need to shut down MySQL, move these files, and start up MySQL again.

3. Tune it up a bit
Based on a lot of searching around and benchmarking with JMeter, I arrived at the settings below for running on my MacBook Pro. See the links at the end of this post for articles which can help you determine what to adjust these numbers to for other machines (ones with more RAM/CPU, for instance). The production server for this particular site ended up with a 5000M setting for innodb_buffer_pool_size, so settings will, and should, vary greatly depending on the machine.
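For the conversion itself, MySQL changes a table's storage engine with `ALTER TABLE ... ENGINE=InnoDB`. The following is a minimal Python sketch, not the exact procedure from this post: the helper name `innodb_conversion_sql` and the sample table names are assumptions for illustration. It only builds the SQL statements as text so that you can review them before running anything against a (backed up) database.

```python
def innodb_conversion_sql(tables):
    """Return one ALTER TABLE ... ENGINE=InnoDB statement per table name."""
    statements = []
    for name in tables:
        # Backticks quote the identifier; a literal backtick is doubled.
        quoted = "`" + name.replace("`", "``") + "`"
        statements.append(f"ALTER TABLE {quoted} ENGINE=InnoDB;")
    return statements


if __name__ == "__main__":
    # Sample table names only; list the tables of your own database here.
    for statement in innodb_conversion_sql(["node", "users", "watchdog"]):
        print(statement)
```

Generating the statements separately from executing them keeps the step easy to inspect, and makes it simple to leave out any table you would rather keep on MyISAM.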

18 January, 2010

Speeding up svn / ssh transfer speed in terminal for Mac OS 10.5, Leopard

After struggling with upload/commit transfer speeds that were absolutely crippling, I've finally managed to get a dramatic speed boost for svn merging and for svn committing large numbers of files.

For svn merging, see this post by the good folks at WorkHabit.org.

For general SVN transfer speed help - hop into your /etc/ssh_config file and add these lines to your file:

# Host *
Compression no
FallBackToRsh yes
KeepAlive yes

Enjoy not seeing upload times of 2k/sec anymore. :-)

1 July, 2008

Speeding up Drupal Forums

The Drupal forum.module has become, well, somewhat infamous for less than awesome scalability. Recently I had a chance to see this firsthand, and to track down a solution for the long page load times of a client who has a highly trafficked forum. This was not a case of a site that was un-tuned - actually this particular site had a lot of good work and performance enhancements already done to it, including block caching and even some modifications to the forum module that were allowing it to work better than it would have without them. But still, 5-6 second page load times on /forum persisted.

As this was my first time working on the site, I began by reviewing all of the main configuration files for Apache, MySQL, and PHP, since they are the foundation for everything else. After making some adjustments there, I headed over to the Drupal admin interface, checked /admin/settings/performance/, and made sure all was happy there as well. Finally, I went to the block admin page and double-checked all of the blockcache settings, which as it turns out were set a bit too aggressively, resulting in slow form submission times (every time anyone submitted anything, a gazillion blocks were being re-cached whether they needed to be or not).

With the foundation of the site now looking good and everything except the forum pages flying along (and the tracker, but that's a story for a later article), it was just down to forum.module.

Here are the steps that led to cutting the page load times in half:

1. Disable the forum.module, rename it to something like "forum.module.orig", make a copy and then rename the copy "forum.module".

2. Download the advcache module and apply the forum patch it includes to your new copy of the forum.module.

3. If you want the cleanest solution and are comfortable with coding/debugging, then instead of just copying the forum module and working on it directly (and thus having a hacked Drupal core file around all the time), name the copied file something completely different from forum.module, edit all of the hook/function names within it to use the new name, and place it where you keep the rest of your contributed modules.

4. (Note: this step is an option only if all your forums are public)
Open the module and remove all references to db_rewrite_sql. This will keep Drupal from running a lot of expensive and unnecessary queries in order to check access rights. (thanks Khalid)
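Before editing by hand for step 4, it can help to see exactly where db_rewrite_sql is called. The following is a minimal Python sketch under stated assumptions: the helper name `find_db_rewrite_sql` and the sample PHP snippet are made up for illustration, and the script only reports the calls, it does not modify the module.

```python
import re

# Matches a call to db_rewrite_sql, allowing whitespace before the parenthesis.
_CALL = re.compile(r"\bdb_rewrite_sql\s*\(")


def find_db_rewrite_sql(source: str):
    """Return (line_number, line_text) for each line that calls db_rewrite_sql."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if _CALL.search(line)
    ]


if __name__ == "__main__":
    # Hypothetical PHP snippet standing in for the copied module file.
    sample = '''<?php
$sql = db_rewrite_sql("SELECT n.nid FROM {node} n");
$result = db_query($sql);
'''
    for number, text in find_db_rewrite_sql(sample):
        print(f"line {number}: {text}")
```

To run it against a real file, read the module's contents into a string and pass that to the function instead of the sample.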

28 February, 2008
