Convert your MySQL database from MyISAM to InnoDB, and get ready for Drupal 7 at the same time

If you haven't already heard, Drupal 7 will default to using the InnoDB storage engine instead of MyISAM for MySQL (though a MyISAM database will continue to work just fine in Drupal 7). This is fairly substantial change within Drupal core, and as the thread in the issue queue I linked to shows, there were a lot of questions and apprehension about it. However...

...we are going to just skip over a lot of that apprehension and get down to point of this article - there's no good reason not to hop right into using InnoDB today on your Drupal 5 or Drupal 6 site. The rewards are; a possibly significant improvement in performance, a definite improvement in scalability (most highly trafficked Drupal sites have been using InnoDB for some time now because of this), and you'll start getting used to working with what will be more and more common in your Drupal-life, InnoDB.

My experience
I came to the conclusion about how great InnoDB is after researching the experiences of others, and after converting a large Pressflow-driven Drupal 5 site from InnoDB vs MyISAM. This change resulted in a 14% increased throughput during load tests performed in JMeter. That's a very substantial increase, and while everyone's mileage will vary based on their own site, server, and any number of variables it's clear enough to me that there's nothing to be afraid of as far as InnoDB goes (quite the contrary).

Converting your database to InnoDB
Before you go any further backup your database before doing any steps below. If you 'splode your database for any reason, you'll need it.

Here are the steps:

1. Shutdown MySQL

2. Move/copy/change the name of ib_logfile0 and ib_logfile1 files. (find where MySQL exists on your system - locations can vary greatly). MySQL will recreate these files when you start it up again. Not anytime you change the innodb_log_file_size parameter you will need to shutdown MySQL, move these files, and start up MySQL again.

3. Tune it up a bit
Based on a lot of searching around and benchmarking with JMeter I arrived at the setting below for running on my Macbook Pro. See the links at the end of this post for articles which can help you determine what to adjust these numbers to for other machines (ones with more RAM/CPU, for instance. The production server for this particular site ended up with 5000M setting for innodb_buffer_pool_size. So settings will, and should, vary greatly just depending).

18 January, 2010

Provisioning and install script for a speedy Drupal workflow

I made this script and the database backup, dump, and SVN commit script because I was determined to spend as little time as possible doing sysadmin while setting up dev and staging sites, so that I could spend as much time as possible developing (e.g., the fun stuff). With one command the script can:

  • 'svn up' a version controlled database, and upload it to your database
  • Run queries against database to set preferred site defaults
  • 'svn up' site docroot
  • Copy over fresh "files" directory from another site (e.g., production). Note, not a good option if you have your "files" directory version controlled.
  • Set owner:group file permissions on all site files

How to
The script should go in non-public, secure directory, which is somewhere below the site doc root. I haven't tried running it with permissions less than sudo, though it may be possible. Invoke by doing:

sh path/to/


28 March, 2009

Database backup, dump, and SVN commit script for Drupal workflow

Published in: 

This script is useful for keeping a database within reach of an 'svn up' anywhere it needs to be deployed (e.g., dev and staging sites). In addition to creating a database dump, it also svn commits the database. I use it in conjuction with this provisioning / install script, but it can be used on its own.

How to
The script should go in non-public, secure directory, which is somewhere below the site doc root. I haven't tried running it with permissions less than sudo, though it may be possible. Invoke by doing:

sh path/to/

It has logging and verbose output for confirmation of it's operations on the command line.

Future possiblities includes doing something like this to cut down on the size of the diffs committed, but since haven't gotten a chance to test that in a prodcution environment, am sticking with with plain-and-simple in order to guarantee integrity of the SQL file.

Download (dump and commit, plus update file permissions, a feature not related to backup - simply for maintenance reasons. I use this one myself, but use the other script if this feature is not wanted or needed)

28 March, 2009

Speeding up Drupal Forums

The Drupal forum.module has become, well, somewhat infamous for less than awesome scalability. Recently I had a chance to see this firsthand, and track down a solution for managing the long page load times for a client who has a highly trafficked forum. This was not a case of a site that was un-tuned - actually this particular site had a lot of good work and performance enhancements already done to it, including block caching and even some modifications to the forum module that were allowing to work better than it would have without them. But still 5-6 second page load times on /forum persisted.

As this was my first time working on the site, I began by reviewing all of the main configuration files for Apache, MySQL, and PHP, since they are the foundation for everything else. After making some adjustment there, I headed over to the Drupal admin interface and check /admin/setting/performance/ and made sure all was happy there as well. Finally, I went to the block admin page and double-checked all of the blockcache settings, which as it turns out were set a bit to aggressively, resulting in slow form submission times (every time anyone submitted anything a gazillion blocks were being re-cached whether they need to be or not).

With the foundation of the site now looking good and everything, except the forum pages flying along (and the tracker, but that's a story for a later article) - it was just down to forum.module.

Here are the steps that led to cutting the page load time in half from what they were:

1. Disable the forum.module, rename it to something like "forum.module.orig", make a copy and then rename the copy "forum.module".

2. Download the advcache module and apply the forum patch includes to your new copy of the forum.module.

3. If want the cleanest solution and are comfortable with coding/debugging at all, instead of just copying the forum module and working on it directly (and thus having a hacked Drupal core file around all the time) - name the copied file something completely different than forum.module and edit all of the hook/function calls with in it with new name and place it where you keep the rest of your contributed modules.

4. (Note: this step is an option only if all your forums are public)
Open the module and remove all references to db_rewrite_sql. This will keep Drupal from doing a lot of expensive and uneccesary queries in order to check access rights. (thanks Khalid)

28 February, 2008

Prepare your Drupal site to be Slashdotted, Dugg, and Farked - Part II

Published in: 

It's been a couple weeks since we posted part one of our look at optimizing a Drupal site to withstand large amounts of traffic,
and since that time it happened again - a site we host, got "Farked" (an inbound link from even bigger than it did last time. In the 8 short hours since the link to the client's site went up, and as I write this - the site has received 27,000 + unique viewers. When I logged in there to the site there actually were 1850 users online at the same time.

We just about fell out of our chair when we saw that...

...after all this is a site is that's on a shared server - not a dedicated one. And those kind of numbers would even give some dedicated servers a thorough workout. In the meantime, it was operation 911. Forget the long term issue of finding a larger server space for this site which clearly is outgrowing it's enviroment - that could be handled afterwards. Right now, we had to get the server and site back pronto.

So quickly we posted a "We'll be back soon" message to the deluged site so that the rest of the sites on the server would work again, and then set out to get Farked site back up. Now 27,000 visitors in 8 hours would mean over 75,000 in 24. There's no shared hosting enviroment we know of that can handle a dynamic, php-driven, website with kind of what to do...what to do.

Well, in short order we made an html file out of the page which linked to and put a 301 redirect in the site .htaccess file so that anyone visiting that page would get redirected to the much lighter html file and thereby bypass Drupal and all of the bootstrapping and database overhead. The rest of the pages on the site would function just as they always do, of course. What this did was allow us to take the site live again, serve wayyyy more people than should be possible for on a $20 a month hosting plan, and keep everyone else's site happy and screaming along.

Two other things we did since the time we wrote the first article helped a lot too:

condensed the number of, and compressed, the css files for the site. Simply put, we cut the number of files requested on each page visit by 6 by aggregating all the css into one file. That's a lot of requests when you multiply it by thousands.

Disabled the statistics.module. This is a no-brainer. Apache is logging eveything already and there are more robust tools to process and interpret the logs with than what comes with Drupal (no offense Drupal) - so this is a LOT of overhead that the site/server doesn't need to deal with. Besides the input/output processesing, it also speeds up the database to not have such a larger access log file hangin around.

So there you go - how to make lemonade out of more hits than you should be able to handle. 

15 February, 2007

Import raw content (nodes, users) into Drupal

Published in: 

(This article is made specifically for showing how to get raw content into Drupal, but the methods below can likely be adapted for other use cases [we also imported a list of users using these techniques for instance].)

Importing raw content into Drupal, can be, well let's just say it's not always easy. In two years of experience with Drupal we've imported a MoveableType site, a Blogspot site, content from one Drupal site to the other, and now raw data from a csv file (raw meaning that it did not come from some other CMS/website), and we've used everything from custom scripts, several different Drupal modules along the way including import-exportapi.module, userimport.module, nodeimport.module, WordPresstoDrupal.module.With the exception of the WordPresstoDrupal.module we've been mostly disappointed with the results of our experiments with the Drupal based import options. That said, this may not be the best method to start off with if you can find a ready-made solution that works for you, since it is somewhat time consuming.

Besides being a somewhat complex thing to begin with, the issues around importing content into a Drupal installation are exacerbated by the wide array of variables any point and click solution has to contend with. It's impossible for any module developer to be able to keep up with the number of different/custom Drupal content types for even one version of Drupal, let alone for all new versions of Drupal.

In short, for the serious site-operator/Drupal developer getting raw content imported to Drupal is always going to be a problem because one size is never going to fit all (the 'bad' side of such a customizable platform). This is important to understand in order to fully appreciate the workaround below, which is not particularly easy or fast, but is 99.9% reliable from a data integriy standpoint.

"How's that," you say? Well, because the following method does not actually "import" anything at all, at least not in the traditional sense that most people think of when adding new content into a database. With this method you'll create complete Drupal nodes with SQL, which you will then simply place it in your existing database. Nothing is automatically calculated, scripted, or extrapolated - which is exactly why this method is not easy or particularly fast, but also why it is precise:

In brief:
Create two separate sql import files. One for the "node" table of your database, and another for the "node_revisions" table. The last step will be to update the "sequences" table to let the system 'know' about the new content.

(these "node" and "node_revisions" tables are used only for certain module based content types not for cck nodes - though you could adapt these instructions to create new cck nodes by figuring out what tables you need to populate and using the techniques below)

What you need:

29 January, 2007


Subscribe to RSS - MySQL