Friday, December 29, 2006

INNODB_FILE_PER_TABLE

So, if your a frequent reader of this blog you know that I've mentioned thatI federates data.

So, if all servers are the same, query load is the same, and the same amount of data exists on all the shards then the overall load average and IO load metrics should be the same between each server containing it's slice of the data (these are called shards).

Well, looking over some very detailed stats, I noticed that these new boxes that are in production are kicking some major ass, these new boxes are able to handle nearly 5 times more data and queries. To good to believe, since overall they are the same as my old boxes but the main difference is the CPU in them.

Then I looked at some more my.cnf.tmpl tweaks (I have a my.cnf.tmpl that generates my.cnf by replacing certain fields when /etc/init.d/mysqld start is invoked). Low and behold I added innodb_file_per_table and innodb_open_files=1024. This must be the reason right?

In the past I've bench marked innodb_file_per_table, raw device, and a single table space and found no real difference in throughput, but now I'm seeing that these new boxes with this option is giving me an order of magnitude better performance. I just can't believe that a single option enabled can do this. I start looking at what is the different between two distinct shards.

Each shard has the same amount of memory and disk layout. The newer shard has x2 DUAL CORE CPU's so it looks like 4 CPU to Linux. Additionally the new shard has a newer Linux kernel. Hmmm is the combination of this feature + the new CPU the reason for the order of magnitude better performance. This is not a good justification for this blessing.

Well the older shard has been around for a year, while the newer one has been around for a few months, but this is not Windows: the servers do not degrade over time :P. But, wait INNODB does degrade over time. There is a lot of data inside the single table space that has been removed! The table space on the older shard has been around for a year, while the new table space on the newer DUAL Core CPU shards have only been around for a few months. Thus there are more gaps in the old table space then the new one. More gaps means more disk head movement and more random io for the same operation, when compared to the newer shard. This is the reason!

In conclusion, a huge amount of performance / scale can be gained out of a old database by rebuilding an INNODB table space. Personally I'm a big fan of innodb_file_per_table, since the win is easier maintenance and space control (it can reclaim gaps of data with an optimize table.)


How to rebuild an INNODB table space and reclaim disk space:

    Single table space:
  1. Dump the data to a text file

  2. Make a backup of the old table space

  3. stop mysql, rm -f the table space and log files

  4. start mysql

  5. re-import the data




    file per table table space (4.1 syntax):
  1. ALTER TABLE TYPE=INNODB;

3 comments:

Brian Moon said...

If you can live with MyISAM tables for a short while and want to reduce down time, you can convert all of your InnoDB tables to MyISAM tables instead of dumping them. This gives you the ability to stay up and running if you have lots of tables to convert. Then just convert them back to innodb after you remove the innodb table space.

Dathan Pattishall said...

This doesn't shrink the table space. Converting to myISAM then back to INNODB gives you no wins. If I wanted to just create the file per table I can also run

ALTER TABLE [tablename] TYPE=INNODB

Anonymous said...

Interesting, I'll have to play with that. In addition to database considerations, I'll throw in that the per table approach is also much better from a system administration perspective.

MyISAM is easier for moving data around/making backups, because the files are in distinct directories, with their own files per table. Grabbing one table and putting it on another server is much easier, and faster, because you don't have use mysqldump.

With innodb_file_per_table, InnoDB can be just as easy for system administrators.

-Nick
Wikinvest & Yahoo
http://www.yahoo.com
http://www.wikinvest.com/?utm_source=ff&utm_medium=email&utm_term=nick