Monday, June 25, 2007

INNODB and strings


mysql> CREATE TABLE innodb_string_test (h varchar(512) ) TYPE=INNODB;
Query OK, 0 rows affected, 2 warnings (0.55 sec)

mysql> show warnings;
+---------+------+--------------------------------------------------------------------------+
| Level | Code | Message |
+---------+------+--------------------------------------------------------------------------+
| Warning | 1246 | Converting column 'h' from CHAR to TEXT |
| Warning | 1287 | 'TYPE=storage_engine' is deprecated; use 'ENGINE=storage_engine' instead |
+---------+------+--------------------------------------------------------------------------+
2 rows in set (0.00 sec)



Technically innodb supports defining varchar's greater then 255 characters (utf8 are double bytes so don't assume that 255 characters mean bytes-FYI).

MySQL the server doesn't allow this.


INNODB as of mysql-5.0.3 store stings in COMPACT format. This increases CPU usage slightly but saves nearly double the diskspace, thus given an application twice the memory. (Smaller data, more of it fits in memory).

To turn this feature off, define table ROW_FORMAT=REDUNDENT.

For more details on the Physical structure of strings in INNODB go here

In summary: use varchar over char.



[edited for correctness - thanks Ken Jacobs]

Looking at this slide from INNOBASE.com - in 5.1 they will start using a COMPRESSED format based off of zlib. Here are some numbers and details:

mysqluc2007











 UncompressedCompressed
File Size 2.8GB1.4 GB
Insert/sec 13001000
CPU Usage 5%-50%15%-50%

1 comment:

Anonymous said...

Actually, this post is talking about two separate types of storage optimizations in InnoDB.

First, beginning with V5.0, InnoDB introduced a "COMPACT" storage representation. It is not only for strings, but actually addresses overall overhead for row storage. In fact, this technique does not involve "compression" at all, but a change in the row format. See the description of the "COMPACT" storage format in the 2006 MySQL User Conference presentation, available on the InnoDB website: http://www.innodb.com/innodb/info.

The other storage optimization is under development, and uses ZIP compression to condense the data (all data, not just strings) on a block-by-block basis. As described in the 2007 MySQL conference presentation, this technique can reduce i/o as well as storage requirements very dramatically, at a small increase in CPU utilization. Compression is a totally different mechanism than the row format change, with even more dramatic effects on resource use.

Note: it is not known at this time whether ZIP compression will be available in the V5.1 timeframe.

Regards,

Ken Jacobs