Tuesday, March 31, 2009

What do you think about adding ZLIB to memcache storage?

Memcache is a fantastic hash table: very fast, and one of the great successes of Brad Fitzpatrick, who in my opinion has done more for the open social movement as an individual than anyone else. I use memcache quite extensively, and now I am thinking about adding native ZLIB support to compress the value of each key, much like InnoDB does with the Barracuda file format. The theory is that, at the cost of a CPU hit, we can store more data per memcache instance. I've talked to the Northscale guys and they love the idea. What do you think?
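To put a rough number on the "more data per instance" theory, here is a minimal sketch in Python using the standard `zlib` module. The sample value is made up for illustration; real savings depend entirely on how repetitive your cached values are.

```python
import zlib

# Hypothetical cached value: repetitive serialized data, which tends to compress well.
value = b'{"user_id": 12345, "name": "example", "status": "active"}' * 50

# Level 6 is zlib's usual speed/size trade-off; higher levels cost more CPU.
compressed = zlib.compress(value, 6)

# The round trip must give back exactly the original bytes.
assert zlib.decompress(compressed) == value

ratio = len(compressed) / len(value)
print(f"raw={len(value)} bytes, compressed={len(compressed)} bytes, ratio={ratio:.2f}")
```

Values that are already compressed or close to random (images, encrypted blobs) will not shrink, so the CPU hit would buy nothing for them.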


Update: Well, what do you know

http://us3.php.net/manual/en/function.memcache-setcompressthreshold.php

The PHP client, for instance, already compresses the data on the client side.

There still might be some value in compressing the data on the server side, but now I'm not as motivated.
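For reference, threshold-based client-side compression can be sketched roughly like this in Python. This is not the PECL implementation: the constant names and values are hypothetical, and the only assumptions are that the client compresses values above a size threshold, keeps the compressed form only if it saves enough space, and marks compressed items using the per-item flags field that memcached stores alongside each value.

```python
import zlib

# All three constants are hypothetical; real clients choose their own values.
COMPRESSED_FLAG = 0x2   # flag bit meaning "payload is zlib-compressed"
THRESHOLD = 2000        # only try to compress values of at least this many bytes
MIN_SAVINGS = 0.2       # keep the compressed form only if it saves at least 20%

def encode(value: bytes, flags: int = 0):
    """Return (payload, flags) as the client would send them to the server."""
    if len(value) >= THRESHOLD:
        packed = zlib.compress(value)
        if len(packed) <= len(value) * (1 - MIN_SAVINGS):
            return packed, flags | COMPRESSED_FLAG
    # Too small, or compression did not pay off: store the value untouched.
    return value, flags

def decode(payload: bytes, flags: int) -> bytes:
    """Undo encode() using the flags returned by the server with the item."""
    if flags & COMPRESSED_FLAG:
        return zlib.decompress(payload)
    return payload

payload, flags = encode(b"row data " * 1000)
assert decode(payload, flags) == b"row data " * 1000
```

With this approach the server stores the compressed bytes as-is, so both memory and network traffic shrink, but all the CPU cost lands on the client.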

What might be a good alternative is to have memcached automatically compress keys into 8-byte longs instead of storing the actual string, which can be huge. To give some more detail: an 8-byte long is a 64-bit int. A string can easily be converted into a big int by bit manipulation, and the address space is huge, so key conflicts become very unlikely.
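A minimal sketch of the key idea in Python, assuming a hash function stands in for the "bit manipulation" (the post does not say which scheme it has in mind). The function names are hypothetical. The second function is the standard birthday-bound approximation for how likely a collision is among randomly distributed 64-bit keys.

```python
import hashlib

def key_to_u64(key: str) -> int:
    """Map an arbitrary-length string key to an unsigned 64-bit integer."""
    # BLAKE2b with an 8-byte digest is an assumption; any well-mixed 64-bit hash would do.
    digest = hashlib.blake2b(key.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")

def collision_probability(n_keys: int) -> float:
    """Birthday-bound estimate: chance that any two of n_keys 64-bit hashes collide."""
    return n_keys * (n_keys - 1) / 2 / 2**64

long_key = "user:12345:profile:" + "x" * 200
print(key_to_u64(long_key))            # always fits in 8 bytes
print(collision_probability(10**6))    # one million keys
print(collision_probability(10**9))    # one billion keys
```

By this estimate a collision is negligible at a million keys but reaches a few percent at a billion, so a server doing this would still need some way to detect or tolerate two different strings mapping to the same 8-byte key.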

8 comments:

rsynnott said...

Sounds reasonable, as an option. It would not work well for all workloads.

Nathan said...

Compress on the client side; that's what the flags byte is for, and that's what happens already in most client libs I've seen. Client-side compression cuts down on network traffic as well.

Anonymous said...

You mean extending libmemcached/etc to do compression on keys before sending?

I'd take a patch on this.

Anonymous said...

It's already an option in a number of clients, generally triggered on the size of the value being put into cache.

Mark Robson said...

Just do it in the client - it's more scalable there. I'm assuming you generally have more clients than servers (this is not necessarily true in every configuration).

Dathan Pattishall said...

@topbit what client does this already? It handles zlib compression automatically?

@krow: yes (I think you mean values)

@mark: on the client I'm running out of CPU cycles.

Dathan Pattishall said...

@Nathan what client does this? I thought it was compression for net bytes only. For instance, the PECL memcache lib doesn't indicate that the data is stored compressed.

Anonymous said...

I think server-side compression would be a great thing in the memcached server, especially when the client is already under heavy CPU load and cannot afford to compress. I'd very much like to see this functionality in memcached.