Monday, February 06, 2012

SpyMemcached Transcoder with PHP PDO Objects using ZLIB

My technology stack serves more than 2 million daily active users. It's very basic: PHP talks to MySQL, Memcache, RabbitMQ, Gearman and Facebook. Now that we have more Java, specifically to support our SmartFox Server and other services, blurring the line between what data is set in PHP and what data is read in Java has become necessary.

MySQL Connector/J makes reading MySQL data from Java as simple, IMHO, as PHP's PDO. What is hard is reading the PHP serialized format written by PHP's Memcache library.
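To give a feel for what that serialized format looks like, here is a minimal sketch of a reader for a small subset of it (null, booleans, integers, strings and arrays). The class name PhpSerializedSketch and its helper methods are made up for illustration. It is not the org.lorecraft.phparser implementation, and it assumes ASCII-only strings, because PHP records string lengths in bytes rather than characters.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: a toy reader for a subset of PHP's serialize() output.
public class PhpSerializedSketch {
    private final String in;
    private int pos = 0;

    public PhpSerializedSketch(String in) {
        this.in = in;
    }

    /** Parses one value starting at the current position. */
    public Object parse() {
        char type = in.charAt(pos);
        switch (type) {
            case 'N':                         // N;
                pos += 2;
                return null;
            case 'b':                         // b:1;
                return readScalar().equals("1");
            case 'i':                         // i:42;
                return Long.valueOf(readScalar());
            case 's': {                       // s:5:"hello";
                pos += 2;                     // skip "s:"
                int len = readLength();       // length, then ':'
                pos++;                        // opening quote
                String s = in.substring(pos, pos + len); // assumes ASCII (1 byte per char)
                pos += len + 2;               // closing quote and ';'
                return s;
            }
            case 'a': {                       // a:2:{key;value;key;value;}
                pos += 2;                     // skip "a:"
                int count = readLength();     // element count, then ':'
                pos++;                        // '{'
                Map<Object, Object> m = new LinkedHashMap<>();
                for (int k = 0; k < count; k++) {
                    Object key = parse();
                    Object value = parse();
                    m.put(key, value);
                }
                pos++;                        // '}'
                return m;
            }
            default:
                throw new IllegalArgumentException(
                    "Unsupported type '" + type + "' at offset " + pos);
        }
    }

    // Reads the text between "x:" and ';' and moves past the ';'.
    private String readScalar() {
        int end = in.indexOf(';', pos);
        String v = in.substring(pos + 2, end);
        pos = end + 1;
        return v;
    }

    // Reads the digits up to the next ':' and moves past the ':'.
    private int readLength() {
        int end = in.indexOf(':', pos);
        int n = Integer.parseInt(in.substring(pos, end));
        pos = end + 1;
        return n;
    }

    public static void main(String[] args) {
        // What PHP's serialize(array('name' => 'Alice', 'age' => 30)) produces.
        String php = "a:2:{s:4:\"name\";s:5:\"Alice\";s:3:\"age\";i:30;}";
        System.out.println(new PhpSerializedSketch(php).parse());
    }
}
```

A real parser also has to cope with doubles, objects, references and multi-byte strings, which is why a dedicated library is the better choice for production use.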

In PHP there are two main C-backed libraries. There is Memcache, the original PHP library, which I happen to use, and Memcached, which is the library I wanted to use but didn't deploy because the EC2 package system conflicted and caused issues (I fixed them, but too late to deploy). Memcache stores data in PHP's serialized format and compresses it via ZLIB, while Memcached can store data as PHP's serialized format, JSON, binary serialized (which is rather awesome), or JSON array notation, and has a multitude of compression formats, none of which are pure ZLIB that I noticed.

Here is the problem. Spymemcached is a library for talking to memcache, but it can't unserialize PHP's serialized format (or read it natively and return a string), and it cannot decompress ZLIB, although it can decompress GZIP. A great speed-up would be to read the serialized data that PHP sets, so that PHP and Java share memcache resources just as they already share the MySQL resources.
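The ZLIB versus GZIP difference is easy to reproduce with nothing but java.util.zip. The following is a minimal sketch, with made-up class and method names, that compresses a PHP-style payload into the zlib format, then shows that GZIPInputStream rejects it while InflaterInputStream reads it back. It assumes the bytes PHP writes are a plain zlib stream, which is what the post describes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical illustration only: zlib-wrapped data is not readable as GZIP.
public class ZlibVsGzipSketch {

    /** Compresses into the zlib format (the default for java.util.zip.Deflater). */
    public static byte[] zlibCompress(byte[] raw) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(bos)) {
            dos.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** Drains a stream into a byte array. */
    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] buf = new byte[2048];
        int r;
        while ((r = in.read(buf)) > 0) {
            bos.write(buf, 0, r);
        }
        return bos.toByteArray();
    }

    /** Returns true only if the bytes can be read as a GZIP stream. */
    public static boolean gzipCanRead(byte[] compressed) {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            readAll(in);
            return true;
        } catch (IOException e) {
            // GZIPInputStream checks for the GZIP magic header and fails on zlib data.
            return false;
        }
    }

    /** Inflates zlib-format bytes back into a UTF-8 string. */
    public static String zlibInflate(byte[] compressed) {
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            return new String(readAll(in), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] z = zlibCompress("s:5:\"hello\";".getBytes(StandardCharsets.UTF_8));
        System.out.println("GZIP can read it: " + gzipCanRead(z));
        System.out.println("Inflated: " + zlibInflate(z));
    }
}
```

This is why a custom transcoder has to replace the decompress step with one built on InflaterInputStream.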

What needs to be done? Well, build your own Transcoder for Spymemcached. Fortunately Spymemcached documented an interface to do just that.

What is needed: implement the spymemcached Transcoder interface, use org.lorecraft.phparser to unserialize the PHP data, and return the Object.

 Below is the code.

package com.schoolfeed.spymemcached;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.InflaterInputStream;

import net.spy.memcached.CachedData;
import net.spy.memcached.compat.CloseUtil;
import net.spy.memcached.transcoders.BaseSerializingTranscoder;
import net.spy.memcached.transcoders.Transcoder;
import org.lorecraft.phparser.*;

public class PHPSerializedTranscoder extends BaseSerializingTranscoder implements Transcoder<Object> {

 // Flag bit that PHP's Memcache library sets on compressed values
 static final int COMPRESSED=2;

 /**
  * Get a serializing transcoder with the default max data size.
  */
 public PHPSerializedTranscoder() {
  this(CachedData.MAX_SIZE);
 }

 /**
  * Get a serializing transcoder that specifies the max data size.
  */
 public PHPSerializedTranscoder(int max) {
  super(max);
 }

 /**
  * Decode the byte data from Memcache, decompress it if necessary and return the Object.
  * @param d the cached data whose bytes are turned into an object
  * @return the parsed PHP value, or the raw string if it is not PHP serialized data
  */
 public Object decode(CachedData d){
  byte[] data=d.getData();
  Object rv=null;
  String ds="N;"; // PHP's serialized null, used when there is no data
  if((d.getFlags() & COMPRESSED) != 0) {
   getLogger().debug("Looks like d is compressed");
   data=decompress(data);
  }
  if(data != null) {
   ds=decodeString(data);
  }
  getLogger().debug("DECODED: [" + ds + "] about to SerializedPhpParser");
  SerializedPhpParser sp = new SerializedPhpParser(ds);
  try {
   rv = sp.parse();
   getLogger().debug("Parse was cool!!");
  } catch(Exception e){
   // Fall back to the plain string when the value is not PHP serialized
   getLogger().debug("Not a PHP Object? : " +  ds);
   rv = ds;
  }
  return rv;
 }

 /**
  * PHP Memcache stores compressed data in ZLIB format. Override the base class
  * decompress method to handle ZLIB.
  * @param in raw data from Memcache
  * @return the decompressed bytes, or null if decompression failed
  */
 protected byte[] decompress(byte[] in) {
  ByteArrayOutputStream bos=null;
  final int BUFFER = 2048;
  if(in != null) {
   ByteArrayInputStream bis=new ByteArrayInputStream(in);
   bos=new ByteArrayOutputStream();
   InflaterInputStream iis = null;
   try {
    iis = new InflaterInputStream(bis);

    byte[] buf=new byte[BUFFER];
    int r=-1;
    while((r=iis.read(buf, 0, BUFFER)) > 0) {
     bos.write(buf, 0, r);
    }
   } catch (IOException e) {
    getLogger().warn("Failed to decompress data", e);
    bos = null;
   } finally {
    CloseUtil.close(iis);
    CloseUtil.close(bis);
    CloseUtil.close(bos);
   }
  }
  return bos == null ? null : bos.toByteArray();
 }

 /**
  * encode -- not implemented yet
  */
 public CachedData encode(Object o){
  int flags = 0;
  byte[] b=new byte[0]; // placeholder: empty payload until encode is implemented
  return new CachedData(flags, b, getMaxSize());
 }

 /**
  * No need to async decode, let's do it in real time.
  */
 public boolean asyncDecode(CachedData d) {
  return false;
 }
}


This is a stop-gap solution until we make the transition to Memcached with JSON encoding. Then I can use Jackson-JSON - which is a fast JSON encoder/decoder for Java enabling a great portable message protocol between the two stacks and nearly any other language we might add to the system (like Python).


Chuck Hagenbuch said...

Side question - do you do inline message delivery to rabbitMQ? Or defer it out of the PHP process somehow? Have you had any experience with rabbitMQ delaying messages and holding on to connections? We've been dealing with some problems where an overloaded rabbit server can eventually eat up all Apache processes. We've been developing a logfile backup and faster timeouts to deal with it, but I'm curious how others handle this.

Dathan Pattishall said...

I'm glad you asked this question. Let me 1st say it was a pain in the ass to get right.

Basically what I do is set auto-delete on the queues and exchanges, as well as auto-ack.

Next I put a time limit on how long messages can exist in the queue.

Finally I limit the number of consumers per queue.

Memory pressure causes rabbit to block producers, which increases the number of Apache HTTP processes if you are writing from a web process. I specifically wanted to avoid writing to a queue to write to another queue.

Anonymous said...

Kind of wish php-memcached would support MsgPack, which has larger support in other languages than igbinary.