Monday, May 16, 2011

Installing GearmanD on Amazon's EC2 Linux AMI Small Instance

Amazon's Linux AMI is very close to CentOS/Red Hat, but it is Amazon's own distro. Here are some quick steps for installing gearmand on it. I am currently using it to distribute jobs across many instances, running them asynchronously or synchronously so that Apache isn't blocked on long-running tasks such as fetching data from a website in real time, massaging the data and returning it to the browser, or resizing images.

# put the stuff in /var/tmp

cd /var/tmp;

# get the source

# set up the libs required for the source to compile

yum install -y libevent-devel.i386
yum install -y gcc-c++.i386
yum install -y boost-devel.i386   # C++ libs
# do not install the i386 version of libuuid-devel: it puts uuid.h in /usr/include, not /usr/include/uuid/
yum install -y libuuid-devel.i686
yum install -y memcached-devel.i686
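
The uuid.h note above is easy to verify before running configure. Here is a minimal sketch; `have_header` is a hypothetical helper written for this post, not part of any package, and it assumes the headers live under /usr/include unless told otherwise.

```shell
# Hypothetical helper: succeed if a header exists under the given include root.
# The second argument is optional and defaults to /usr/include.
have_header() {
    header=$1
    include_root=${2:-/usr/include}
    [ -f "$include_root/$header" ]
}

# Check for the location the note above says the build needs.
if have_header uuid/uuid.h; then
    echo "found uuid/uuid.h"
else
    echo "uuid/uuid.h is missing; check which libuuid-devel package was installed" >&2
fi
```

If the check fails, reinstalling the .i686 package as described above is the first thing to try.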

# extract the source

tar xvzf gearmand-0.20.tar.gz

# configure / make / make install

cd gearmand-0.20
./configure --prefix=/usr
make && make test
make install

# add the user and run it
adduser gearmand
/usr/sbin/gearmand -u gearmand
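
# quick check that the server answers
# A small sketch, not from the original steps. It assumes gearmand is
# listening on its default port 4730, that nc (netcat) is installed, and
# it uses the "version" command of gearmand's plain-text admin protocol.
# gearmand_version is a hypothetical helper name.

```shell
# Ask a gearmand server for its version over the text admin protocol.
# Usage: gearmand_version [host] [port]
gearmand_version() {
    host=${1:-localhost}
    port=${2:-4730}
    # -w 2 makes nc give up after two seconds instead of hanging
    printf 'version\n' | nc -w 2 "$host" "$port"
}

# Example (run it once gearmand is up):
#   gearmand_version localhost 4730
```

An empty reply or a connection error usually means the daemon is not running or is bound to a different port.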

# need a client

Now install the PECL Gearman extension.

# the client is not stable yet, so use the beta

pecl install channel://

You should now see:

Build process completed successfully
Installing '/usr/lib/php/modules/'
install ok: channel://
configuration option "php_ini" is not set to php.ini location
You should add "" to php.ini

After following those directions, confirm that PHP sees the extension:

php --info |grep gear
gearman support => enabled
libgearman version => 0.20

Now restart Apache.

Tada! Now for some client code:

One thing I hate having to do is restart services when pushing code. I just want it to work. So I use a Wrapper/Bridge design pattern in conjunction with restartd (using supervisord, the Python kit, is just not possible), and my new code is live as soon as it makes it to disk.

Here is how I did it:

Three classes




The performer delegates the job to GearmanJobGeneric and has a method called JobsWrapper(GearMan $job).

By looking at the workload, JobsWrapper determines which job to call. If the mtime of the file that contains the meat of the job has changed, JobsWrapper throws an exception and exits; otherwise it executes the job.

If the wrapper killed itself, restartd then sees that the worker is not running and starts it back up.
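
The reload-on-change idea can be sketched in shell, outside of PHP. This only illustrates the mtime check and is not the original classes: `file_mtime`, `job_file_changed`, `run_or_exit` and the status code 75 are all made up for the example, and `stat -c %Y` assumes GNU coreutils, as found on the Linux AMI.

```shell
# Print a file's modification time in seconds since the epoch (GNU stat).
file_mtime() {
    stat -c %Y "$1"
}

# Succeed if the file's current mtime differs from the one recorded earlier.
job_file_changed() {
    job_file=$1
    recorded_mtime=$2
    [ "$(file_mtime "$job_file")" != "$recorded_mtime" ]
}

# Run the given command unless the job file changed since its mtime was
# recorded. On a change, return 75 (an arbitrary code chosen here) so the
# caller can exit and let the supervisor start a fresh worker.
run_or_exit() {
    job_file=$1
    recorded_mtime=$2
    shift 2
    if job_file_changed "$job_file" "$recorded_mtime"; then
        echo "job file changed, exiting for restart: $job_file" >&2
        return 75
    fi
    "$@"
}
```

A worker would record the mtime once at startup and then wrap each job, for example `run_or_exit "$job_file" "$start_mtime" some_job_command || exit $?`, where `some_job_command` stands in for whatever actually performs the job. Once the worker exits, restartd brings it back with the new code, as described above.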

Currently I have GearmanD managing file uploads, stat logging, data collecting etc.
