Monday, May 16, 2011

Installing GearmanD on Amazon's EC2 Linux AMI Small Instance

The Amazon Linux AMI is very close to CentOS/Red Hat, but it's Amazon's own distro. Here are some quick steps for installing gearmand on it. I currently use it to distribute jobs across many instances, running them asynchronously or synchronously so that Apache isn't blocked on long-running work such as fetching data from a website in real time, massaging it and returning it to the browser, or resizing images.


#
# put the stuff in /var/tmp
#

cd /var/tmp;

#
# get the source
#
wget http://launchpad.net/gearmand/trunk/0.20/+download/gearmand-0.20.tar.gz

#
# set up the libs required for the source to compile
#

yum install -y libevent-devel.i386
yum install -y gcc-c++.i386
yum install -y boost-devel.i386   # C++ libs
yum install -y libuuid-devel.i686   # do not install the i386 version: it puts uuid.h in /usr/include, not /usr/include/uuid/
yum install -y memcached-devel.i686
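
Since the note on libuuid-devel says the header has to end up in /usr/include/uuid/, a quick check before running configure can save a failed build. This is only a sketch: the function name check_uuid_header is made up for illustration, and the default include directory is an assumption based on the path mentioned above.

```shell
# Hypothetical helper: report whether uuid.h sits under <include_dir>/uuid/.
# Defaults to /usr/include, the location named in the libuuid-devel note.
check_uuid_header() {
    include_dir="${1:-/usr/include}"
    if [ -f "$include_dir/uuid/uuid.h" ]; then
        echo "ok: $include_dir/uuid/uuid.h"
    else
        echo "missing: $include_dir/uuid/uuid.h"
        return 1
    fi
}
```

Calling check_uuid_header with no argument checks /usr/include; a non-zero return means the header is not where the note says it should be.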

#
# extract the source
#

tar xvzf gearmand-0.20.tar.gz

#
# configure / make / make install
#

cd gearmand-0.20
./configure --prefix=/usr
make && make test
make install

#
# add the user and run it
#
adduser gearmand
/usr/sbin/gearmand -u gearmand
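
To confirm the server is actually answering, you can send it the text "status" admin command, which replies with one tab-separated line per registered function (name, queued jobs, running jobs, available workers) followed by a line containing a single dot. The sketch below only formats that reply. It assumes nc is installed and that gearmand is listening on its default port 4730; the function name summarize_status is made up for illustration.

```shell
# Hypothetical helper: make the reply to gearmand's "status" admin command readable.
# Reads the reply on stdin and stops at the terminating "." line.
summarize_status() {
    awk -F '\t' '
        $0 == "." { exit }
        NF == 4   { printf "%s: queued=%s running=%s workers=%s\n", $1, $2, $3, $4 }
    '
}

# Usage against a live server (assumes nc and the default port 4730):
#   printf 'status\n' | nc -w 1 localhost 4730 | summarize_status
```

On a fresh install with no workers registered, the reply is just the dot, so the command prints nothing; a connection error from nc means gearmand is not listening.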


#
# need a client
#

Now install the PECL gearman extension

#
# client is not stable, thus use beta
#

pecl install channel://pecl.php.net/gearman-0.7.0

You should now see:

Build process completed successfully
Installing '/usr/lib/php/modules/gearman.so'
install ok: channel://pecl.php.net/gearman-0.7.0
configuration option "php_ini" is not set to php.ini location
You should add "extension=gearman.so" to php.ini
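
The last line of that output can be handled by hand in an editor, or with something like the sketch below, which appends the line only if it is not already present so it is safe to run twice. The function name enable_gearman_extension is made up for illustration, and /etc/php.ini is an assumption; `php --ini` shows which file your PHP actually loads.

```shell
# Hypothetical helper: append "extension=gearman.so" to an ini file unless it is already there.
enable_gearman_extension() {
    ini_file="$1"
    line='extension=gearman.so'
    # -x matches the whole line, -F treats the pattern as a fixed string
    if ! grep -qxF "$line" "$ini_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$ini_file"
    fi
}

# Example (the path is an assumption; check yours with: php --ini):
#   enable_gearman_extension /etc/php.ini
```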


After adding extension=gearman.so to php.ini as instructed, check that the extension is loaded:

php --info |grep gear
gearman
gearman support => enabled
libgearman version => 0.20


Now restart Apache.


Ta-da! Now for some client code.

One thing that I hate having to do is restart services when pushing code. I just want it to work. So I use a Wrapper/Bridge design pattern in conjunction with restartd (using a supervised Python kit was just not possible), and my new code is live as soon as it makes it to disk.

Here is how I did it:

Three classes

GearmanJobSubmitter.php

GearmanJobPerformer.php

GearmanJobGeneric.php

The performer delegates the job to GearmanJobGeneric and has a method called JobsWrapper(GearMan $job)

By looking at the workload, JobsWrapper determines which job to call. If the mtime of the file that contains the meat of the job has changed, JobsWrapper throws an exception and exits; otherwise it executes the job.

If the wrapper killed itself, restartd then sees that the worker is not running and starts it back up.
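
The wrapper itself is PHP, so the following is not that code; it is a shell sketch of just the mtime check, to show the idea of "record the timestamp at startup, exit when it moves". The function names, the job file path, and handle_one_job are all hypothetical.

```shell
# Print a file's modification time in seconds since the epoch.
# Tries GNU stat first, then falls back to the BSD form.
file_mtime() {
    stat -c %Y "$1" 2>/dev/null || stat -f %m "$1"
}

# Succeed (return 0) if the file's mtime differs from the one recorded at startup.
code_changed() {
    [ "$(file_mtime "$1")" != "$2" ]
}

# Hypothetical worker loop (path and handle_one_job are placeholders):
#   started=$(file_mtime /path/to/Job.php)
#   while ! code_changed /path/to/Job.php "$started"; do handle_one_job; done
#   exit 1   # the worker is gone, so restartd starts a fresh one with the new code
```

The point of exiting rather than reloading in place is that the restarted process picks up the new file from disk without any special reload logic in the worker.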


Currently I have GearmanD managing file uploads, stat logging, data collection, etc.




