Friday, September 03, 2010
Cassandra and Ganglia
I finally got some time to do some housecleaning. One of my nagging low-hanging-fruit jobs was to stop using jconsole as my monitor, so I wrote a Ganglia script to produce the graph above. It shows all the Cassandra servers and their total row read stages completed in the last hour, as a gauge. In essence, I am graphing the delta between one run of the Ganglia script and the next.
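To make the "delta between script runs" idea concrete, here is a minimal sketch in Python (not the Perl the actual script is written in). It assumes the previous counter values are kept in a small JSON state file. The function name, the state file, and the metric name `tpc_row_read_stage` are all hypothetical, chosen only for illustration.

```python
import json
import os


def compute_deltas(current, state_path):
    """Return the change in each counter since the previous run.

    `current` maps a stat name to its latest cumulative value.
    The latest values are saved to `state_path` for the next run.
    """
    previous = {}
    if os.path.exists(state_path):
        with open(state_path) as fh:
            previous = json.load(fh)

    deltas = {}
    for name, value in current.items():
        if name not in previous:
            # First time this stat is seen: nothing to compare against yet.
            continue
        diff = value - previous[name]
        # A counter that went backwards usually means the server restarted,
        # so report the raw value instead of a negative delta.
        deltas[name] = diff if diff >= 0 else value

    with open(state_path, "w") as fh:
        json.dump(current, fh)
    return deltas
```

The first run only records a baseline and reports nothing. Each later run reports how much every counter grew since the run before it.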
Here is how I have it set up:
All the data that JMX exposes for tpstats and cfstats is graphed via Ganglia. The pattern for each graph is as follows:
stat_class - tpc, tpp, or tpa, meaning completed, pending, and active respectively
key - the name of the stat, for instance message deserialization
For column family stats I graph the keyspace stats as well as the specific column family stats exposed by cfstats. For instance, see below:
If you're interested in the scripts, I'll send them to you or put them up on code.google.com. They are written in object-oriented Perl and take the same packaging approach as the Maatkit toolkit for MySQL by Xaprb and crew (all the "classes" live in the same file as the application).
GmetricDelegate is the parent package
GmetricCassandra extends GmetricDelegate, overrides getData, and defines which stats are absolute versus gauges.
Following the same pattern, I also have
and so on.
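The parent/child arrangement described above can be sketched roughly as follows. This is a Python illustration under my own assumptions, not the actual Perl code. The class names come from the post, but `get_data`, `gauge_keys`, `classify`, the metric names, and the sample values are hypothetical.

```python
class GmetricDelegate:
    """Parent class: shared logic lives here; subclasses supply the data."""

    # Stats reported as deltas between runs; everything else is absolute.
    gauge_keys = frozenset()

    def get_data(self):
        raise NotImplementedError("subclasses return a dict of stat name -> value")

    def classify(self):
        """Split the output of get_data() into (gauges, absolutes)."""
        data = self.get_data()
        gauges = {k: v for k, v in data.items() if k in self.gauge_keys}
        absolutes = {k: v for k, v in data.items() if k not in self.gauge_keys}
        return gauges, absolutes


class GmetricCassandra(GmetricDelegate):
    """Child class: overrides get_data and declares which stats are gauges."""

    gauge_keys = frozenset({"tpc_row_read_stage"})

    def get_data(self):
        # Hard-coded sample values stand in for what a real module reads over JMX.
        return {"tpc_row_read_stage": 1200, "tpp_row_read_stage": 3}
```

Under this layout, supporting another service would mean adding one more subclass with its own `get_data` and `gauge_keys`, while the parent stays unchanged.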
Then on each server I run:
/usr/bin/perl -w /home/scripts/ganglia_gmetric.pl --module=GmetricCassandra
This then talks to Ganglia through gmetric to report the stats.
Update: I uploaded an alpha version to http://code.google.com/p/gangliastats/ - be warned, the comments are sparse. I'll have another check-in with documentation soon.
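For readers who have not used gmetric, here is a rough Python sketch of what handing one stat to it might look like. It uses gmetric's standard `--name`, `--value`, `--type`, and `--units` options. The helper names, the default type and units, and the metric name in the usage line are my own assumptions, not taken from the actual script.

```python
import subprocess


def gmetric_command(name, value, metric_type="uint32", units="count"):
    """Build the gmetric argument list for a single stat."""
    return [
        "gmetric",
        "--name", name,
        "--value", str(value),
        "--type", metric_type,
        "--units", units,
    ]


def report(name, value, **kwargs):
    """Run gmetric; raises CalledProcessError if it exits non-zero."""
    subprocess.run(gmetric_command(name, value, **kwargs), check=True)
```

Calling `report("tpp_row_read_stage", 3)` on a host with gmetric installed and gmond configured would publish that value under the hypothetical metric name.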