Thursday, December 13, 2007

Flickr Stats: how is it built

  • All collection is done in real time.
  • Both MYISAM and INNODB are used.
  • The data is spread across 6 clusters (12 servers: 6 in use, 6 for failover), mainly because of data storage requirements.
  • Memcache is not used at all in the core of the product.




In summary, this was the longest project that I have worked on, other than rebuilding the backend for Flickr when I first came on. The inner workings are very complex, because they have to achieve real-time collection while not affecting the page load times of a photo page. Most of my time was spent on creating a distributed lock once my DB design was solid.

Things that would really make life easier:

MySQL AB gets rid of MYISAM and makes PBXT its replacement. I don't need all the great features of INNODB, but I would like some. I'll go more into this later.

Additionally, INSERT DELAYED works with ON DUPLICATE KEY UPDATE. Currently it does not.
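
To make that wish concrete, here is a minimal sketch of what the combination would buy: a write that does not block the page, merged into an existing counter row. It is not how Flickr Stats is implemented. It fakes the "delayed" part on the client by buffering hits in memory and then flushing them as one multi-row upsert. The class name BufferedCounter, the table photo_views, and the columns photo_id, day and views are all hypothetical, and the %s placeholders assume a MySQLdb-style driver.

```python
from collections import Counter


class BufferedCounter:
    """Client-side stand-in for INSERT DELAYED combined with an upsert.

    Hypothetical sketch: table and column names are assumptions.
    """

    def __init__(self, table="photo_views"):
        self.table = table
        self.pending = Counter()

    def hit(self, photo_id, day):
        # Only touches memory, so the photo page never waits on the DB.
        self.pending[(photo_id, day)] += 1

    def flush(self):
        """Return one (sql, params) upsert for everything buffered, then reset."""
        if not self.pending:
            return None
        rows = sorted(self.pending.items())
        placeholders = ", ".join(["(%s, %s, %s)"] * len(rows))
        sql = (
            f"INSERT INTO {self.table} (photo_id, day, views) "
            f"VALUES {placeholders} "
            # Existing rows get the buffered count added to their total.
            "ON DUPLICATE KEY UPDATE views = views + VALUES(views)"
        )
        params = [v for (photo_id, day), n in rows for v in (photo_id, day, n)]
        self.pending.clear()
        return sql, params
```

The returned pair would be handed to something like cursor.execute(sql, params) by a background flusher. The trade-off is the same one INSERT DELAYED makes: anything still buffered is lost if the process dies before the flush.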

Finally, cross-engine transactions would be cool, but they are really not required.

4 comments:

Unknown said...

That sounds quite impressive, would be really nice if you could post some details.
I don't know how much you can or want to tell here, but maybe some information about the general setup, data volumes, etc.?

Anonymous said...

So, logging into MyISAM with INSERT DELAYED for non-blocking INSERTs, and scripts to summarise/transfer the data to InnoDB?

Unknown said...

Isn't PBXT more InnoDB-like?
Or is it easier to set up than InnoDB, just as easy as MyISAM?

Dathan Pattishall said...

@adrenalin PBXT is like INNODB, but it's not ACID compliant. It is also like MYISAM, but recovery from crashes is much faster. Its INSERT speed at high concurrency is much better than INNODB's.