Monday, February 25, 2008

BCP MySQL without losing speed

BCP stands for Business Continuity Planning, basically a fancy name for handling the situation when a data center (DC) goes offline. Since we use dual-master replication for our servers, putting a master in another DC is not possible without a special layer. The reason: a slave can have only one master, while a master can have many slaves. So MySQL replication cannot extend a dual-master setup to a second DC unless there is a replication ring running from one DC to the other. You do not want to do this because of latency.
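To make the one-master-per-slave constraint concrete, here is a small Python sketch. It is only a toy model, not anything MySQL provides: the server names and the `replication_ring` helper are made up for illustration. Each server maps to the single master it replicates from, and a server's writes reach everyone only if following those links loops back to it.

```python
# Toy model (hypothetical names): each server replicates from exactly one master.
def replication_ring(master_of, start):
    """Follow master links from `start`; return the ring if they loop back, else None."""
    seen = [start]
    node = master_of.get(start)
    while node is not None and node not in seen:
        seen.append(node)
        node = master_of.get(node)
    return seen if node == start else None

# Dual master inside one DC: a ring of two servers.
dual = {"dc1-a": "dc1-b", "dc1-b": "dc1-a"}

# A server in a second DC can pick only one master, and nothing replicates
# from it, so it is a plain slave rather than a third master.
with_remote = {"dc1-a": "dc1-b", "dc1-b": "dc1-a", "dc2-a": "dc1-a"}

# Making it a real master means stretching the ring across both DCs,
# which puts cross-DC latency on the replication path.
cross_dc_ring = {"dc1-a": "dc2-a", "dc2-a": "dc1-b", "dc1-b": "dc1-a"}

print(replication_ring(dual, "dc1-a"))
print(replication_ring(with_remote, "dc2-a"))
print(replication_ring(cross_dc_ring, "dc1-a"))
```

In the second topology the remote server is not part of any ring, which is the gap the external layer described here has to fill.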

To get around these limiting factors in MySQL, we have developed an application, using my design, that gets write events from one DC to another DC over an stunnel connection.

To get data from one DC to another DC, we need to encrypt it. If there is a man in the middle, we don't want them sniffing our traffic, so we use stunnel to encrypt the data. I will not use SSL-encrypted replication, since managing that requires a MySQL restart. For something as simple as replication data, we shouldn't have to restart MySQL.


stunnel setup

here is a good writeup


For the version of stunnel that we are running, I needed to add this setting: wait_for_readable = 0, which tells the accepting server NOT to wait for headers to be sent.


Now, to make sure it's fast, we need to transfer as little data as possible. I will not go into the details of how this is implemented yet. If people are interested, I may release a generic version of the application that gets data from one location to another in a consistent, ordered way, and that is application independent. By application independent I mean that the application does not need to change for this to work.
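Since the implementation is not described here, the following Python sketch only illustrates the general idea of shipping ordered write events in compressed batches. It is not the actual application: the JSON record format, the `pack_batch` function, and the `Receiver` class are all assumptions made up for this example.

```python
import json
import zlib

def pack_batch(first_seq, statements):
    """Number the statements consecutively and compress them into one payload."""
    records = [{"seq": first_seq + i, "sql": sql} for i, sql in enumerate(statements)]
    return zlib.compress(json.dumps(records).encode("utf-8"))

class Receiver:
    """Applies events strictly in sequence order on the remote side."""

    def __init__(self):
        self.next_seq = 1   # sequence number we expect next
        self.applied = []   # stand-in for executing the statement on the remote master

    def receive(self, payload):
        records = json.loads(zlib.decompress(payload).decode("utf-8"))
        for rec in records:
            if rec["seq"] < self.next_seq:
                continue  # already applied (for example a resent batch), skip it
            if rec["seq"] > self.next_seq:
                raise ValueError("gap before seq %d" % rec["seq"])
            self.applied.append(rec["sql"])
            self.next_seq += 1

receiver = Receiver()
batch = pack_batch(1, ["INSERT INTO t VALUES (1)", "UPDATE t SET c = 2"])
receiver.receive(batch)
receiver.receive(batch)  # a resend after a dropped connection changes nothing
print(receiver.applied)
```

Sequence numbers give ordering and make resends harmless, while batching plus compression reduces the number of bytes and round trips crossing the tunnel. A real system would also need durable storage of the last applied sequence number on both sides.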

4 comments:

Anonymous said...

Good stuff! At the 2007 MySQL conference, some people had touched on this, but nobody went into the details. It would be cool if someone actually went into the details at this year's conference during a session or BOF. Thanks for the post!

Unknown said...

I'd be very interested in more information about this setup!

Anonymous said...

Interesting, we use ssh tunnels to do inter-datacentre replication with the autossh command managing reconnects. This also allows us to use compression.

I may have the wrong end of the stick but can't you use log-slave-updates on either of your master-master pair and run your remote "master in waiting" from that? Not a ring but two masters and a complete slave.

Dathan Pattishall said...

@smikybang.com

Each data center has 2 masters, so setting up a slave to a master in another data center does not achieve BCP, since a slave can have only one master.

An external process manages this layer.