Friday, July 20, 2007

Replication Syncing MASTER_POS_WAIT

  • MASTER_POS_WAIT


  • This function is useful for control of master/slave synchronization. It blocks until the slave has read and applied all updates up to the specified position in the master log. The return value is the number of log events the slave had to wait for to advance to the specified position. The function returns NULL if the slave SQL thread is not started, the slave's master information is not initialized, the arguments are incorrect, or an error occurs. It returns -1 if the timeout has been exceeded. If the slave SQL thread stops while MASTER_POS_WAIT() is waiting, the function returns NULL. If the slave is past the specified position, the function returns immediately.

    If a timeout value is specified, MASTER_POS_WAIT() stops waiting when timeout seconds have elapsed. timeout must be greater than 0; a zero or negative timeout means no timeout.



    This is great for applications which auto promote masters, but one has to account for race conditions.


    For example, dbmaster1 and dbmaster2 are in bi-directional replication displayed below.


    dbmaster1 <----> dbmaster2


    dbmaster2 has slaves called dbslaves1 and dbslaves2. We want to replace dbmaster2 with dbslave1.

    On dbmaster1 issue:

    FLUSH LOGS;


    On dbmaster2 stop external writes to it and execute

    SELECT MASTER_POS_WAIT('dbmaster1-bin.000002', 4); SLAVE STOP;


    db3>show slave status\G
    *************************** 1. row ***************************
    Slave_IO_State:
    Master_Host: dbmaster1
    Master_User: replication
    Master_Port: 3306
    Connect_Retry: 60
    Master_Log_File: dbmaster1-bin.000002
    Read_Master_Log_Pos: 3696603
    Relay_Log_File: dbslave-relay.000005
    Relay_Log_Pos: 597702176
    Relay_Master_Log_File: dbmaster1-bin.000002
    Slave_IO_Running: No
    Slave_SQL_Running: No
    Replicate_Do_DB:
    Replicate_Ignore_DB:
    Replicate_Do_Table:
    Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
    Replicate_Wild_Ignore_Table:
    Last_Errno: 0
    Last_Error:
    Skip_Counter: 0
    Exec_Master_Log_Pos: 805
    Relay_Log_Space: 601376092
    Until_Condition: None
    Until_Log_File:
    Until_Log_Pos: 0
    Master_SSL_Allowed: No
    Master_SSL_CA_File:
    Master_SSL_CA_Path:
    Master_SSL_Cert:
    Master_SSL_Cipher:
    Master_SSL_Key:
    Seconds_Behind_Master: NULL




    Notice Exec_Master_Log_Pos is at POS 805. This is a small window of events that passed. So, make sure to account for it.


    Now on dbmaster2 issue:

    FLUSH LOGS

    Do the same steps for dbslave1-2 with MASTER_WAIT_POS but for the position on dbmaster2 not dbmaster1.

    Make dbslave1 a master by adding

    log-bin to my.cnf

    Issue CHANGE MASTER TO on dbslave1 to dbmaster1 // this moves dbslave1 to dbmaster1

    CHANGE MASTER TO MASTER_HOST='dbmaster1', MASTER_LOG_POS=805, MASTER_LOG_FILE='dbmaster1.000002';

    Issue CHANGE MASTER TO on dbslave2 to dbslave1 // this moves dbslave2 to dbslave1

    CHANGE MASTER TO MASTER_HOST='dbslave1', MASTER_LOG_POS=4, MASTER_LOG_FILE='dbslave1-bin.000001';


    Enable writes on dbslave1

    done.

    I wrote an application that does this all for me, using IPC, fork, db connections and the "algorithm" above. If your interested in it. I'll post it here.

    3 comments:

    Anonymous said...

    I think you definitely need to take a look at MMM.

    pakfak said...

    Thanks a lot fro this statement it helps me a lot.
    Do you have an idea how to insert the show master status returns into a table?
    I thoght that the whole information with bin-log files names and positions are kept into some system tables in mysql but i was wrong. Any idea?

    pakfak said...

    Thanks a lot fro this statement it helps me a lot.
    Do you have an idea how to insert the show master status returns into a table?
    I thoght that the whole information with bin-log files names and positions are kept into some system tables in mysql but i was wrong. Any idea?