Cloning a MySQL Slave
09 November 2011

Cloning a MySQL slave is typically very straight forward.

  1. Execute stop slave on the donor slave and capture the slave status information
  2. Stop mysql on the donor
  3. Copy the database files from the donor to the new slave
  4. Start MySQL on the new slave
  5. Execute the change master statement to start the new slave's replication process
  6. Start mysql on the donor and allow replication to catch up

Simple right? It is, if you don't run into the scenario I managed to hit. Show slave status gives you a lot of information like this:

                  Master_Host: 10.10.10.10
                  Master_User: repl_user
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.002644
          Read_Master_Log_Pos: 1015419943
               Relay_Log_File: relay-log.000257
                Relay_Log_Pos: 68175060
        Relay_Master_Log_File: binlog.002643
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 887594041
              Relay_Log_Space: 2448352803

This information is then matched into a change master statement. The most important values here are MASTER_LOG_FILE and MASTER_LOG_POS which tell the slave thread where to start replicating from. This will be the point where the donor slave was stopped before the copy. Now, you'd think that the master log file would correspond with the Master_Log_File value from show slave status. It doesn't. You want to use Relay_Master_Log_File. Most often, these log files are the same - if replication isn't lagging. But even in instances where the slave is not behind there is a small chance you'll catch it as the IO thread is getting the next binlog file. Now if you use the log file you think is correct, you'll be setting your replication to start too far ahead. Now, you may get lucky like me and the position won't exist in the binlog you set, and you'll get a replication IO error in my case, 1236 "Exceeded max_allowed_packet", instructing you to set the max_allowed_packet larger on the master which is misleading in this case.

The CHANGE MASTER statement for the above slave status should be

CHANGE MASTER TO
MASTER_LOG_FILE='binlog.002643'
MASTER_LOG_POS=887594041
The log position is the value from Exec_Master_Log_Pos.

Hopefully this will keep you from bludgeoning your forehead on your desk as I nearly did.