Wednesday, 28 May 2014

How to setup crash-safe-slave replication with MySQL 5.6

Today when i was working on crashed (Because of power failure) dev replication DB instance, came to know that few defaults needs to be changed to set up "crash-safe slave" on MySQL 5.6.

Before start discuss about actual problem first let me share the configuration options used for existing DB instance.

The DB instance has been configured with master_info_repository=TABLE, relay-log-info-repository=TABLE.

After the server restart, started the DB instance and it came up without any issues. Then logged into the DB server checked for "show slave status;"

I was shocked by looking at the information from the "show slave status\G", it was showing

Master_Log_File: bin_1840.000001
Read_Master_Log_Pos: 43762
Relay_Log_File: relay_1836.000002
Relay_Log_Pos: 87537
Relay_Master_Log_File: bin_1840.000001
Exec_Master_Log_Pos: 87375

if we look at "Read_Master_Log_Pos:" its saying position 43762, but if we look at "Exec_Master_Log_Pos:" which is showing as 87375 and both "Master_Log_File:" and "Relay_Master_Log_File:" are working on same file "bin_1840.000001" how could this possible. exec master log position should always be less than read master log position, because what ever the slave has been read from the master will get executed at the slave side. This is strange and i am facing this kind of issue for first time.

So started browsing some information on the same, i found that from MySQL 5.6 onwards the default values have been changed to 10000 for the following variables

sync_master_info
sync_relay_log
sync_relay_log_info

If the value of these variables is greater than 0, the slave updates its master info repository, slave relay log and slave info repository table after every N events set by these variables. If it is 0, the table is never updated.
For more information go through.

That means the slave will not update its status after each event execution, it will wait till 10000 events gets executed on the slave and then only it will update the slave information, the same has happened over here. slave has updated the slave_master_info table at the log position 43762 and it is waiting to complete 10000 events for next update, but mean while the slave has been crashed which is why it is showing "Read_Master_Log_Pos:" as 43762 after restart, which is understandable.

But still its not clear how slave has been updated to "Exec_Master_Log_Pos:" 87375. Got few details from the following output.

mysql> select * from mysql.slave_relay_log_info\G

*************************** 1. row ***************************
  Number_of_lines: 7
   Relay_log_name: /xxxxx/var/log/binlogs/relay_1836.000002
    Relay_log_pos: 87537
  Master_log_name: bin_1840.000001
   Master_log_pos: 87375
        Sql_delay: 0
Number_of_workers: 0
               Id: 1

Again if we go back to "sync_relay_log_info", The effects of this variable on the slave depend on the server's relay_log_info_repository setting (FILE or TABLE), and if this is TABLE, additionally on whether the storage engine used by the relay log info table is transactional (such as InnoDB) or not (MyISAM). If relay_log_info_repository has been set to TABLE and if the table is created as a InnoDB table then The table is updated after each transaction. (N is effectively ignored.) , If the table is created as a MyISAM table The table is updated after every N events. For more information go through.

So now its clear why the "Read_Master_Log_Pos:" is @43762 and "Master_log_pos:" is @87375 and the problem is clearly understood, then what is the solution to this problem.

If we look at "relay_log_recovery" Enables automatic relay log recovery immediately following server start up, which means that the replication slave discards all unprocessed relay logs and retrieves them from the replication master, which is clearly says that slave will start reading master binary logs from the (Exec_Master_Log_Pos: 87375 or Master_log_pos: 87375) for more information go through.

So started the slave by adding "relay_log_recovery=ON" variable to configuration file and logged into the DB instance and checked for show slave status, now (Read_Master_Log_Pos: 87375 and Exec_Master_Log_Pos: 87375) pointing to the same position which is solve my problem.

Master_Log_File: bin_1840.000001
Read_Master_Log_Pos: 87375
Relay_Log_File: relay_1836.000003
Relay_Log_Pos: 4
Relay_Master_Log_File: bin_1840.000001
Exec_Master_Log_Pos: 87375

Started salve, after some time the slave got sync with master without any issues. checked for consistency between master and slave they are consistent.

So finally came to know the the following configuration options will make sure that the crash-safe-slave will work without any issues.

master_info_repository=TABLE
relay-log-info-repository=TABLE
relay_log_recovery=ON

Additional reference : http://www.mysqlperformanceblog.com/2013/09/13/enabling-crash-safe-slaves-with-mysql-5-6/



MySQL InternalsMySQL HighAvailabilityMySQL Performance TuningMySQL Query OptimizationMySQL performanceMySQL FabricMySQL HAMySQL InstallationMySQL UpgradeInnoDB Performance TuningInnoDB Buffer Pool SizeMySQL Performance TuningMySQL ClusterMySQL Latest NewsNews and EventsMySQL Customers 

No comments:

Post a Comment