Replication failure handling


Replication can encounter several failure situations. The following table lists these situations and describes the actions that Derby takes as a result.

Table: Replication failure handling

| Failure Situation | Action Taken |
|---|---|
| Master loses connection with slave. | Transactions are allowed to continue processing while the master tries to reconnect with the slave. Log records generated while the connection is down are buffered in main memory. If the log buffer reaches its size limit before the connection can be reestablished, the master replication functionality is stopped. You can use the property derby.replication.logBufferSize to configure the size limit of the buffer; see the documentation for that property for details. |
| Slave loses connection with master. | The slave tries to reestablish the connection with the master by listening on the specified host and port. It does not give up until it is explicitly requested to do so by either the failover=true or the stopSlave=true connection URL attribute. If a failover is requested, the slave applies all received log records and boots the database. If the stopSlave=true attribute is specified, the slave database is shut down without further actions. |
| Two different masters of database D try to replicate to the same slave. | The slave accepts the connection only from the first master that attempts to connect. Note that authentication is required to start both the slave and the master. |
| The master and slave instances are not at the same version. | An exception is raised and replication does not start. |
| The master instance crashes, then restarts. | Replication must be restarted. |
| The master instance is not able to send log data to the slave at the same pace as the log is generated. The main memory log buffer gradually fills up and eventually becomes full. | The master notices that the main memory log buffer is filling up. It first tries to increase the speed of the log shipment to keep the amount of log in the buffer below the maximum. If that is not enough to keep the buffer from getting full, the response time of transactions may increase for as long as log shipment has trouble keeping up with the amount of generated log records. You can use properties to tune both the log buffer size and the minimum and maximum interval between consecutive log shipments. See the documentation for those properties for details. |
| The slave instance crashes. | The master sees this as a lost connection to the slave. The master tries to reestablish the connection until the replication log buffer is full. Replication is then stopped on the master. Replication must be restarted. |
| An unexpected failure is encountered. | Replication is stopped. The other instance of the replication pair is notified of the decision if the network connection is still alive. |
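The connection URL attributes and the buffer property named in the table can be illustrated with a small sketch. The helper class below is hypothetical and not part of Derby: it only assembles strings and a Properties object, and it assumes an embedded-style URL of the form jdbc:derby:<database>;<attribute>. The database name myDB and the buffer value 65536 are placeholders. A real deployment would also supply the authentication attributes that the table says are required, and would pass the URL to java.sql.DriverManager.getConnection.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; it is not a Derby class.
final class ReplicationFailureHelper {

    // Assumed URL shape: jdbc:derby:<dbName>;<attribute>
    private static String url(String dbName, String attribute) {
        return "jdbc:derby:" + dbName + ";" + attribute;
    }

    // URL that asks a waiting slave to apply received log records and boot the database.
    static String failoverUrl(String dbName) {
        return url(dbName, "failover=true");
    }

    // URL that asks a waiting slave to shut down without further actions.
    static String stopSlaveUrl(String dbName) {
        return url(dbName, "stopSlave=true");
    }

    // Properties carrying a size limit for the master's in-memory log buffer.
    // The value passed in is a placeholder; check the property documentation for valid ranges.
    static Properties logBufferProperties(int sizeLimit) {
        Properties props = new Properties();
        props.setProperty("derby.replication.logBufferSize", Integer.toString(sizeLimit));
        return props;
    }

    public static void main(String[] args) {
        System.out.println(failoverUrl("myDB"));
        System.out.println(stopSlaveUrl("myDB"));
        System.out.println(logBufferProperties(65536));
    }
}
```

Keeping the URL construction separate from the connection call makes it easy to log or review exactly which attribute is about to be sent to the slave, which matters because failover=true and stopSlave=true have very different outcomes.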