5.5.1. Automatic Master Switch

When the dataservice policy mode is AUTOMATIC, the dataservice automatically switches the master host when the existing master is identified as having failed or become unavailable.

For example, when the master host host1 becomes unavailable because of a network problem, the dataservice automatically switches to host2. The dataservice status is updated accordingly, showing host1 as automatically shunned:

[LOGICAL:EXPERT] /alpha > ls

COORDINATOR[host3:AUTOMATIC:ONLINE]

ROUTERS:
+----------------------------------------------------------------------------+
|connector@host2[28116](ONLINE, created=0, active=0)                         |
|connector@host3[1533](ONLINE, created=0, active=0)                          |
+----------------------------------------------------------------------------+

DATASOURCES:
+----------------------------------------------------------------------------+
|host1(master:SHUNNED(FAILED-OVER-TO-host2))                                 |
|STATUS [SHUNNED] [2013/05/14 12:18:54 PM BST]                               |
+----------------------------------------------------------------------------+
|  MANAGER(state=STOPPED)                                                    |
|  REPLICATOR(state=STATUS NOT AVAILABLE)                                    |
|  DATASERVER(state=ONLINE)                                                  |
|  CONNECTIONS(created=0, active=0)                                          |
+----------------------------------------------------------------------------+

+----------------------------------------------------------------------------+
|host2(master:ONLINE, progress=156325, THL latency=0.606)                    |
|STATUS [OK] [2013/05/14 12:46:55 PM BST]                                    |
+----------------------------------------------------------------------------+
|  MANAGER(state=ONLINE)                                                     |
|  REPLICATOR(role=master, state=ONLINE)                                     |
|  DATASERVER(state=ONLINE)                                                  |
|  CONNECTIONS(created=0, active=0)                                          |
+----------------------------------------------------------------------------+

The status for the original master (host1) identifies the datasource as shunned, and the FAILED-OVER-TO-host2 annotation indicates which datasource was promoted to master.
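When monitoring a dataservice from a script, the datasource header line can be parsed to find shunned hosts and their failover targets. The following is a minimal sketch in Python; the regular expression and the function name parse_datasource_header are assumptions derived only from the sample output shown in this section, not from a documented output format, so other states or layouts may need a different pattern.

```python
import re

# Assumed layout, taken from the sample 'ls' output in this section:
#   |host1(master:SHUNNED(FAILED-OVER-TO-host2))      |
#   |host2(master:ONLINE, progress=156325, ...)       |
HEADER = re.compile(
    r"^\|(?P<host>[\w.-]+)\((?P<role>\w+):(?P<state>[A-Z]+)"
    r"(?:\(FAILED-OVER-TO-(?P<target>[\w.-]+)\))?"
)


def parse_datasource_header(line):
    """Return host, role, state, and failover target, or None for other lines."""
    match = HEADER.match(line)
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    sample = "|host1(master:SHUNNED(FAILED-OVER-TO-host2))" + " " * 33 + "|"
    print(parse_datasource_header(sample))
```

Router, STATUS, and component lines do not match the pattern, so the function returns None for them and only datasource header lines produce a result.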

An automatic failover can be triggered by using the datasource fail command:

[LOGICAL:EXPERT] /alpha > datasource host1 fail

This triggers the automatic failover sequence, and simulates what would happen if the specified host failed.

If host1 becomes available again, the datasource is not automatically added back to the dataservice; it must be explicitly re-added. The status of the dataservice once host1 returns is shown below:

[LOGICAL:EXPERT] /alpha > ls

COORDINATOR[host3:AUTOMATIC:ONLINE]

ROUTERS:
+----------------------------------------------------------------------------+
|connector@host1[19869](ONLINE, created=0, active=0)                         |
|connector@host2[28116](ONLINE, created=0, active=0)                         |
|connector@host3[1533](ONLINE, created=0, active=0)                          |
+----------------------------------------------------------------------------+

DATASOURCES:
+----------------------------------------------------------------------------+
|host1(master:SHUNNED(FAILED-OVER-TO-host2), progress=156323, THL            |
|latency=0.317)                                                              |
|STATUS [SHUNNED] [2013/05/14 12:30:21 PM BST]                               |
+----------------------------------------------------------------------------+
|  MANAGER(state=ONLINE)                                                     |
|  REPLICATOR(role=master, state=ONLINE)                                     |
|  DATASERVER(state=ONLINE)                                                  |
|  CONNECTIONS(created=0, active=0)                                          |
+----------------------------------------------------------------------------+

To return host1 to the dataservice, use the datasource recover command. Because host1 was previously the master, the command verifies that the server is available, configures the node as a slave of the newly promoted master, and re-enables the services:

[LOGICAL:EXPERT] /alpha > datasource host1 recover
VERIFYING THAT WE CAN CONNECT TO DATA SERVER 'host1'
DATA SERVER 'host1' IS NOW AVAILABLE FOR CONNECTIONS
RECOVERING 'host1@alpha' TO A SLAVE USING 'host2@alpha' AS THE MASTER
SETTING THE ROLE OF DATASOURCE 'host1@alpha' FROM 'master' TO 'slave'
RECOVERY OF 'host1@alpha' WAS SUCCESSFUL

If the command is successful, then the node should be up and running as a slave of the new master.
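If recovery is run from an automated procedure, the captured command output can be checked for the success message. The sketch below assumes the message wording matches the sample output above exactly; the function name recovered_datasources is hypothetical, and the check covers only the success line, not every possible failure message.

```python
import re

# Assumed success line, copied from the sample recover output in this section.
SUCCESS = re.compile(r"^RECOVERY OF '(?P<datasource>[^']+)' WAS SUCCESSFUL$")


def recovered_datasources(output):
    """Return the datasources reported as successfully recovered."""
    found = []
    for line in output.splitlines():
        match = SUCCESS.match(line.strip())
        if match:
            found.append(match.group("datasource"))
    return found


if __name__ == "__main__":
    sample = "\n".join([
        "VERIFYING THAT WE CAN CONNECT TO DATA SERVER 'host1'",
        "DATA SERVER 'host1' IS NOW AVAILABLE FOR CONNECTIONS",
        "RECOVERING 'host1@alpha' TO A SLAVE USING 'host2@alpha' AS THE MASTER",
        "SETTING THE ROLE OF DATASOURCE 'host1@alpha' FROM 'master' TO 'slave'",
        "RECOVERY OF 'host1@alpha' WAS SUCCESSFUL",
    ])
    print(recovered_datasources(sample))
```

An empty result means the success line was not found, in which case the full output should be reviewed by hand.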

The recovery process can fail if the THL data and dataserver contents do not match, for example when statements have been executed directly on a slave. For information on recovering from failures that the recover command cannot fix, see Section 5.6.1.3, “Slave Datasource Extended Recovery”.