When the dataservice policy mode is AUTOMATIC, the dataservice will automatically fail over to a new Primary host when the existing Primary is identified as having failed or become unavailable.
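The active policy mode is reported in the COORDINATOR line of the ls output, and can be changed from within cctrl. As a minimal sketch (using the same alpha service shown in the examples below), automatic failover can be enabled with the set policy command:
[LOGICAL:EXPERT] /alpha > set policy automatic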
For example, when the Primary host host1 becomes unavailable because of a network problem, the dataservice automatically switches to host2. The dataservice status is updated accordingly, showing the automatically shunned host1:
[LOGICAL:EXPERT] /alpha > ls
COORDINATOR[host3:AUTOMATIC:ONLINE]
ROUTERS:
+----------------------------------------------------------------------------+
|connector@host2[28116](ONLINE, created=0, active=0) |
|connector@host3[1533](ONLINE, created=0, active=0) |
+----------------------------------------------------------------------------+
DATASOURCES:
+----------------------------------------------------------------------------+
|host1(Master:SHUNNED(FAILED-OVER-TO-host2)) |
|STATUS [SHUNNED] [2013/05/14 12:18:54 PM BST] |
+----------------------------------------------------------------------------+
| MANAGER(state=STOPPED) |
| REPLICATOR(state=STATUS NOT AVAILABLE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+----------------------------------------------------------------------------+
+----------------------------------------------------------------------------+
|host2(Master:ONLINE, progress=156325, THL latency=0.606) |
|STATUS [OK] [2013/05/14 12:46:55 PM BST] |
+----------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=Master, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+----------------------------------------------------------------------------+
The status for the original Primary (host1) identifies the datasource as shunned, and the FAILED-OVER-TO-host2 annotation indicates which datasource was promoted to the Primary.
An automatic failover can also be triggered manually by using the datasource fail command:
[LOGICAL:EXPERT] /alpha > datasource host1 fail
This triggers the automatic failover sequence, and simulates what would happen if the specified host failed.
If host1 becomes available again, the datasource is not automatically added back to the dataservice; it must be explicitly re-added. The status of the dataservice once host1 returns is shown below:
[LOGICAL:EXPERT] /alpha > ls
COORDINATOR[host3:AUTOMATIC:ONLINE]
ROUTERS:
+----------------------------------------------------------------------------+
|connector@host1[19869](ONLINE, created=0, active=0) |
|connector@host2[28116](ONLINE, created=0, active=0) |
|connector@host3[1533](ONLINE, created=0, active=0) |
+----------------------------------------------------------------------------+
DATASOURCES:
+----------------------------------------------------------------------------+
|host1(Master:SHUNNED(FAILED-OVER-TO-host2), progress=156323, THL |
|latency=0.317) |
|STATUS [SHUNNED] [2013/05/14 12:30:21 PM BST] |
+----------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=Master, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+----------------------------------------------------------------------------+
Because host1 was previously the Primary, the datasource recover command verifies that the server is available, configures the node as a Replica of the newly promoted Primary, and re-enables the services:
[LOGICAL:EXPERT] /alpha > datasource host1 recover
VERIFYING THAT WE CAN CONNECT TO DATA SERVER 'host1'
DATA SERVER 'host1' IS NOW AVAILABLE FOR CONNECTIONS
RECOVERING 'host1@alpha' TO A SLAVE USING 'host2@alpha' AS THE MASTER
SETTING THE ROLE OF DATASOURCE 'host1@alpha' FROM 'Master' TO 'slave'
RECOVERY OF 'host1@alpha' WAS SUCCESSFUL
If the command is successful, then the node should be up and running as a Replica of the new Primary.
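To confirm, the ls output should now show host1 as a Replica of the new Primary; a sketch of the relevant datasource line (actual progress and latency values will vary):
[LOGICAL:EXPERT] /alpha > ls
...
|host1(Slave:ONLINE, progress=..., THL latency=...)                          |
...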
The recovery process can fail if the THL data and dataserver contents do not match, for example when statements have been executed on a Replica. For information on recovering from failures that recover cannot fix, see Section 6.6.1.3, “Replica Datasource Extended Recovery”.
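In that situation, one common approach (a sketch; confirm the exact options for your release) is to re-provision the failed node from a healthy datasource using the tungsten_provision_slave tool, run from a shell on the failed host:
shell> tungsten_provision_slave --source=host2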