5.3.4. Understanding Datasource States

All datasources will be in one of a number of states that indicate their current operational status.

5.3.4.1. ONLINE State

A datasource in the ONLINE state is considered to be operating normally, with replication, connector and other traffic being handled as normal.

5.3.4.2. SHUNNED State

A SHUNNED datasource implies that the datasource is OFFLINE. Unlike the OFFLINE state, a SHUNNED datasource is not automatically recovered.

A datasource in a SHUNNED state is not connected or actively part of the dataservice. Individual services can be reconfigured and restarted. The operating system and any other maintenance to be performed can be carried out while a host is in the SHUNNED state without affecting the other members of the dataservice.

Datasources can be manually or automatically shunned. The current reason for the SHUNNED state is indicated in the status output. For example, in the sample below, the node host3 was manually shunned for maintenance reasons:

...
+----------------------------------------------------------------------------+
|host3(slave:SHUNNED(MANUALLY-SHUNNED), progress=157454, latency=1.000)      |
|STATUS [SHUNNED] [2013/05/14 05:12:52 PM BST]                               |
...

5.3.4.3. OFFLINE State

A datasource in the OFFLINE does not accept connections through the connector for either reads or writes.

When the dataservice is in the AUTOMATIC policy mode, a datasource in the OFFLINE state is automatically recovered and placed into the ONLINE state. If this operation fails, the datasource remains in the OFFLINE state.

When the dataservice is in MAINTENANCE or MANUAL policy mode, the datasource will remain in the OFFLINE state until the datasource is explicitly switched to the ONLINE state.

5.3.4.4. FAILED State

When a datasource fails, for example when a failure in one of the services for the datasource stops responding or fails, the datasource will be placed into the FAILED state. In the example below, the underlying dataserver has failed:

+----------------------------------------------------------------------------+
|host3(slave:FAILED(DATASERVER 'host3@alpha' STOPPED),                       |
|progress=154146, latency=31.419)                                            |
|STATUS [CRITICAL] [2013/05/10 11:51:42 PM BST]                              |
|REASON[DATASERVER 'host3@alpha' STOPPED]                                    |
+----------------------------------------------------------------------------+
|  MANAGER(state=ONLINE)                                                     |
|  REPLICATOR(role=slave, master=host1, state=ONLINE)                        |
|  DATASERVER(state=STOPPED)                                                 |
|  CONNECTIONS(created=208, active=0)                                        |
+----------------------------------------------------------------------------+

For a FAILED datasource, the recover command within cctrl can be used to attempt to recover the datasource to the operational state. If this fails, the underlying fault must be identified and addressed before the datasource is recovered.