5.5. Switching Master Hosts

The master host within a dataservice can be switched either automatically or manually. Automatic switching occurs when the dataservice is in the AUTOMATIC policy mode and a failure in the underlying datasource has been identified. The automatic process is designed to keep the dataservice running without requiring manual intervention.

Manual switching of the master can be performed during maintenance operations, for example during an upgrade or dataserver modification. In this situation, the master must be taken out of service manually, without affecting the rest of the dataservice. By switching the master to another datasource in the dataservice, the original master can be put offline, or shunned, while maintenance occurs. Once the maintenance has been completed, the datasource can be re-enabled, and can either remain as a slave or be switched back to become the master datasource.

Switching a datasource, whether automatically or manually, occurs while the dataservice is running, and without affecting the operation of the dataservice as a whole. Client application connections through Tungsten Connector are automatically reassigned to the datasources in the dataservice, and application operation will be unaffected by the change. Switching the datasource manually requires a single command that performs all of the required steps, monitoring and managing the switch process.
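As a minimal sketch of issuing that single command from a script, the Python below builds the text of a cctrl switch command and pipes it to cctrl. It assumes that cctrl is on the PATH and reads commands from standard input. The helper names build_switch_command and run_switch, and the host name host2, are hypothetical; check the exact command syntax against the cctrl reference for your release.

```python
import subprocess
from typing import Optional


def build_switch_command(target: Optional[str] = None) -> str:
    """Return the cctrl command text for a manual master switch.

    With no target, the cluster is left to choose the slave to promote;
    with a target, that host is requested as the new master.
    """
    if target is None:
        return "switch"
    # Reject empty names and names containing whitespace.
    if not target.strip() or any(ch.isspace() for ch in target):
        raise ValueError("target must be a single host name")
    return f"switch to {target}"


def run_switch(target: Optional[str] = None, timeout: float = 300.0) -> str:
    """Pipe the switch command to cctrl and return its output.

    Assumption: cctrl is installed, on PATH, and accepts commands on stdin.
    """
    result = subprocess.run(
        ["cctrl"],
        input=build_switch_command(target) + "\n",
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

Calling run_switch("host2") would request a switch to the hypothetical host host2; calling run_switch() with no argument leaves the choice of new master to the cluster.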

Switching the master, manually or automatically, performs the following steps within the dataservice:

  1. Set the master node to offline state. New connections to the master are rejected, and writes to the master are stopped.

  2. On the slave that will be promoted, switch the datasource offline. New connections are rejected, stopping reads on this slave.

  3. Kill any outstanding client connections to the master datasource, except those belonging to the tungsten account.

  4. Send a heartbeat transaction between the master and the slave, and wait until this transaction has been received. Once it has been received, the THL on the master and the slave is up to date.

  5. Perform the switch:

    • Configure all remaining replicators offline.

    • Configure the selected slave as the new master.

    • Set the new master to the online state.

    • New connections to the master are permitted.

  6. Configure the remaining slaves to use the new master as the master datasource.

  7. Update the connector configurations and enable client connections to connect to the masters and slaves.
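The sequence above can be modelled as a small in-memory simulation. This is only an illustrative sketch and not Tungsten Cluster code: the Datasource and Dataservice classes, their fields, and the host names are hypothetical, and the model assumes that the old master rejoins the dataservice as a slave of the new master.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Datasource:
    name: str
    role: str             # "master" or "slave"
    state: str = "ONLINE"
    master: str = ""      # master this slave replicates from ("" for a master)


@dataclass
class Dataservice:
    nodes: Dict[str, Datasource]
    log: List[str] = field(default_factory=list)

    def current_master(self) -> Datasource:
        return next(n for n in self.nodes.values() if n.role == "master")

    def switch(self, target: str) -> None:
        old = self.current_master()
        new = self.nodes[target]
        if new.role != "slave":
            raise ValueError(f"{target} is not a slave")

        old.state = "OFFLINE"  # step 1: stop writes on the master
        self.log.append(f"1: {old.name} offline")
        new.state = "OFFLINE"  # step 2: stop reads on the slave to promote
        self.log.append(f"2: {new.name} offline")
        self.log.append(f"3: client connections to {old.name} closed")
        self.log.append("4: heartbeat received, THL in sync")

        # Step 5: take the remaining replicators offline, then promote.
        for node in self.nodes.values():
            node.state = "OFFLINE"
        old.role = "slave"     # assumption: old master is demoted to slave
        new.role, new.master, new.state = "master", "", "ONLINE"
        self.log.append(f"5: {new.name} promoted and online")

        # Step 6: point every other datasource at the new master.
        for node in self.nodes.values():
            if node is not new:
                node.master, node.state = new.name, "ONLINE"
        self.log.append(f"6: slaves now replicate from {new.name}")
        self.log.append("7: connectors updated, client connections enabled")
```

For example, building a Dataservice with host1 as master and host2 and host3 as slaves, then calling switch("host2"), leaves host2 as the only master, with host1 and host3 online as its slaves.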

The switching process is monitored by Tungsten Cluster. If the process fails, either because of a timeout or because a recoverable error occurs, the switch operation is rolled back, returning the dataservice to its original configuration. This ensures that the dataservice remains operational. In some circumstances, when performing a manual switch, the command may need to be repeated to ensure that the requested switch operation completes.
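The rollback and repeat behaviour described above can be sketched generically as a snapshot-and-restore wrapper. This is an illustration of the pattern only, not how Tungsten Cluster implements it: SwitchError, switch_with_rollback, switch_with_retries, and the roles dictionary (host name mapped to "master" or "slave") are all hypothetical names.

```python
import copy
from typing import Callable, Dict

Roles = Dict[str, str]  # host name -> "master" or "slave"


class SwitchError(Exception):
    """Raised when a simulated switch step times out or fails recoverably."""


def switch_with_rollback(
    roles: Roles, target: str, promote: Callable[[Roles, str], None]
) -> bool:
    """Attempt a switch; on failure restore the original configuration."""
    snapshot = copy.deepcopy(roles)   # remember the original configuration
    try:
        promote(roles, target)        # may change roles partially, then fail
        return True
    except SwitchError:
        roles.clear()                 # discard any partial changes
        roles.update(snapshot)        # return to the original configuration
        return False


def switch_with_retries(
    roles: Roles, target: str, promote: Callable[[Roles, str], None],
    attempts: int = 3,
) -> bool:
    """Repeat the switch until it completes or the attempts are used up."""
    for _ in range(attempts):
        if switch_with_rollback(roles, target, promote):
            return True
    return False
```

Because the snapshot is restored in place, callers holding a reference to the roles dictionary always see either the original configuration or the fully switched one, never a half-applied state.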

The process takes a finite amount of time to complete, and the exact timing and duration depend on the state, health, and database activity of the dataservice. In particular, the time taken depends on how up to date the slave being promoted is compared to the master. After a delay period, the switch takes place regardless of the current status.