The Tungsten Manager is responsible for monitoring and managing a Tungsten Cluster dataservice. The manager has a number of control and supervisory roles for the operation of the cluster, and acts both as a control and a central information source for the status and health of the dataservice as a whole.
Primarily, the Tungsten Manager handles the following tasks:
Monitors the replication status of each datasource within the cluster.
Communicates with and updates the Tungsten Connector with information about the status of each datasource. In the event of a change of status, the Tungsten Connectors are notified so that queries can be redirected accordingly.
Manages all the individual components of the system. Using the Java JMX system, the manager is able to directly control the different components, change their status, and control the replication process.
Checks the availability of datasources by using either the system ping protocol (the default) or the Echo TCP/IP protocol on port 7 to determine whether a host is available. The protocol to be used can be configured by adjusting the manager properties. For more information, see Section B.2.3.3, “Host Availability Checks”.
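As an illustration of the difference between the two methods (and not the Manager's internal implementation), the following Python sketch performs both styles of check: an ICMP ping via the operating system's ping command (Linux-style flags), and a TCP connection to the Echo service on port 7. The host name db1 is a placeholder.

import socket
import subprocess

def ping_check(host: str, timeout_secs: int = 2) -> bool:
    """Availability check using the system ping command (the default method)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_secs), host],  # Linux-style flags
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

def echo_check(host: str, timeout_secs: int = 2) -> bool:
    """Availability check by opening a TCP connection to the Echo service on port 7."""
    try:
        with socket.create_connection((host, 7), timeout=timeout_secs):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    host = "db1"  # hypothetical host name, for illustration only
    print("ping check:", ping_check(host))
    print("echo check:", echo_check(host))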
Includes an advanced rules engine. The rules engine is used to respond to different events within the cluster and perform the necessary operations to keep the dataservice in an optimal working state. During any change in status, whether user-initiated or automatically triggered by a failure, the rules are used to decide whether to restart services, swap Primaries, or reconfigure connectors.
In order to avoid split-brain, a cluster needs an odd number of members so that, in the event of a network partition, a majority of the members can still exist in one of the partitions. If there is no majority, it is not possible to establish a quorum, and a partition that contains the Primary but no majority will end up with a shunned Primary until such time as a quorum is established.
To operate with an even number of database nodes, a witness node is required, preferably an active witness, since the dynamics of establishing a quorum are more likely to succeed with an active witness than with a passive witness.
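To make the quorum arithmetic concrete, here is a minimal Python sketch (illustrative only) of the majority rule described above:

def has_quorum(partition_size: int, total_members: int) -> bool:
    """A partition can establish quorum only with a strict majority of all members."""
    return partition_size > total_members // 2

# Four members (even): a clean 2/2 split leaves no quorum on either side.
print(has_quorum(2, 4))                      # False

# Adding a witness gives five members (odd): any two-way split is 3/2 or 4/1,
# so exactly one side always retains quorum.
print(has_quorum(3, 5), has_quorum(2, 5))    # True False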
Did you ever wonder just what the Tungsten Manager is thinking when it does an automatic failover or a manual switch in a cluster?
What factors does the Tungsten Manager take into account when it picks a replica to fail over to?
This section details the steps the Tungsten Manager takes to perform a switch or a failover.
This section covers both the process and some possible reasons why that process might not complete, along with best practices and ways to monitor the cluster for each situation.
When we say “role” in the context of a cluster datasource, we are talking about the view of a database node from the Tungsten Manager's perspective.
These roles apply to the node datasource at the local (physical) cluster level, and to the composite datasource at the composite cluster level.
Possible roles are:
Primary
A database node which is writable, or
A composite cluster which is active (contains a writable primary)
Relay
A read-only database node which pulls data from a remote cluster and shares it with downstream replicas in the same cluster
Replica
A read-only database node which pulls data from a local-cluster primary node, or from a local-cluster relay node for passive composite clusters
A composite cluster which is passive (contains a relay but NO writable primary)
One of the great powers of the Tungsten Cluster is that the roles for both cluster nodes and composite cluster datasources can be moved to another node or cluster, either at will via the cctrl> switch command, or by having an automatic failover invoked by the Tungsten Manager layer.
Please note that while failovers are normally automatic and triggered by the Tungsten Manager, a failover can also be invoked manually via the cctrl command if ever needed.
There are key differences between the manual switch and automatic failover operations:
Switch
Switch attempts to perform the operation as gracefully as possible, so there will be a delay as all of the steps are followed to ensure zero data loss
When the switch sub-command is invoked within cctrl, the Manager will cleanly close connections and ensure replication is caught up before moving the Primary role to another node
Switch recovers the original Primary to be a Replica
Please see Section 6.5.2, “Manual Primary Switch”.
Failover
Failover is immediate, and could possibly result in data loss, even though we do everything we can to get all events moved to the new Primary
Failover leaves the original primary in a SHUNNED state
Connections are closed immediately
Use the cctrl> recover command to make the failed Primary into a Replica once it is healthy
Please see both Section 6.5.1, “Automatic Primary Failover” and Section 8.2, “Tungsten Manager Failover Behavior”
For even more details, please visit: Section 6.5, “Switching Primary Hosts”
Picking a target replica node from a pool of candidate database replicas involves several checks and decisions.
For switch commands, for both physical and composite services, the user can pass in the name of the physical or composite replica that is to be the target of the switch. If no target is passed in, or if the operation is an automatic failover, the Manager has logic to identify the 'most up to date' replica, which then becomes the target of the switch or failover.
Here are the steps used to pick a new primary database node from the available replicas, in order:
Skip any replica that is either not online or that is a standby replica.
Skip any replica that has its status set to ARCHIVE
Skip any replica that does not have an online manager.
Skip any replica that does not have a replicator in either online or synchronizing state.
Now we have a target datasource prospect...
By comparing the last applied sequence number of the current prospect against each previously seen prospect, we eventually end up with the replica that has the highest applied sequence number. We also keep track of the prospect that has the highest stored sequence number.
If two prospects are tied on the highest applied or stored sequence number, we compare their datasource precedence and choose the one with the lowest precedence number (i.e. a precedence of 1 is preferred over a precedence of 2). If the precedence is also tied, we keep the previously chosen replica and discard the replica currently being evaluated.
After all of the replicas have been evaluated, we either have a single winner, or we have one replica with the highest applied sequence number and a different replica with the highest stored sequence number (i.e. it has received the most THL records from the Primary prior to the operation). In that case, which is particularly important during a failover, we choose the replica with the highest stored sequence number.
After looping over all available replicas, check the selected target Replica's appliedLatency to see if it is higher than the configured threshold (default: 900 seconds). If the appliedLatency is too far behind, do not use that Replica. The tpm option --property=policy.slave.promotion.latency.threshold=900 controls this check.
If no viable Replica is found (or if there is no available Replica to begin with), there will be no switch or failover at this point.
Finally, the chosen target Replica (if any) is returned to the switch or failover command so that the operation can proceed. A simplified sketch of the whole selection procedure is shown below.
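To tie the rules above together, here is a minimal Python sketch of the selection procedure. It is an illustration only, not the Manager's actual implementation: the Prospect fields, their names, and the state strings are simplified assumptions based on the description above, and the latency_threshold parameter stands in for policy.slave.promotion.latency.threshold.

from dataclasses import dataclass
from typing import Iterable, Optional

@dataclass
class Prospect:
    """Hypothetical, simplified view of one candidate Replica."""
    name: str
    online: bool                  # datasource is ONLINE
    standby: bool                 # standby replicas are never promoted
    archive: bool                 # ARCHIVE datasources are never promoted
    manager_online: bool          # the Manager on that node is reachable
    replicator_state: str         # e.g. "ONLINE" or "GOING-ONLINE:SYNCHRONIZING"
    applied_seqno: int            # last THL sequence number applied to the database
    stored_seqno: int             # last THL sequence number stored on disk
    precedence: int               # 1 is preferred over 2, and so on
    applied_latency: float        # seconds behind the Primary

def pick_target(replicas: Iterable[Prospect],
                latency_threshold: float = 900.0) -> Optional[Prospect]:
    """Return the most up-to-date eligible Replica, or None if there is no viable target."""
    best = None
    for r in replicas:
        # Filtering rules: skip anything that cannot safely become the Primary.
        if not r.online or r.standby or r.archive:
            continue
        if not r.manager_online:
            continue
        if r.replicator_state not in ("ONLINE", "GOING-ONLINE:SYNCHRONIZING"):
            continue
        if best is None:
            best = r
            continue
        # Prefer the prospect with the most THL stored, then the most applied,
        # then the lowest (strongest) precedence number; on a complete tie the
        # previously chosen prospect is kept.
        if (r.stored_seqno, r.applied_seqno, -r.precedence) > \
           (best.stored_seqno, best.applied_seqno, -best.precedence):
            best = r
    # Final check: a candidate that is too far behind is not used at all.
    if best is not None and best.applied_latency > latency_threshold:
        return None
    return best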
For more details on automatic failover versus manual switch, please visit: Section 8.4.1, “Manual Switch Versus Automatic Failover”
For more details on switch and failover steps for local clusters, please visit:
For more details on switch and failover steps for composite services, please visit:
What are the best practices for ensuring the cluster always behaves as expected? Are there any reasons for a cluster NOT to fail over? If so, what are they?
Here are three common reasons that a cluster might not failover properly:
Policy Not Automatic
BEST PRACTICE: Ensure the cluster policy is automatic unless you specifically need it to be otherwise
SOLUTION: Use the check_tungsten_policy command to verify the policy status
Complete Network Partition
If the nodes are unable to communicate cluster-wide, then all nodes will go into a FailSafe-Shun mode to protect the data from a split-brain situation.
BEST PRACTICE: Ensure that all nodes are able to see each other via the required network ports
SOLUTION: Verify that all required ports are open between all nodes local and remote - see Section B.2.3.1, “Network Ports”
SOLUTION: Use the check_tungsten_online command to check the DataSource State on each node
No Available Replica
BEST PRACTICE: Ensure there is at least one ONLINE node that is not in STANDBY or ARCHIVE mode
SOLUTION: Use the check_tungsten_online command to check the DataSource State on each node
BEST PRACTICE: Ensure that the Manager is running on all nodes
SOLUTION: Use the check_tungsten_services command to verify that the Tungsten processes are running on each node
BEST PRACTICE: Ensure all Replicators are either ONLINE or GOING ONLINE:SYNCHRONIZING
SOLUTION: Use the check_tungsten_online command to verify that the Replicator (and Manager) is ONLINE on each node
BEST PRACTICE: Ensure the replication applied latency is under the threshold, default 900 seconds
SOLUTION: Use the check_tungsten_latency command to check the latency on each node
Below are examples of all the health-check tools listed above:
shell> check_tungsten_services -c -r
CRITICAL: Connector, Manager, Replicator are not running
shell> startall
Starting Replicator normally
Starting Tungsten Replicator Service...
Waiting for Tungsten Replicator Service.......
running: PID:14628
Starting Tungsten Manager Service...
Waiting for Tungsten Manager Service..........
running: PID:15143
Starting Tungsten Connector Service...
Waiting for Tungsten Connector Service.......
running: PID:15513
shell> check_tungsten_services -c -r
OK: All services (Connector, Manager, Replicator) are running
shell> check_tungsten_policy
CRITICAL: Manager is not running
shell> manager start
shell> check_tungsten_policy
CRITICAL: Policy is MAINTENANCE
shell> cctrl
cctrl> set policy automatic
cctrl> exit
shell> check_tungsten_policy
OK: Policy is AUTOMATIC
shell> check_tungsten_latency -w 100 -c 200
CRITICAL: Manager is not running
shell> manager start
shell> check_tungsten_latency -w 100 -c 200
CRITICAL: db8=65107.901s, db9 is missing latency information
shell> cctrl
cctrl> cluster heartbeat
cctrl> exit
shell> check_tungsten_latency -w 100 -c 200
WARNING: db9 is missing latency information
shell> cctrl
cctrl> set policy automatic
cctrl> exit
shell> check_tungsten_latency -w 100 -c 200
OK: All replicas are running normally (max_latency=4.511)
shell> check_tungsten_online
CRITICAL: Manager is not running
shell> manager start
shell> check_tungsten_online
CRITICAL: Replicator is not running
shell> replicator start
shell> check_tungsten_online
CRITICAL: db9 REPLICATION SERVICE north is not ONLINE
shell> trepctl online
shell> check_tungsten_online
CRITICAL: db9 REPLICATION SERVICE north is not ONLINE
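The same health checks can also be scripted for external monitoring. The following Python sketch is illustrative only; it assumes the check_tungsten_* commands are available in the PATH and follow the usual Nagios-style convention of exiting with 0 when the result is OK.

import subprocess

# The health checks shown in the examples above, with the same options.
CHECKS = [
    ["check_tungsten_services", "-c", "-r"],
    ["check_tungsten_policy"],
    ["check_tungsten_online"],
    ["check_tungsten_latency", "-w", "100", "-c", "200"],
]

def run_checks() -> int:
    """Run each check, print its output, and return the number of failing checks."""
    failures = 0
    for cmd in CHECKS:
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(result.stdout.strip() or result.stderr.strip())
        if result.returncode != 0:   # assumed Nagios-style: 0 means OK
            failures += 1
    return failures

if __name__ == "__main__":
    raise SystemExit(run_checks())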