Tungsten Clustering (for MySQL) 7.1 Manual

Question

8.2.5.3.1.

Does an ICMP packet drop relate to an invalid membership issue in Tungsten?

Answer 1

No. Only a MemberHeartbeatGap (not getting a heartbeat reply for 10 seconds) will trigger a MembershipInvalidAlarm.

Answer 2

Each Manager independently invokes the fail-safe script if it is alone in the quorum.

Answer 3

This can be seen only in your network monitoring logs. The Manager logs show only the result of the "ping -c 1 -w 5 hostname" command that is run during each membership validity check, when the timer is incremented. During the 10-second period the network can be flapping (up-down-up-down), and this is NOT visible in the Manager logs. The membership validity check shows only whether the network was up or down at the exact time of the check.

Answer 4

Failover can happen for the following reasons:

If the primary MySQL server goes down. This has nothing to do with our case.
If the host with the primary MySQL server goes down. This has nothing to do with our case.
If the host with the primary MySQL server gets isolated AND the other two hosts can still form a majority of the quorum. This has nothing to do with our case.
If the host with the primary MySQL server gets isolated AND the other hosts also get isolated from each other. In this case failover does not happen (there are only single Managers isolated from each other) and the cluster goes into FAILSAFE-SHUN. This is our case.

This means that setting the threshold to 6 only keeps the primary online long enough to ride out the network blip. Data loss after a failover can happen ONLY if the cluster operator issues a recover command in force mode and ignores the error displayed. The error is displayed when transactions remain in the old primary's binlog that were not replicated to the new Primary. In this case the recover command reports the error and does not proceed, and the operator needs to manually apply those transactions to the new Primary. If the operator forces the recover, data loss will occur.

Answer 5

The message "Non-validated database members" in the Manager's tmsvc.log means either that no Manager was up and running on that host, or that the Manager is not reachable because of a network breakage.

Answer 6

Automatic failover is initiated by the coordinator. For a coordinator to be able to fail over, it must be in a partition that has a majority of nodes. If every Manager is in its own single-member partition, no failover can be expected. Only a FAILSAFE-SHUN can happen at this point.
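
The majority rule described in this answer can be illustrated with a minimal Python sketch. The function names and the returned strings are hypothetical and are not part of Tungsten; the real Manager takes more state into account than partition sizes alone.

```python
def has_majority(partition_size: int, cluster_size: int) -> bool:
    """Return True when the partition holds a strict majority of all members."""
    return partition_size > cluster_size // 2


def partition_outcome(partition_size: int, cluster_size: int) -> str:
    """Illustrative outcome for the coordinator's partition (labels are hypothetical)."""
    if has_majority(partition_size, cluster_size):
        return "failover possible"
    if partition_size == 1:
        # A Manager alone in its partition can only end up in FAILSAFE-SHUN.
        return "failsafe-shun possible"
    return "no automatic failover"
```

In a three-node cluster, a partition of two members keeps the majority, while three isolated Managers each see a partition of one and none of them can fail over.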

Answer 7

No. A FAILSAFE-SHUN will only happen if the Manager is alone in the JGroups partition.

Answer 8

Tungsten initiates a FAILSAFE-SHUN when a MemberHeartbeatGap invokes a MembershipInvalidAlarm. Only a MembershipInvalidAlarm can cause a FAILSAFE-SHUN, and only if the host is alone in the group.
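
As a rough sketch of this chain of conditions, the following Python models the two checks. The names are hypothetical, the 10-second gap is taken from the first answer, and treating exactly 10 seconds as a gap is an assumption; this is not the Manager's actual implementation.

```python
HEARTBEAT_GAP_SECONDS = 10  # from the text; using ">=" as the boundary is an assumption


def membership_invalid_alarm(seconds_since_last_reply: float) -> bool:
    """A heartbeat gap of the configured length raises the alarm."""
    return seconds_since_last_reply >= HEARTBEAT_GAP_SECONDS


def failsafe_shun(seconds_since_last_reply: float, group_members: set, me: str) -> bool:
    """FAILSAFE-SHUN needs both the alarm and the host being alone in the group."""
    alone = group_members == {me}
    return membership_invalid_alarm(seconds_since_last_reply) and alone
```

With this model, a 12-second gap on a host that still shares the group with another member does not lead to a FAILSAFE-SHUN.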

Answer 9

FAILSAFE-SHUN can happen even without a ping failure (for example, when the host is up and running) if the other Managers are down (stopped or restarting).

Answer 10

Not recommended - ICMP is the best practice as seen over years of experience in the field.

When ICMP is failing, TCP will fail too. If a network is unstable, using TCP to somehow mask the instability is not the correct approach to checking network health for a database cluster.

Our implementation of JGroups uses TCP to communicate with the group members and the ping utility uses ICMP.

Answer 11

The correct way to analyze the Manager logs is to follow the provided example above. Start by locating the triggered events and then follow the included explanations.

Grep for HEARTBEAT GAP DETECTED FOR MEMBER and look for which events trigger other events.

Also, JGroups logging is disabled by default. The best practice is to keep JGroups logging disabled because it will rapidly fill the logs and is very difficult to interpret.

Answer 12

Heartbeat events are sent via TCP.

Answer 13

Our JGroups implementation uses TCP to communicate.

Answer 14

GC VIEW OF CURRENT DB MEMBERS IS: db1, db2, db3 - this is how the group communication (JGroups GC) sees members
VALIDATED DB MEMBERS ARE: db1, db3 - managers that are up and running and respond to the pingMember command (TCP)
REACHABLE DB MEMBERS ARE: db1, db3, db2 - host reachable by ping (host ping utility ICMP)
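
To make the difference between the three lists easier to see, here is a small Python sketch that compares them as sets, using the member names from the example above. The function and key names are hypothetical, and the interpretation in the comments is a reading of the explanations above, not output produced by Tungsten.

```python
def classify_members(gc_view: set, validated: set, reachable: set) -> dict:
    """Compare the three membership lists reported in the Manager log."""
    return {
        # Host answers ICMP ping, but its Manager did not answer pingMember.
        "manager_not_validated": sorted((gc_view & reachable) - validated),
        # Host does not answer ICMP ping at all.
        "host_unreachable": sorted(gc_view - reachable),
    }


result = classify_members(
    gc_view={"db1", "db2", "db3"},
    validated={"db1", "db3"},
    reachable={"db1", "db3", "db2"},
)
print(result)  # {'manager_not_validated': ['db2'], 'host_unreachable': []}
```

In this example db2 is reachable by ping but not validated, which points at the Manager on db2 rather than at the host itself.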

Answer 15

You need to specify two options: thl-port to set the Replica THL listener port and master-thl-port to define the upstream Primary THL listener port. Otherwise thl-port alone sets BOTH.

Answer 16

To update the IP address used by one or more hosts in your cluster, you must perform the following steps:

If possible, switch the node into SHUNNED mode.
Reconfigure the IP address on the machine.
Update the hostname lookup, for example, by editing the IP configuration in /etc/hosts.
Restart the networking to reconfigure the service.
On the node that has changed IP address, run the following from the software staging directory:
```
shell> tools/tpm update --replace-release
```
On all other nodes within the cluster:
1. Update the hostname lookup for the new node, for example, by updating the IP configuration in /etc/hosts.
2. Place the cluster into MAINTENANCE mode.
3. Execute the following from the software staging directory:
```
shell> tools/tpm update --replace-release
```

Answer 17

If you need to change the password used by Tungsten Cluster to connect to a dataserver and apply changes, first change the password within your dataserver, and then update the configuration using tpm update. The new password is not checked until the Tungsten Replicator process starts. Changing the password and then updating the configuration will keep replication from failing.

Within cctrl set the MAINTENANCE policy mode:
```
cctrl> set policy maintenance
```
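Within MySQL, update the password for the user, allowing the change to be replicated to the other datasources:
```
mysql> SET PASSWORD FOR tungsten@'%' = PASSWORD('new_pass');
```
On MySQL 8.0 and later, where the PASSWORD() function has been removed, use ALTER USER tungsten@'%' IDENTIFIED BY 'new_pass'; instead.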
Update the value of datasource-password (datasource-password=new_pass) in your /etc/tungsten/tungsten.ini and then run tpm update.
Set the policy mode in cctrl back to AUTOMATIC:
```
cctrl> set policy automatic
```

Answer 18

The most likely culprit for this issue is that the time is different on the machine in question. If you have ntp or a similar network time tool installed on your machine, use it to update the current time across all the hosts within your deployment:

shell> ntpdate pool.ntp.org

Once the command has been executed across all the hosts, try sending a heartbeat from the Primary to the Replicas and check the latency:

shell> trepctl heartbeat

Answer 19

Both the replicate.do and replicate.ignore filters apply to both DML and DDL.

DDL is currently ONLY replicated for MySQL to MySQL topologies, or within MySQL Clusters, although it would be advisable not to use ignore/do filters in a clustered environment where data/structural integrity is key.

With replicate.do, all DML and DDL will be replicated ONLY for any database or table listed as part of the do filter.

With replicate.ignore, all DML and DDL will be replicated except for any database or table listed as part of the ignore filter.
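
The two behaviours can be sketched in a few lines of Python. This is a simplified illustration only: the function names are hypothetical, and filter entries are assumed to be either a plain database name or a `database.table` pair, without the other matching rules the real filters may support.

```python
def _matches(schema: str, table: str, entries: list) -> bool:
    """True if the database or the database.table pair is listed in the filter."""
    return any(entry == schema or entry == f"{schema}.{table}" for entry in entries)


def replicated_with_do(schema: str, table: str, do_entries: list) -> bool:
    """replicate.do: only listed databases or tables are replicated."""
    return _matches(schema, table, do_entries)


def replicated_with_ignore(schema: str, table: str, ignore_entries: list) -> bool:
    """replicate.ignore: everything except listed databases or tables is replicated."""
    return not _matches(schema, table, ignore_entries)
```

For example, with a do list of `["sales", "hr.staff"]`, changes to `sales.orders` and `hr.staff` are replicated while changes to `hr.payroll` are not. With the same entries in an ignore list, the result is reversed.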

Answer 20

You can change the configuration by adding or adjusting the following property in /etc/tungsten/tungsten.ini and then running tpm update:

java-mem-size=2048

Answer 21

The use of triggers can cause many issues with replication, and the effects can differ between binlog-format settings (such as ROW vs STATEMENT).

For a full explanation, see Section C.4.1, “Triggers”.

Answer 22

This is a normal deployment pattern for working in AWS to reduce risk. A single cluster works quite well in this topology.

Answer 23

Standard settings work out of the box. Fine tuning can be done by working with the specific customer application during a Proof-Of-Concept or Production roll-out.

Answer 24

This is not something currently supported.

Continuent Term	Traditional Term	Description
composite dataservice	Multi-Site Cluster	A configured Tungsten Cluster service consisting of multiple dataservices, typically at different physical locations.
dataservice	Cluster	The collection of machines that make up a single Tungsten Dataservice. Individual hosts within the dataservice are called datasources. Each dataservice is identified by a unique name, and multiple dataservices can be managed from one server.
dataserver	Database	The database on a host.
datasource	Host or Node	One member of a dataservice and the associated Tungsten components.
staging host	-	The machine (and directory) from which Tungsten Cluster™ is installed and configured. The machine does not need to be the same as any of the existing hosts in the dataservice.
active witness	-	A machine in the dataservice that runs the manager process but is not running a database server. This server will be used to establish quorum in the event that a datasource becomes unavailable.
coordinator	-	The datasource or active witness in a dataservice that is responsible for making decisions on the state of the dataservice. The coordinator is usually the member that has been running the longest. It will not always be the Primary. When the manager process on the coordinator is stopped, or no longer available, a new coordinator will be chosen from the remaining members.

Tungsten Term	Traditional Term	Description
composite dataservice	Multi-Site Cluster	A configured Tungsten Cluster service consisting of multiple dataservices, typically at different physical locations.
dataservice	Cluster	A configured Tungsten Cluster service consisting of dataservers, datasources and connectors.
dataserver	Database	The database on a host. Datasources include MySQL, PostgreSQL or Oracle.
datasource	Host or Node	One member of a dataservice and the associated Tungsten components.
staging host	-	The machine from which Tungsten Cluster is installed and configured. The machine does not need to be the same as any of the existing hosts in the cluster.
staging directory	-	The directory where the installation files are located and the installer is executed. Further configuration and updates must be performed from this directory.
connector	-	A connector is a routing service that provides management for connectivity between application services and the underlying dataserver.
Active Witness host	-	A witness host is a host that runs the manager process but is not running a database server. This server will be used to establish quorum in the event that a datasource becomes unavailable

	Tungsten Cluster Service Seqno		Tungsten Replicator Service Seqno
Operation	`east`	`west`	`east`	`west`
Insert/update data on `east`	Seqno Increment		Seqno Increment
Insert/update data on `west`		Seqno Increment		Seqno Increment

Command	Description
trepctl status	Shows basic variables including overall latency of Replica and number of apply channels
trepctl status -name shards	Shows the number of transactions for each shard
trepctl status -name stores	Shows the configuration and internal counters for stores between tasks
trepctl status -name tasks	Shows the number of transactions (events) and latency for each independent task in the replicator pipeline

Role	Supplies Replication Data	Receives Replication Data	Load Balancing	Failover
`master`	Yes	No	Yes	Yes
`slave`	No	Yes	Yes	Yes
`relay`	Yes	Yes	Yes	Yes
`standby`	No	Yes	No	Yes
`archive`	No	Yes	Yes	No

Operation	State
Node operating normally	`ONLINE`
Administrator puts node into offline state	`GOING-OFFLINE`
Node is offline	`OFFLINE:NORMAL`
Administrator puts node into online state	`GOING-ONLINE:SYNCHRONIZING`
Node catches up with Extractor	`ONLINE`

Operation	State
Node operating normally	`ONLINE`
Failure causes the node to go offline	`OFFLINE:ERROR`
Administrator fixes error and puts node into online state	`GOING-ONLINE:SYNCHRONIZING`
Node catches up with Extractor	`ONLINE`

Datasource State	Alert STATUS
ONLINE	OK
OFFLINE	WARN (for non-composite datasources)
OFFLINE	DIMINISHED (for composite passive replica)
OFFLINE	CRITICAL (for composite active primary)
FAILED	CRITICAL
SHUNNED	SHUNNED

	Policy Mode
Ruleset	Automatic	Manual	Maintenance
Monitoring	Yes	Yes	Yes
Fault Detection	Yes	Yes	No
Failure Fencing	Yes	Yes	No
Failure Recovery	Yes	No	No

Path	Supported
ini, in place	Yes
ini, with Primary switch	No
Staging	No
Staging, with --no-connectors	No

Step	Description	Command	host1	host2	host3
1	Initial state		Primary	Replica	Replica
2	Set the `MAINTENANCE` policy	set policy maintenance	Primary	Replica	Replica
3	Switch Primary	switch to host2	Replica	Primary	Replica
4	Shun `host1`	datasource host1 shun	Shunned	Primary	Replica
5	Perform maintenance		Shunned	Primary	Replica
6	Recover the Replica ( `host1`)	datasource host1 recover	Replica	Primary	Replica
7	Ensure the Replica has caught up		Replica	Primary	Replica
8	Switch Primary back to `host1` (optional)	switch to host1	Primary	Replica	Replica
9	Set `AUTOMATIC` policy	set policy automatic	Primary	Replica	Replica

Exporter	Port	Description	Scope
node	9100	Metrics for the underlying node "hardware"	External
mysql	9104	Metrics for the MySQL server	External
replicator	8091	Metrics for the Tungsten Replicator	Internal
manager	8092	Metrics for the Tungsten Manager	Internal
connector	8093	Metrics for the Tungsten Connector	Internal

Metric	Description
`tungsten_replicator_version`	The Tungsten software version number
`tungsten_replicator_service`	Replicator service value
`tungsten_replicator_seqno`	Replicator min/max/current sequence number value
`tungsten_replicator_latency`	Replicator applied/relative latency value

	Routing Method	QoS	Latency	Affinity
Global Configuration	Yes	Implied	Yes	Yes
Connection String	Yes	Yes	Yes	Yes
`user.map`	Yes	Yes	Yes	Yes
SQL statement	Yes (with SQL routing enabled)	Yes (with SQL routing enabled)	No	No

Routing Method	Host Selection	Auto R/W Splitting	Replica Latency	Maximum Applied Latency
Smartscale	By Session	Yes (by SQL statement)	Lazy	Yes
Direct Reads	By Content	Yes (by SQL statement)	Lazy	Yes
Host-based	By Hostname	No	Yes	Yes
Port-based	By Network Port	No	No	Yes
SQL-based	By SQL comment	No	No	Yes

QoS	Primary Selected	Replica Selected
`RW_STRICT`	Yes, always	No.
`RO_RELAXED`	Only if no Replica available	Yes, if below max applied latency.
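
The selection rules in this table can be sketched as follows. This is an illustrative Python model with hypothetical names. It assumes that a Replica at or above the maximum applied latency is treated as unavailable, and it simply takes the first eligible Replica; in practice the choice among eligible Replicas is made by the configured load balancer.

```python
def select_datasource(qos: str, primary: str, replicas: list, max_applied_latency: float) -> str:
    """replicas is a list of (name, applied_latency_in_seconds) tuples."""
    if qos == "RW_STRICT":
        # The Primary is always selected, never a Replica.
        return primary
    if qos == "RO_RELAXED":
        eligible = [name for name, latency in replicas if latency < max_applied_latency]
        # Fall back to the Primary only when no Replica qualifies.
        return eligible[0] if eligible else primary
    raise ValueError(f"QoS not covered by this sketch: {qos}")
```

For instance, with Replicas `[("db2", 40.0), ("db3", 2.0)]` and a maximum applied latency of 10 seconds, an `RO_RELAXED` request goes to db3, and it goes to the Primary if both Replicas are lagging by more than 10 seconds.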

Load Balancer	Default QoS	Description
DefaultLoadBalancer	`RW_STRICT`	Always selects the Primary data source
MostAdvancedSlaveLoadBalancer	`RO_RELAXED`	Selects the Replica that has replicated the most events, by comparing data sources "high water" marks. If no Replica is available, the Primary will be returned.
LowestLatencySlaveLoadBalancer		Selects the Replica data source that has the lowest replication lag, shown as `appliedLatency` in the ls -l output within cctrl. If no Replica data source is eligible, the Primary data source will be selected.
RoundRobinSlaveLoadBalancer		Selects a Replica in a round-robin manner, by iterating through them using an internal index. Returns the Primary if no Replica is found online.
HighWaterSlaveLoadBalancer	`RW_SESSION`	Given a session high water (usually the high water mark of the update event), selects the first Replica that has higher or equal high water, or the Primary if no Replica is online or has replicated the given session event. This is the default used when SmartScale is enabled.
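
As an illustration of the lowest-latency strategy described in the table, the following Python sketch picks the Replica with the smallest applied latency and falls back to the Primary. The function name and the input format are hypothetical and do not reflect the connector's internal classes.

```python
def lowest_latency_choice(primary: str, replica_latency: dict) -> str:
    """replica_latency maps each eligible Replica name to its appliedLatency in seconds."""
    if not replica_latency:
        # No eligible Replica: the Primary is selected.
        return primary
    return min(replica_latency, key=replica_latency.get)
```

With `{"db2": 1.5, "db3": 0.2}` the sketch selects db3, and with an empty mapping it returns the Primary.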

Auto Read/Write Splitting	Yes
Primary Selection	Automatically, by SQL examination
Replica Selection	Automatically, by SQL examination
QoS Compatibility	None
SmartScale Compatibility	None

Auto Read/Write Splitting	No
Primary Selection	Manually, by SQL comments
Replica Selection	Manually, by SQL comments
QoS Compatibility	Supported
SmartScale Compatibility	Yes
Direct Compatibility	Yes

Auto Read/Write Splitting	No
Primary Selection	Manually, by hostname/IP address
Replica Selection	Manually, by hostname/IP address
QoS Compatibility	None
SmartScale Compatibility	None

Auto Read/Write Splitting	No
Primary Selection	Manually, by network port
Replica Selection	Manually, by network port
QoS Compatibility	None
SmartScale Compatibility	None

Feature	Proxy Mode	Bridge Mode
Primary/Replica Selection	Yes	Yes
Switch/Failover	Yes	Yes
Automatic Read/Write Splitting	Yes	No
Application-based Read/Write Splitting	Yes	Yes
Seamless Reconnects	Yes	No
Data Source Selection	Current data source is checked to confirm latency and affinity	Pass-through
Session KeepAlive	Yes	No

Option	Description
`client-list`	Return a list of the current client connections through this connector.
`cluster-status`	Return the cluster status, as the connector currently understands it. This is the command-line alternative to the inline cluster status command.
`condrestart`	Restart only if already running
`console`	Launch in the current console (instead of a daemon)
`drain [seconds]`	An alias for graceful-stop, available as of v7.0.0.
`dump`	Request a Java thread dump (if connector is running)
`graceful-stop [seconds]`	Stops the connector gracefully, allowing outstanding open connections to finish and close before the connector process is stopped. All new connection requests are denied. The Connector will shut down as soon as there are no active connections. [seconds] is an integer specifying the optional time to wait before terminating the connector. Specifying no value for seconds will cause the Connector to wait indefinitely for all connections to finish. Specifying zero (0) seconds will cause the Connector to shut down immediately without waiting for existing connections to complete gracefully. As of v7.0.0, connector drain is available as an alias for connector graceful-stop.
`install`	Install the service to automatically start when the system boots
`mode`	Displays the mode the connector is running in, either "proxy" or "bridge"
`reconfigure`	Reconfigure the connector by forcing the connector to reread the configuration, including the configuration files and `user.map`.
`remove`	Remove the service from starting during boot
`restart`	Stop connector if already running and then start
`start`	Start in the background as a daemon process
`status`	Query the current status
`stop`	Stop if running (whether as a daemon or in another console). An optional timeout in seconds can be provided (from release 6.1.19 only).
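
The meaning of the optional [seconds] value for graceful-stop, as described in the table above, can be summarised with a short Python sketch. The function and the `drain_time` parameter (the time the existing connections would need to finish) are hypothetical and only model the documented behaviour.

```python
from typing import Optional


def graceful_stop_wait(seconds: Optional[int], drain_time: float) -> float:
    """Return how long the connector waits before it stops, in seconds."""
    if seconds is None:
        # No value given: wait until all connections have finished.
        return drain_time
    if seconds == 0:
        # Zero: shut down immediately without waiting.
        return 0.0
    # Otherwise stop when connections finish or the timeout expires, whichever is first.
    return min(float(seconds), drain_time)
```

If the open connections need 42 seconds to complete, a value of 30 stops the connector after 30 seconds, while omitting the value lets it wait the full 42 seconds.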

Option	Description
tungsten cluster status	Displays a detailed view of the information the connector has about the cluster
tungsten connection count	Display the current number of active connections to each datasource
tungsten connection status	Displays information about the connection status for the last statement executed
tungsten flush privileges	Reload the user.map file and update the user credentials
tungsten gc	Executes the connector garbage collector to free memory
tungsten help	Shows the help description for each statement
tungsten mem info	Display the memory usage information for the connector
tungsten show processlist	List all active queries on this connector instance
tungsten show variables	Display the connector configuration options currently in use

Option	`-admin`
Description	Enter admin mode when connecting
Value Type	string

Option	`-expert`
Description	Enter expert mode when connecting
Value Type	string

Option	`-host`
Description	Host name of the service manager to use
Value Type	string
Default	localhost

Option	`-logical`
Description	Enter logical mode when connecting
Value Type	string

Option	`-multi`
Description	Allow support for connecting to multiple services
Value Type	string

Continuent Ltd

Preface

1. Legal Notice

2. Conventions

3. Quickstart Guide

Chapter 1. Introduction

1.1. Tungsten Replicator

1.1.1. Transaction History Log (THL)

1.2. Tungsten Manager

1.3. Tungsten Connector

Chapter 2. Deployment

2.1. Host Types

2.1.1. Manager Hosts

2.1.2. Connector (Router) Hosts

2.1.3. Replicator Hosts

2.1.4. Active Witness Hosts

2.2. Deployment Sources

2.2.1. Using the TAR/GZipped files

2.2.2. Using the RPM package files

2.3. Common tpm Options During Deployment

2.4. Best Practices

2.4.1. Best Practices: Deployment

2.4.2. Best Practices: Upgrade

2.4.3. Best Practices: Operations

2.4.4. Best Practices: Maintenance

Chapter 3. Deployment: MySQL Topologies

3.1. Deploying Standalone HA Clusters

3.1.1. Prepare: Standalone HA Cluster

3.1.2. Install: Standalone HA Cluster

3.2. Deploying Composite Active/Passive Clustering

3.2.1. Prepare: Composite Active/Passive Cluster

3.2.2. Install: Composite Active/Passive Cluster

3.2.3. Adding a remote Composite Cluster

3.3. Deploying Composite Active/Active Clusters

3.3.1. Prepare: Composite Active/Active Clusters

3.3.2. Install: Composite Active/Active Clusters

3.3.3. Best Practices: Composite Active/Active Clusters

3.3.4. Resetting a single dataservice

3.3.5. Resetting all dataservices

3.3.6. Dataserver maintenance

3.3.6.1. Fixing Replication Errors

3.3.6.1.1. Recovering Cross Site Services

3.3.7. Adding a Cluster to a Composite Active/Active Topology

3.3.7.1. Prerequisites

3.3.7.2. Backup and Restore

3.3.7.3. Update Existing Configuration

3.3.7.4. New Host Configuration

3.3.7.5. Install on new nodes

3.3.7.6. Update existing nodes

3.3.7.7. Start the new cluster

3.3.7.8. Validate and check

3.4. Deploying Composite Dynamic Active/Active

3.4.1. Enabling Composite Dynamic Active/Active

3.5. Deploying Multi-Site/Active-Active Clustering

3.5.1. Prepare: Multi-Site/Active-Active Clusters

3.5.2. Install: Multi-Site/Active-Active Clusters

3.5.3. Best Practices: Multi-Site/Active-Active Clusters

3.5.4. Resetting a single dataservice

Option	`-no-history`
Description	Disable command history
Value Type	string

Option	`-physical`
Description	Enter physical mode when connecting
Value Type	string

Option	`-port`
Description	Specify the TCP/IP port of the service manager
Value Type	string
Default	9997

Option	`-proxy`
Description	Operate as a proxy service
Value Type	string

Option	`-service`
Description	Connect to a specific service
Value Type	string

Option	`-timeout`
Description	Specify timeout (in seconds) to determine how long to wait before timing out when unable to connect to the manager. Default 30 seconds
Value Type	string

Option	Description
admin	Change to admin mode
cd	Change to a specific site within a multisite service
cluster	Issue a command across the entire cluster
cluster topology validate	Check, validate and report on current cluster topology and health.
cluster validate	Validate the cluster quorum configuration
create composite	Create a composite dataservice
datasource	Issue a command on a single datasource
expert	Change to expert mode
failover	Perform a failover operation from a primary to a replica
help	Display the help information
ls	Show cluster status
members	List the managers of the dataservice
physical	Enter physical mode
ping	Test host availability
quit, exit	Exit cctrl
recover master using	Recover the Primary within a datasource using the specified Primary
replicator	Issue a command on a specific replicator
router	Issue a command on a specific router (connector)
service	Run a service script
set	Set management options
set master	Set the Primary within a datasource
show topology	Shows the currently configured cluster topology
switch	Promote a Replica to a Primary

Option	Description
backup	Backup a datasource
connections	Displays the current number of connections running to the given node through connectors.
drain	Prevents new connections from being made to the given datasource, while ongoing connections remain untouched. If a timeout (in seconds) is given, ongoing connections will be severed only after the timeout expires.
fail	Fail a datasource
offline	Put a datasource into the offline state
online	Put a datasource into the online state
recover	Recover a datasource into an operational state as a Replica
restore	Restore a datasource from a previous backup
shun	Shun a datasource
welcome	Welcome a shunned datasource back to the cluster

Option	Description
master	Configure a replicator as a master.
offline	Set replicator offline.
online	Set replicator online.
relay	Set the replicator as a relay of the supplied service/host.
restart	Restart the replicator process.
slave	Configure replicator as a slave of the current master.
start	Start the replicator process if it is not running.
status	Show the current replicator status.
stop	Stop the replicator process if it is running.

Option	Description
`-c`	Report a critical status if the latency is above this level
`--perslave-perfdata`	Show the latency performance information on a per-Replica basis
`--perfdata`	Show the latency performance information
`-w`	Report a warning status if the latency is above this level
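
The way the `-w` and `-c` thresholds combine can be sketched as below. This is a minimal Python illustration with a hypothetical function name and status labels; it assumes the critical level is checked before the warning level and does not reproduce the actual script.

```python
def latency_status(latency: float, warning: float, critical: float) -> str:
    """Classify a latency value against the -w and -c levels."""
    if latency > critical:
        return "CRITICAL"
    if latency > warning:
        return "WARNING"
    return "OK"
```

With a warning level of 5 and a critical level of 15, a latency of 8 seconds is reported as a warning and a latency of 20 seconds as critical.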