8.3. Checking Replication Status

To check the replication status, use the trepctl command. It accepts a number of verbs that provide status and control information for your configured cluster. The basic format of the command is:

shell> trepctl [-host hostname] command

The -host option is optional; it enables you to check the status of a host other than the current node.

To get basic information about the services currently configured on a node and their current status, use the services command:

shell> trepctl services
Processing services command...
NAME              VALUE
----              -----
appliedLastSeqno: 211
appliedLatency  : 17.66
role            : slave
serviceName     : firstrep
serviceType     : local
started         : true
state           : ONLINE
Finished services command...

In the above example, the output shows the last sequence number and latency of the host, in this case a slave, compared to the master from which it is processing information. Here, the last sequence number is 211, and the latency between that sequence being processed on the master and applied to the slave is 17.66 seconds. You can compare this information to that provided by the master, either by logging into the master and running the same command, or by using the -host command-line option:

shell> trepctl -host host1 services
Processing services command...
NAME              VALUE
----              -----
appliedLastSeqno: 365
appliedLatency  : 0.614
role            : master
serviceName     : firstrep
serviceType     : local
started         : true
state           : ONLINE
Finished services command...

By comparing the appliedLastSeqno for the master (365) against the value on the slave (211), it is possible to determine that the slave and the master are not yet synchronized.
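The comparison described above can be scripted. The following is a minimal sketch, not part of Tungsten Replicator: get_field and seqno_lag are hypothetical helper names, and the parsing assumes the "name : value" layout shown in the sample outputs.

```shell
# Print the value of one named field from trepctl output read on stdin.
# get_field is a hypothetical helper, not a trepctl feature.
get_field() {
    awk -v name="$1" '
        {
            pos = index($0, ":")          # split at the first colon only
            if (pos == 0) next            # skip header and banner lines
            key = substr($0, 1, pos - 1)
            val = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (key == name) { print val; exit }
        }'
}

# Print how many sequence numbers the slave trails the master by.
# Arguments: master appliedLastSeqno, slave appliedLastSeqno.
seqno_lag() {
    echo $(( $1 - $2 ))
}

# Example (requires a running replicator; host1 is the master as above):
#   master=$(trepctl -host host1 services | get_field appliedLastSeqno)
#   slave=$(trepctl services | get_field appliedLastSeqno)
#   seqno_lag "$master" "$slave"
```

With the values from the outputs above (365 on the master, 211 on the slave), seqno_lag reports 154.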

For more detailed output of the current replication status, use the status command:

shell> trepctl status
Processing status command...
NAME                     VALUE
----                     -----
appliedLastEventId     : mysql-bin.000064:0000000002757461;0
appliedLastSeqno       : 212
appliedLatency         : 263.43
channels               : 1
clusterName            : default
currentEventId         : NONE
currentTimeMillis      : 1365082088916
dataServerHost         : host2
extensions             : 
latestEpochNumber      : 0
masterConnectUri       : thl://host1:2112/
masterListenUri        : thl://host2:2112/
maximumStoredSeqNo     : 724
minimumStoredSeqNo     : 0
offlineRequests        : NONE
pendingError           : NONE
pendingErrorCode       : NONE
pendingErrorEventId    : NONE
pendingErrorSeqno      : -1
pendingExceptionMessage: NONE
pipelineSource         : thl://host1:2112/
relativeLatency        : 655.915
resourcePrecedence     : 99
rmiPort                : 10000
role                   : slave
seqnoType              : java.lang.Long
serviceName            : firstrep
serviceType            : local
simpleServiceName      : firstrep
siteName               : default
sourceId               : host2
state                  : ONLINE
timeInStateSeconds     : 893.32
uptimeSeconds          : 9370.031
version                : Tungsten Replicator 4.0.7 build 696
Finished status command...

Just as trepctl defaults to the current host, it also defaults to reporting on the default service. If you have installed multiple services, you must specify the service explicitly:

shell> trepctl -service servicename status

If the service has been configured to operate on an alternative management port, this can be specified using the -port option. The default is to use port 10000.

The trepctl status command shown above was executed on the slave host, host2. Some key parameter values from the generated output:

  • appliedLastEventId

    This shows the last event from the source event stream that was applied to the database. In this case, the output shows that the source of the data was a MySQL binary log. The portion before the colon, mysql-bin.000064, is the filename of the binary log on the master. The portion after the colon is the physical location, in bytes, within the binary log file.

  • appliedLastSeqno

    The last sequence number for the transaction from the Tungsten stage that has been applied to the database. This indicates the last actual transaction information written into the slave database.

    When using parallel replication, this parameter returns the minimum applied sequence number among all the channels applying data.

  • appliedLatency

    The appliedLatency is the latency between the commit time and the time the last committed transaction reached the end of the corresponding pipeline within the replicator.

    In replicators that are operating with parallel apply, appliedLatency indicates the latency of the trailing channel. Because the parallel apply mechanism does not update all channels simultaneously, the figure shown may differ significantly from the actual latency.

  • masterConnectUri

    On a master, the value will be empty.

    On a slave, the URI of the master Tungsten Replicator from which the transaction data is being read. The value supports multiple URIs (separated by commas) for topologies with multiple masters.

  • maximumStoredSeqNo

    The maximum transaction ID that has been stored locally on the machine in the THL. Because Tungsten Replicator operates in stages, it is sometimes important to compare the sequence and latency between information being read from the source into the THL, and then from the THL into the database. You can compare this value to the appliedLastSeqno, which indicates the last sequence committed to the database. The information is provided at a resolution of milliseconds.

  • pipelineSource

    Indicates the source of the information that is written into the THL. For a master, pipelineSource is the MySQL binary log. For a slave, pipelineSource is the THL of the master.

  • relativeLatency

    The relativeLatency is the latency between now and the timestamp of the last event written into the local THL. An increasing relativeLatency indicates that the replicator may have stalled and stopped applying changes to the dataserver.

  • state

    Shows the current status for this node. In the event of a failure, the status will indicate that the node is in a state other than ONLINE. The timeInStateSeconds will indicate how long the node has been in that state, and therefore how long the node may have been down or unavailable.
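Several of the parameters described above can be combined into a one-line summary. The sketch below is illustrative only: status_summary is a hypothetical helper, not a trepctl feature, and it assumes the "name : value" layout shown in the sample status output. It reports the state, the time spent in that state, and the difference between maximumStoredSeqNo and appliedLastSeqno, that is, the sequence numbers stored in the THL but not yet applied to the database.

```shell
# Summarise `trepctl status` output read on stdin.
# status_summary is a hypothetical helper, not part of trepctl.
status_summary() {
    awk '
        {
            pos = index($0, ":")          # split at the first colon only
            if (pos == 0) next            # skip header and banner lines
            key = substr($0, 1, pos - 1)
            val = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            field[key] = val
        }
        END {
            # Sequence numbers stored in the THL but not yet applied.
            backlog = field["maximumStoredSeqNo"] - field["appliedLastSeqno"]
            printf "state=%s timeInState=%s backlog=%d\n",
                field["state"], field["timeInStateSeconds"], backlog
            # Return a non-zero exit status when the state is not ONLINE.
            exit ((field["state"] == "ONLINE") ? 0 : 1)
        }'
}

# Example (requires a running replicator):
#   trepctl status | status_summary
```

For the sample status output above (maximumStoredSeqNo 724, appliedLastSeqno 212), the summary line would read state=ONLINE timeInState=893.32 backlog=512.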

The easiest method to check the health of your cluster is to compare the current sequence numbers and latencies of each slave with those of the master. For example:

shell> trepctl -host host2 status|grep applied
appliedLastEventId     : mysql-bin.000076:0000000087725114;0
appliedLastSeqno       : 2445
appliedLatency         : 252.0
...
shell> trepctl -host host1 status|grep applied
appliedLastEventId     : mysql-bin.000076:0000000087725114;0
appliedLastSeqno       : 2445
appliedLatency         : 2.515

Note

For parallel replication and complex multi-service replication structures, there are additional parameters and information to consider when checking and confirming the health of the cluster.

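The output of the two status commands above indicates that the two hosts are up to date (both report an appliedLastSeqno of 2445), but that there is a significant latency on the slave for performing updates (252.0 seconds, compared to 2.515 seconds on the master).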
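This kind of check could be turned into a simple classification of a slave against its master. The following is a rough sketch under stated assumptions: check_slave is a hypothetical helper, not a trepctl feature, and the default threshold of 60 seconds is an arbitrary illustrative value that should be chosen to suit your own environment.

```shell
# check_slave MASTER_SEQNO SLAVE_SEQNO SLAVE_LATENCY [MAX_LATENCY]
# Hypothetical helper. MAX_LATENCY defaults to 60 seconds, which is an
# arbitrary assumption and not a value recommended by the product.
# Exit status: 0 = OK, 1 = slave behind master, 2 = in sync but slow.
check_slave() {
    awk -v m="$1" -v s="$2" -v lat="$3" -v max="${4:-60}" 'BEGIN {
        if (s + 0 < m + 0) {
            printf "BEHIND by %d\n", m - s
            exit 1
        }
        if (lat + 0 > max + 0) {
            printf "IN SYNC, HIGH LATENCY %.1fs\n", lat
            exit 2
        }
        print "OK"
    }'
}

# Example, using the values from the outputs above:
#   check_slave 2445 2445 252.0
```

With the values shown above (appliedLastSeqno 2445 on both hosts, appliedLatency 252.0 on the slave), the function reports that the slave is in sync but has a high latency.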