8.13. Configuring Parallel Replication

By default, MySQL applies the replication stream using a single thread. Tungsten Replicator can instead apply the replication stream in parallel. This increases the rate at which the slave database is updated and helps to reduce slave lag behind the master, which can affect application performance. Parallel replication works by applying events from different database schemas in parallel on the slave. All events within one schema are applied in sequence, but events in different schemas can be applied in parallel. Parallel replication does not help when transactions operate across schema boundaries.

Parallel replication supports two primary options:

  • Number of parallel channels: the maximum number of parallel operations that will be performed at any one time. Ideally, the number of channels matches the number of schemas in the source database, although configuring too many channels can exhaust system resources. If there are fewer channels than schemas, events are applied in a round-robin fashion using the next available channel.

  • Parallelization type: the type of parallelization to be employed. The disk method is the recommended solution.
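
The round-robin behaviour described for the channel count can be pictured with a small shell sketch. It only models the idea of cycling schemas across a fixed number of channels; it is not Tungsten Replicator's internal partitioning code, and the function name assign_channels and the schema names are hypothetical.

```shell
#!/usr/bin/env bash
# Illustrative model only: cycles schemas across a fixed number of channels.
# It does not reproduce Tungsten Replicator's own partitioning logic.

# assign_channels CHANNELS SCHEMA...
# Prints "schema -> channel N" for each schema, wrapping around when the
# number of schemas exceeds the number of channels.
assign_channels() {
    channels=$1
    shift
    i=0
    for schema in "$@"; do
        echo "$schema -> channel $(( i % channels ))"
        i=$(( i + 1 ))
    done
}

# Hypothetical example: three schemas sharing two channels.
assign_channels 2 sales hr inventory
```

In this example the third schema shares channel 0 with the first, so events for those two schemas would be applied one after another rather than in parallel.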

Parallel replication can be enabled during installation by setting the appropriate options in the initial configuration. To enable parallel replication after installation, configure each host as follows:

  1. Put the replicator offline:

    shell> trepctl offline

  2. Reconfigure the replication service to set the number of channels and the parallelization type:

    shell> tpm update firstrep --host=host2 \
        --channels=5 --svc-parallelization-type=disk

  3. Restart the replicator to apply the configuration:

    shell> replicator restart
    Stopping Tungsten Replicator Service...
    Stopped Tungsten Replicator Service.
    Starting Tungsten Replicator Service...
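
When several hosts need the same change, the three steps above could be wrapped in a small script. The following is a minimal sketch that reuses the commands exactly as shown in the steps; the names DRY_RUN, run and enable_parallel are hypothetical, and by default the script only prints the commands rather than executing them.

```shell
#!/usr/bin/env bash
# Sketch: run the three reconfiguration steps in order on the local host.
# DRY_RUN, run and enable_parallel are hypothetical names for this example.

DRY_RUN=${DRY_RUN:-1}   # 1 = print commands only, 0 = execute them

# run COMMAND [ARGS...]: print or execute a command depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# enable_parallel SERVICE HOST CHANNELS
enable_parallel() {
    service=$1
    host=$2
    channels=$3
    run trepctl offline
    run tpm update "$service" --host="$host" \
        --channels="$channels" --svc-parallelization-type=disk
    run replicator restart
}

# Values taken from the example steps above.
enable_parallel firstrep host2 5
```

Setting DRY_RUN=0 in the environment makes the script execute the commands; reviewing the printed commands first is the safer default.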

The current configuration can be confirmed by checking the channels configured in the status information:

shell> trepctl status
Processing status command...
NAME                     VALUE
----                     -----
appliedLastEventId     : mysql-bin.000005:0000000000004263;0
appliedLastSeqno       : 1416
appliedLatency         : 1.0
channels               : 5
...
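
For scripted checks, the channels value can be pulled out of the status output with awk. This is a sketch that assumes the "NAME : VALUE" layout shown above; the function name channels_from_status is hypothetical, and the sample input is copied from the output above so the block runs without a replicator.

```shell
#!/usr/bin/env bash
# channels_from_status: read `trepctl status` output on stdin and print
# the value of the "channels" field (assumes the "NAME : VALUE" layout).
channels_from_status() {
    awk -F' *: *' '$1 == "channels" { sub(/[ \t\r]+$/, "", $2); print $2 }'
}

# Sample lines copied from the status output above.
actual=$(printf '%s\n' \
    'appliedLastSeqno       : 1416' \
    'appliedLatency         : 1.0' \
    'channels               : 5' | channels_from_status)

expected=5
if [ "$actual" = "$expected" ]; then
    echo "channels OK: $actual"
else
    echo "channels mismatch: expected $expected, got $actual"
fi
```

Against a live replicator, the same function would be used as: trepctl status | channels_from_status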

More detailed information can be obtained by using the trepctl status -name stores command, which provides information for each of the parallel replication queues:

shell> trepctl status -name stores
Processing status command (stores)...
NAME                      VALUE
----                      -----
activeSeqno             : 0
doChecksum              : false
flushIntervalMillis     : 0
fsyncOnFlush            : false
logConnectionTimeout    : 28800
logDir                  : /opt/continuent/thl/firstrep
logFileRetainMillis     : 604800000
logFileSize             : 100000000
maximumStoredSeqNo      : 1416
minimumStoredSeqNo      : 0
name                    : thl
readOnly                : false
storeClass              : com.continuent.tungsten.replicator.thl.THL
timeoutMillis           : 2147483647
NAME                      VALUE
----                      -----
criticalPartition       : -1
discardCount            : 0
estimatedOfflineInterval: 0.0
eventCount              : 0
headSeqno               : -1
intervalGuard           : AtomicIntervalGuard (array is empty)
maxDelayInterval        : 60
maxOfflineInterval      : 5
maxSize                 : 10
name                    : parallel-queue
queues                  : 5
serializationCount      : 0
serialized              : false
stopRequested           : false
store.0                 : THLParallelReadTask task_id=0 thread_name=store-thl-0 »
    hi_seqno=0 lo_seqno=0 read=0 accepted=0 discarded=0 events=0
store.1                 : THLParallelReadTask task_id=1 thread_name=store-thl-1 »
    hi_seqno=0 lo_seqno=0 read=0 accepted=0 discarded=0 events=0
store.2                 : THLParallelReadTask task_id=2 thread_name=store-thl-2 »
    hi_seqno=0 lo_seqno=0 read=0 accepted=0 discarded=0 events=0
store.3                 : THLParallelReadTask task_id=3 thread_name=store-thl-3 »
    hi_seqno=0 lo_seqno=0 read=0 accepted=0 discarded=0 events=0
store.4                 : THLParallelReadTask task_id=4 thread_name=store-thl-4 »
    hi_seqno=0 lo_seqno=0 read=0 accepted=0 discarded=0 events=0
storeClass              : com.continuent.tungsten.replicator.thl.THLParallelQueue
syncInterval            : 10000
Finished status command (stores)...
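
One quick consistency check on this output is to compare the queues value with the number of store.N entries. The sketch below assumes the "NAME : VALUE" layout shown above; the function name queue_summary is hypothetical, and the sample input is abbreviated from the output above.

```shell
#!/usr/bin/env bash
# queue_summary: read `trepctl status -name stores` output on stdin and
# report the configured queue count alongside the number of store.N entries.
queue_summary() {
    awk -F' *: *' '
        $1 == "queues"            { queues = $2 + 0 }
        $1 ~ /^store\.[0-9]+$/    { tasks++ }
        END { printf "queues=%d read_tasks=%d\n", queues, tasks }
    '
}

# Abbreviated sample based on the output above (two of the five entries).
printf '%s\n' \
    'name                    : parallel-queue' \
    'queues                  : 5' \
    'store.0                 : THLParallelReadTask task_id=0 thread_name=store-thl-0' \
    'store.1                 : THLParallelReadTask task_id=1 thread_name=store-thl-1' \
    'storeClass              : com.continuent.tungsten.replicator.thl.THLParallelQueue' \
    | queue_summary
```

With the full output above, both numbers would be expected to equal the configured channel count.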

To examine the individual threads in parallel replication, use the trepctl status -name shards command, which provides information for each shard thread:

shell> trepctl status -name shards
Processing status command (shards)...
NAME                VALUE
----                -----
appliedLastEventId: mysql-bin.000005:0000000013416909;0
appliedLastSeqno  : 1432
appliedLatency    : 0.0
eventCount        : 28
shardId           : cheffy
stage             : q-to-dbms
...
Finished status command (shards)...
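
To compare latency across shards, the per-shard fields can be reduced to one line per shard. This sketch assumes that appliedLatency appears before shardId within each shard's block, as in the sample above; the function name shard_latency is hypothetical.

```shell
#!/usr/bin/env bash
# shard_latency: read `trepctl status -name shards` output on stdin and
# print "<shardId> <appliedLatency>" for each shard.
# Assumes appliedLatency precedes shardId in each block, as in the sample.
shard_latency() {
    awk -F' *: *' '
        $1 == "appliedLatency" { latency = $2 }
        $1 == "shardId"        { print $2, latency }
    '
}

# Sample lines copied from the shards output above.
printf '%s\n' \
    'appliedLastSeqno  : 1432' \
    'appliedLatency    : 0.0' \
    'eventCount        : 28' \
    'shardId           : cheffy' \
    'stage             : q-to-dbms' | shard_latency
```

Against a live replicator, the same function would be used as: trepctl status -name shards | shard_latency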