Channels and Parallel Apply
Parallel apply works by using multiple threads for the final stage of the
replication pipeline. These threads are known as channels. Restart points
for each channel are stored as individual rows in the table
trep_commit_seqno when you apply to a relational DBMS server, including
MySQL, Oracle, and data warehouse products such as Vertica.
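The one-row-per-channel idea can be sketched with a toy table. This is a minimal model only: it uses an in-memory SQLite database, and the two columns shown (task_id for the channel and seqno for its last committed position) are a simplifying assumption, not the full schema that the replicator maintains. The helper names are hypothetical.

```python
import sqlite3


def init_restart_points(conn, channels):
    """Create a toy commit-position table holding one row per channel."""
    # Assumed, simplified layout: the real table carries more columns.
    conn.execute(
        "CREATE TABLE trep_commit_seqno ("
        "task_id INTEGER PRIMARY KEY, seqno INTEGER)"
    )
    conn.executemany(
        "INSERT INTO trep_commit_seqno (task_id, seqno) VALUES (?, ?)",
        [(task_id, -1) for task_id in range(channels)],
    )


def record_commit(conn, task_id, seqno):
    """Store the last position committed by a single channel."""
    conn.execute(
        "UPDATE trep_commit_seqno SET seqno = ? WHERE task_id = ?",
        (seqno, task_id),
    )


def restart_points(conn):
    """Return a {channel: position} mapping, one entry per row."""
    cursor = conn.execute(
        "SELECT task_id, seqno FROM trep_commit_seqno ORDER BY task_id"
    )
    return dict(cursor)


conn = sqlite3.connect(":memory:")
init_restart_points(conn, 3)
record_commit(conn, 0, 105)
record_commit(conn, 2, 98)
print(restart_points(conn))  # {0: 105, 1: -1, 2: 98}
```

Because each channel advances independently, the rows can hold different positions at any moment, which is why the channel count cannot change safely while those positions differ.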
When you set the --channels argument, the tpm program configures the
replication service to
enable the requested number of channels. A value of 1 results in
Do not change the number of channels without setting the replicator
offline cleanly. See the procedure later in this page for more
How Many Channels Are Enough?
Pick the smallest number of channels that loads the slave fully. For
evenly distributed workloads, this means increasing the number of channels
so that more threads apply updates simultaneously and soak up the
available I/O capacity. This approach works well as long as each shard
receives roughly the same number of updates.
For unevenly distributed workloads, you may want to decrease the number of
channels so that the workload is spread more evenly across them. This
ensures that each
channel has productive work and minimizes the overhead of updating the
channel position in the DBMS.
Once you have maximized I/O on the DBMS server, leave the number of
channels alone. Adding more channels than you have shards does not help
performance, because it leads to idle channels that must still update
their positions in the DBMS even though they are not doing useful work.
This slightly reduces performance.
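The relationship between shards and channels can be illustrated with a small model. This sketch assumes shards are assigned to channels round-robin, which is a simplification chosen for illustration and not necessarily how the replicator maps shards to channels. The function names and the update counts are hypothetical.

```python
def channel_loads(shard_updates, channels):
    """Sum the updates handled by each channel.

    Shards are assigned round-robin (an assumption for this sketch).
    """
    loads = [0] * channels
    for index, updates in enumerate(shard_updates):
        loads[index % channels] += updates
    return loads


def idle_channels(shard_updates, channels):
    """Count channels that receive no updates at all."""
    return sum(1 for load in channel_loads(shard_updates, channels) if load == 0)


# Four shards with roughly equal update counts (made-up numbers).
shard_updates = [500, 480, 510, 495]
for channels in (2, 4, 8):
    print(channels, channel_loads(shard_updates, channels),
          idle_channels(shard_updates, channels))
# 2 [1010, 975] 0
# 4 [500, 480, 510, 495] 0
# 8 [500, 480, 510, 495, 0, 0, 0, 0] 4
```

With eight channels and four shards, half the channels carry no load in this model, yet each would still have a position to maintain in the DBMS.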
Effect of Channels on Backups
If you back up a slave that operates with more than one channel, say 30,
you can only restore that backup on another slave that operates with the
same number of channels. Otherwise, reloading the backup is the same as
changing the number of channels without a clean offline.
When operating Tungsten Replicator in a Tungsten cluster, you should always
set the number of channels to be the same for all replicators. Otherwise,
you may run into problems if you try to restore backups across MySQL
instances whose replicators run with different numbers of channels.
If the replicator has only a single channel enabled, you can restore the
backup anywhere. The same applies if you run the backup after the
replicator has been taken offline cleanly.
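The restore rules in this section can be summarized as a small predicate. This is a sketch of the rules as stated here, not a tool shipped with the product; the function name and parameters are hypothetical.

```python
def restore_is_safe(backup_channels, target_channels, clean_offline=False):
    """Return True if a backup can be restored on the target slave.

    backup_channels: channels in use on the slave when the backup was taken
    target_channels: channels configured on the slave receiving the backup
    clean_offline:   True if the backup was taken after a clean offline
    """
    if backup_channels < 1 or target_channels < 1:
        raise ValueError("channel counts must be at least 1")
    return (
        clean_offline                           # backup taken after a clean offline
        or backup_channels == 1                 # single-channel backups restore anywhere
        or backup_channels == target_channels   # matching channel counts
    )


print(restore_is_safe(30, 30))                      # True
print(restore_is_safe(30, 10))                      # False
print(restore_is_safe(1, 10))                       # True
print(restore_is_safe(30, 10, clean_offline=True))  # True
```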