5.5.7. Controlling Assignment of Shards to Channels

Tungsten Replicator by default assigns channels using a round robin algorithm that assigns each new shard to the next available channel. The current shard assignments are tracked in table trep_shard_channel in the Tungsten catalog schema for the replication service.

For example, if you have 2 channels enabled and Tungsten processes three different shards, you might end up with a shard assignment like the following:

foo => channel 0
bar => channel 1
foobar => channel 0

This algorithm generally gives the best results for most installations and is crash-safe, since the contents of the trep_shard_channel table persist if either the DBMS or the replicator fails.

It is possible to override the default assignment by updating the shard.list file found in the tungsten-replicator/conf directory. This file normally looks like the following:

# SHARD MAP FILE.
# This file contains shard handling rules used in the ShardListPartitioner
# class for parallel replication.  If unchanged shards will be hashed across
# available partitions.

# You can assign shards explicitly using a shard name match, where the form
# is <db>=<partition>.
#common1=0
#common2=0
#db1=1
#db2=2
#db3=3

# Default partition for shards that do not match explicit name.
# Permissible values are either a partition number or -1, in which
# case values are hashed across available partitions.  (-1 is the
# default.
#(*)=-1

# Comma-separated list of shards that require critical section to run.
# A "critical section" means that these events are single-threaded to
# ensure that all dependencies are met.
#(critical)=common1,common2

# Method for channel hash assignments.  Allowed values are round-robin and
# string-hash.
(hash-method)=round-robin

You can update the shard.list file to do three types of custom overrides.

  1. Change the hashing method for channel assignments. Round-robin uses the trep_shard_channel table. The string-hash method just hashes the shard name.

  2. Assign shards to explicit channels. Add lines of the form shard=channel to the file as shown by the commented-out entries.

  3. Define critical shards. These are shards that must be processed in serial fashion. For example if you have a sharded application that has a single global shard with reference information, you can declare the global shard to be critical. This helps avoid applications seeing out of order information.

Changes to shard.list must be made with care. The same cautions apply here as for changing the number of channels or the parallelization type. For subscription customers we strongly recommend conferring with Continuent Support before making changes.