Tungsten Replicator by default assigns channels using a round robin
algorithm that assigns each new shard to the next available channel. The
current shard assignments are tracked in table
trep_shard_channel
in the Tungsten
catalog schema for the replication service.
For example, if you have 2 channels enabled and Tungsten processes three different shards, you might end up with a shard assignment like the following:
foo => channel 0 bar => channel 1 foobar => channel 0
This algorithm generally gives the best results for most installations and
is crash-safe, since the contents of the
trep_shard_channel
table persist if
either the DBMS or the replicator fails.
It is possible to override the default assignment by updating the
shard.list
file found in the
tungsten-replicator/conf
directory. This file normally looks like the following:
# SHARD MAP FILE. # This file contains shard handling rules used in the ShardListPartitioner # class for parallel replication. If unchanged shards will be hashed across # available partitions. # You can assign shards explicitly using a shard name match, where the form # is <db>=<partition>. #common1=0 #common2=0 #db1=1 #db2=2 #db3=3 # Default partition for shards that do not match explicit name. # Permissible values are either a partition number or -1, in which # case values are hashed across available partitions. (-1 is the # default. #(*)=-1 # Comma-separated list of shards that require critical section to run. # A "critical section" means that these events are single-threaded to # ensure that all dependencies are met. #(critical)=common1,common2 # Method for channel hash assignments. Allowed values are round-robin and # string-hash. (hash-method)=round-robin
You can update the shard.list file to do three types of custom overrides.
Change the hashing method for channel assignments. Round-robin uses
the trep_shard_channel
table.
The string-hash method just hashes the shard name.
Assign shards to explicit channels. Add lines of the form
shard=channel
to the file as
shown by the commented-out entries.
Define critical shards. These are shards that must be processed in serial fashion. For example if you have a sharded application that has a single global shard with reference information, you can declare the global shard to be critical. This helps avoid applications seeing out of order information.
Changes to shard.list must be made with care. The same cautions apply here as for changing the number of channels or the parallelization type. For subscription customers we strongly recommend conferring with Continuent Support before making changes.