A number of standard filter configurations are created and defined by default within the static properties file for the Tungsten Replicator configuration.
Filters can be enabled through tpm, which updates the filter configuration. A filter can be applied at any of the following stages:
Apply the filter during the extraction stage, i.e. when the
information is extracted from the binary log and written to the
internal queue.
Apply the filter between the internal queue and when the transactions
are written to the THL on the master.
Apply the filter between reading from the remote THL server and
writing to the local THL files on the slave.
Apply the filter between reading from the internal queue and applying
to the destination database.
Properties and options for an individual filter can be specified by setting the corresponding property value on the tpm command-line.
For example, to ignore a database schema on a slave, the
replicate filter can be enabled, and the
replicator.filter.replicate.ignore property
specifies the names of the schemas to be ignored. To ignore the schema
contacts:
./tools/tpm update alpha --hosts=host1,host2,host3 \
    --repl-svc-applier-filters=replicate \
    --property=replicator.filter.replicate.ignore=contacts
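The pattern above, one filter list plus one or more properties, can be wrapped in a small helper so the command can be reviewed before it is run. This is only a sketch: build_filter_update is a hypothetical function name, not part of tpm, and it prints the command rather than executing it. It uses only the options shown in the example above.

```shell
# Hypothetical helper (not part of tpm): print a tpm update command that
# enables applier filters and sets filter properties, without running it.
# Usage: build_filter_update SERVICE HOSTS FILTERS [PROPERTY=VALUE ...]
build_filter_update() {
    service=$1
    hosts=$2
    filters=$3
    shift 3
    cmd="./tools/tpm update $service --hosts=$hosts --repl-svc-applier-filters=$filters"
    # Each remaining argument becomes one --property option.
    for prop in "$@"; do
        cmd="$cmd --property=$prop"
    done
    printf '%s\n' "$cmd"
}

# Example call (prints the command for review):
#   build_filter_update alpha host1,host2,host3 replicate \
#       replicator.filter.replicate.ignore=contacts
```

Printing the command first makes it easier to check the filter list and property names, since a mistake in either only becomes visible once the replicator has been reconfigured.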
A bad filter configuration will not stop the replicator from starting, but
the replicator will be placed into the
To disable a previously enabled filter, empty the filter specification and (optionally) unset the corresponding property or properties. For example:
./tools/tpm update alpha --hosts=host1,host2,host3 \
    --repl-svc-applier-filters= \
    --remove-property=replicator.filter.replicate.ignore
Multiple filters can be applied at any stage, and the filters are processed and called in the order defined in the configuration. For example, with the following configuration:
./tools/tpm update alpha --hosts=host1,host2,host3 \
    --repl-svc-applier-filters=enumtostring,settostring,pkey \
    --remove-property=replicator.filter.replicate.ignore
The filters are called in the order given: enumtostring, then settostring, then pkey.
The order and sequence can be important if operations are being performed
on the data and they are relied on later in the stage. For example, if
data is being filtered by a value that exists in a
SET column within the source data, the
settostring filter must be
defined before the data is filtered, otherwise the actual string value
will not be identified.
In some cases, the filter order and sequence can also introduce errors.
For example, when using the
pkey filter and the
optimizeupdates filter together,
pkey may remove
KEY information from the THL before
optimizeupdates attempts to
optimize the ROW event, causing the filter to raise a failure condition.
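Because ordering mistakes of this kind only show up at runtime, it can help to check a filter list before passing it to tpm. The following is a minimal sketch of such a check. filter_precedes is a hypothetical helper, not a tpm or Tungsten command, and which filter must come first is a decision for the operator based on the considerations above.

```shell
# Hypothetical helper: succeed (exit 0) only when FIRST appears before
# SECOND in a comma-separated filter LIST. Fails if either filter is
# missing or if the order is reversed.
# Usage: filter_precedes LIST FIRST SECOND
filter_precedes() {
    list=$1
    first=$2
    second=$3
    pos=0
    first_pos=0
    second_pos=0
    old_ifs=$IFS
    IFS=,
    # Walk the list once, recording the first position of each filter.
    for name in $list; do
        pos=$((pos + 1))
        if [ "$name" = "$first" ] && [ "$first_pos" -eq 0 ]; then
            first_pos=$pos
        fi
        if [ "$name" = "$second" ] && [ "$second_pos" -eq 0 ]; then
            second_pos=$pos
        fi
    done
    IFS=$old_ifs
    [ "$first_pos" -gt 0 ] && [ "$second_pos" -gt 0 ] && [ "$first_pos" -lt "$second_pos" ]
}

# Example: confirm settostring runs before pkey in a planned filter list.
#   filter_precedes "enumtostring,settostring,pkey" settostring pkey \
#       && echo "order ok"
```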
The currently active filters can be determined by using the trepctl status -name stages command:
trepctl status -name stages
Processing status command (stages)...
...
NAME                  VALUE
----                  -----
applier.class         : com.continuent.tungsten.replicator.applier.MySQLDrizzleApplier
applier.name          : dbms
blockCommitRowCount   : 10
committedMinSeqno     : 3600
extractor.class       : com.continuent.tungsten.replicator.thl.THLParallelQueueExtractor
extractor.name        : parallel-q-extractor
filter.0.class        : com.continuent.tungsten.replicator.filter.MySQLSessionSupportFilter
filter.0.name         : mysqlsessions
filter.1.class        : com.continuent.tungsten.replicator.filter.PrimaryKeyFilter
filter.1.name         : pkey
filter.2.class        : com.continuent.tungsten.replicator.filter.BidiRemoteSlaveFilter
filter.2.name         : bidiSlave
name                  : q-to-dbms
processedMinSeqno     : -1
taskCount             : 5
Finished status command (stages)...
The above output is from a standard slave replication installation showing the default filters enabled. The filter order can be determined by the number against each filter definition.
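To read the filter order without scanning the full listing, the status output can be piped through a short awk filter. This is a sketch only. list_stage_filters is a hypothetical helper name, and it assumes the layout shown above: whitespace around the colon, and the stage's name line appearing after its filter.N.name lines.

```shell
# Hypothetical helper: read `trepctl status -name stages` output on stdin
# and print one line per stage in the form "stage: filter,filter,...".
# Assumes each stage's "name" line follows its filter.N.name lines.
list_stage_filters() {
    awk '
        # Collect filter names in the order they are listed.
        $1 ~ /^filter\.[0-9]+\.name$/ {
            filters = filters (filters == "" ? "" : ",") $NF
        }
        # The stage name closes the block: print and reset.
        $1 == "name" {
            print $NF ": " filters
            filters = ""
        }
    '
}

# Example:
#   trepctl status -name stages | list_stage_filters
```

For the output shown above, this would report the q-to-dbms stage with the filters mysqlsessions, pkey, and bidiSlave, in that order.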