11.1. Enabling/Disabling Filters

A number of standard filter configurations are created and defined by default within the static properties file for the Tungsten Replicator configuration.

Filters can be enabled through tpm to update the filter configuration

  • repl-svc-extractor-filters

    Apply the filter during the extraction stage, i.e. when the information is extracted from the binary log and written to the internal queue (binlog-to-q).

  • repl-svc-thl-filters

    Apply the filter between the internal queue and when the transactions are written to the THL on the Extractor. (q-to-thl).

  • repl-svc-remote-filters

    Apply the filter between reading from the remote THL server and writing to the local THL files on the Applier (remote-to-thl).

  • repl-svc-applier-filters

    Apply the filter between reading from the internal queue and applying to the destination database (q-to-dbms).

Properties and options for an individual filter can be specified by setting the corresponding property value on the tpm command-line.

For example, to ignore a database schema on a Applier, the replicate filter can be enabled, and the replicator.filter.replicate.ignore specifies the name of the schemas to be ignored. To ignore the schema contacts:

Update your /etc/tungsten/tungsten.ini as shown in the example below, followed by issuing tpm update:

shell> vi /etc/tungsten/tungsten.ini

[servicename]
...
repl-svc-applier-filters=replicate
property=replicator.filter.replicate.ignore=contacts
...

shell> tpm update

A bad filter configuration will not stop the replicator from starting, but the replicator will be placed into the OFFLINE state.

To disable a previously enabled filter, remove (or comment out) the values from you /etc/tungsten/tungsten.ini file, and issue tpm update

Multiple filters can be applied on any stage, and the filters will be processed and called within the order defined within the configuration. For example, with the following configuration:

[servicename]
...
repl-svc-applier-filters=enumtostring,settostring,pkey
...

The filters are called in order:

The order and sequence can be important if operations are being performed on the data and they are relied on later in the stage. For example, if data is being filtered by a value that exists in a SET column within the source data, the settostring filter must be defined before the data is filtered, otherwise the actual string value will not be identified.

Warning

In some cases, the filter order and sequence can also introduce errors. For example, when using the pkey filter and the optimizeupdates filters together, pkey may remove KEY information from the THL before optimizeupdates attempts to optimize the ROW event, causing the filter to raise a failure condition.

The currently active filters can be determined by using the trepctl status -name stages command:

shell> trepctl status -name stages
Processing status command (stages)...
...
NAME                 VALUE
----                 -----
applier.class      : com.continuent.tungsten.replicator.applier.MySQLDrizzleApplier
applier.name       : dbms
blockCommitRowCount: 10
committedMinSeqno  : 3600
extractor.class    : com.continuent.tungsten.replicator.thl.THLParallelQueueExtractor
extractor.name     : parallel-q-extractor
filter.0.class     : com.continuent.tungsten.replicator.filter.MySQLSessionSupportFilter
filter.0.name      : mysqlsessions
filter.1.class     : com.continuent.tungsten.replicator.filter.PrimaryKeyFilter
filter.1.name      : pkey
filter.2.class     : com.continuent.tungsten.replicator.filter.BidiRemoteSlaveFilter
filter.2.name      : bidiSlave
name               : q-to-dbms
processedMinSeqno  : -1
taskCount          : 5
Finished status command (stages)...

The above output is from a standard Applier replication installation showing the default filters enabled. The filter order can be determined by the number against each filter definition.