Continuent Documentation

10.4.29. PrimaryKey Filter

The PrimaryKey adds primary key information to row-based replication data. This is required by heterogeneous environments to ensure that the primary key is identified when updating or deleting tables. Without this information, the primary to use, for example as the document ID in a document store such as MongoDB, is generated dynamically. In addition, without this filter in place, when performing update or delete operations a full table scan is performed on the target dataserver to determine the record that must be updated.

Pre-configured filter name pkey
Classname com.continuent.tungsten.replicator.filter.PrimaryKeyFilter
Property prefix replicator.filter.pkey
Stage compatibility binlog-to-q
tpm Option compatibility --repl-svc-extractor-filters
Data compatibility Row events
Parameters
Parameter Type Default Description
user string ${replicator.global.extract.db.user} The username for the connection to the database for looking up column definitions
password string ${replicator.global.extract.db.password} The password for the connection to the database for looking up column definitions
url string jdbc:mysql:thin://${replicator.global.extract.db.host}:${replicator.global.extract.db.port}/${replicator.schema}?createDB=true JDBC URL of the database connection to use for looking up column definitions
addPkeyToInsert boolean false If set to true, primary keys are added to INSERT operations. This setting is required for batch loading
addColumnsToDeletes boolean false If set to true, full column metadata is added to DELETE operations. This setting is required for batch loading

Note

This filter is designed to be used for testing and with heterogeneous replication where the field name information can be used to construct and build target data structures.

For example, in the following THL fragment, the key information includes data for all columns, which the is the default behavior for UPDATE and DELETE operations.

SEQ# = 142 / FRAG# = 0 (last frag)
- TIME = 2013-08-01 19:31:04.0
- EPOCH# = 122
- EVENTID = mysql-bin.000012:0000000000022187;0
- SOURCEID = host31
- METADATA = [mysql_server_id=1;dbms_type=mysql;service=alpha;shard=test]
- TYPE = com.continuent.tungsten.replicator.event.ReplDBMSEvent
- OPTIONS = [foreign_key_checks = 1, unique_checks = 1]
- SQL(0) =
 - ACTION = UPDATE
 - SCHEMA = test
 - TABLE = salesadv
 - ROW# = 0
  - COL(1: id) = 2
  - COL(2: country) = 1
  - COL(3: city) = 8374
  - COL(4: salesman) = 1
  - COL(5: value) = 89000.00
  - KEY(1: id) = 2
  - KEY(2: country) = 1
  - KEY(3: city) = 8374
  - KEY(4: salesman) = 1
  - KEY(5: value) = 89000.00

When the PrimaryKey is enabled, the key information has been optimized to only contain the actual primary keys are added to the row-based THL record:

SEQ# = 142 / FRAG# = 0 (last frag)
- TIME = 2013-08-01 19:31:04.0
- EPOCH# = 122
- EVENTID = mysql-bin.000012:0000000000022187;0
- SOURCEID = host31
- METADATA = [mysql_server_id=1;dbms_type=mysql;service=alpha;shard=test]
- TYPE = com.continuent.tungsten.replicator.event.ReplDBMSEvent
- OPTIONS = [foreign_key_checks = 1, unique_checks = 1]
- SQL(0) =
 - ACTION = UPDATE
 - SCHEMA = test
 - TABLE = salesadv
 - ROW# = 0
  - COL(1: id) = 2
  - COL(2: country) = 1
  - COL(3: city) = 8374
  - COL(4: salesman) = 1
  - COL(5: value) = 89000.00
  - KEY(1: id) = 2

The final line shows the addition of the primary key id added to THL event.

The two options, addPkeyToInsert and addColumnsToDeletes add the primary key information to INSERT and DELETE operations respectively. In a heterogeneous environment, these options should be enabled to prevent full-table scans during update and deletes.