This filter was introduced in version 7.1.4
The maskdata
filter enables
values in the THL to be obsfucated. This can be useful when replicating
Personal Identification Information, such as email addresses, phone
number, personal identification numbers and others within the THL but
need to be masked on the Target.
Use of this filter should be used with care. The strings generated are purely random and cannot be reverse engineered to the original string, additionally, the same random string will not be generated for the same input, for example "Continuent" may mask to "GuJKUgwnTG" on the first pass and "nnD57Gwby" on the second pass, therefore you MUST not use this filter to mask columns that are part of UNIQUE, PRIMARY or FOREIGN key constraints.
If you choose to set a value to NULL, then you must ensure the target column allows NULL values otherwise the replicator will error.
Pre-configured filter name |
maskdata
| ||
JavaScript Filter File |
tungsten-replicator/support/filters-javascript/maskdata.js
| ||
Property prefix |
replicator.filter.maskdata
| ||
Stage compatibility | All | ||
tpm Option compatibility |
--svc-extractor-filters ,
--svc-applier-filters
| ||
Data compatibility | ROW events only | ||
Parameters | |||
Parameter | Type | Default | Description |
definitionsFile
| Filename |
/opt/continuent/share/maskdata.json
| Location of the definitions file for masking columns |
The filter is available by default as
maskdata
, and the filter is
configured through a JSON file that defines the list of columns to be
masked. The filter relies on the
colnames filter being
enabled and will only work woth ROW based replication.
To enable the filter:
shell> tpm update --svc-extractor-filters=colnames,maskdata \
--property=replicator.filter.maskdata.definitionsFile=/opt/continuent/share/maskdata.json
A sample configuration file is provided in
tungsten-replicator/support/filters-config/maskdata.json
.
The JSON options vary depending on the masking required, and are outlined below along with an example:
[ { "schema": "hr", "table": "employees", "columns": [ { "column": "email", "format": "email", "method": 3 }, { "column": "phone_number", "format": "telno" }, { "column": "first_name", "format": "setnull" }, { "column": "last_name", "format": "alphastring", "special": "true", "cast": "upper" } ] } ]
Where:
schema
specifies the name of the
schema on which to apply the filtering. If
*
is given, all schemas are
matched.
table
specifies the name of the
table on which to apply the filtering. If
*
is given, all tables are
matched.
columns
is an array of column
names to be matched.
Format can be set to one of the following:
email
: This will generate a random string in an xxxx@xxxxx.xxx format. Both prefix
and domain, when masked, will match the original string length. With this format you
can specify 3 different methods using the method
option as follows:
1
: Will randomize the first part of the email address, but maintain the original domain.
For example, "j.doe@mydomain.co.uk" could mask as "HjnjK@mydomain.co.uk"
2
: Will randomize the domain part of the email address, but maintain the original prefix.
For example, "j.doe@mydomain.co.uk" could mask as "j.doe@GhRHMieD.com"
3
: Will randomize all parts of the email address.
For example, "j.doe@mydomain.co.uk" could mask as "HjnjK@GhRHMieD.com"
The cast
option can also be supplied to this option. See below for examples.
telno
or numericstring
: This will generate a random string of numbers.
The string length will match the original string.
alphastring
: This will generate a random string of Alpha characters in a mixed case. The
masked string length will match the original string length.
The cast
and special
options can also be supplied to this option. See below for examples.
alphanumericstring
: This will generate a random string of AlphaNumeric characters in a mixed case. The
masked string length will match the original string length.
The cast
and special
options can also be supplied to this option. See below for examples.
setnull
: This will set the value of the column to NULL
The use of cast
will control how the masked string is returned, this can be set to one of the following:
upper
: Return all UPPERCASE characters
lower
: Return all lowercase characters
init
: Returns all Lowercase with an Initial uppercase character
If not supplied, the generated string will be a random mix of upper and lower case.
The use of special
will also control the inclusion of special characters into the generate string. Set to true
to enable. Disabled by default.