12.4.26. maskdata.js Filter

Note

This filter was introduced in version 7.1.4

The maskdata filter enables values in the THL to be obsfucated. This can be useful when replicating Personal Identification Information, such as email addresses, phone number, personal identification numbers and others within the THL but need to be masked on the Target.

Warning

Use of this filter should be used with care. The strings generated are purely random and cannot be reverse engineered to the original string, additionally, the same random string will not be generated for the same input, for example "Continuent" may mask to "GuJKUgwnTG" on the first pass and "nnD57Gwby" on the second pass, therefore you MUST not use this filter to mask columns that are part of UNIQUE, PRIMARY or FOREIGN key constraints.

If you choose to set a value to NULL, then you must ensure the target column allows NULL values otherwise the replicator will error.

Pre-configured filter name maskdata
JavaScript Filter File tungsten-replicator/support/filters-javascript/maskdata.js
Property prefix replicator.filter.maskdata
Stage compatibility All
tpm Option compatibility --svc-extractor-filters, --svc-applier-filters
Data compatibility ROW events only
Parameters
Parameter Type Default Description
definitionsFile Filename /opt/continuent/share/maskdata.json Location of the definitions file for masking columns

The filter is available by default as maskdata, and the filter is configured through a JSON file that defines the list of columns to be masked. The filter relies on the colnames filter being enabled and will only work woth ROW based replication.

To enable the filter:

shell> tpm update --svc-extractor-filters=colnames,maskdata \
    --property=replicator.filter.maskdata.definitionsFile=/opt/continuent/share/maskdata.json

A sample configuration file is provided in tungsten-replicator/support/filters-config/maskdata.json. The JSON options vary depending on the masking required, and are outlined below along with an example:

[
  {
    "schema": "hr",
    "table": "employees",
    "columns": [
    {
      "column": "email",
      "format": "email",
      "method": 3
    },
    {
      "column": "phone_number",
      "format": "telno"
    },
    {
      "column": "first_name",
      "format": "setnull"
    },
    {
      "column": "last_name",
      "format": "alphastring",
      "special": "true",
      "cast": "upper"
    }
    ]
  }
]

Where:

  • schema specifies the name of the schema on which to apply the filtering. If * is given, all schemas are matched.

  • table specifies the name of the table on which to apply the filtering. If * is given, all tables are matched.

  • columns is an array of column names to be matched.

Format can be set to one of the following:

  • email : This will generate a random string in an xxxx@xxxxx.xxx format. Both prefix and domain, when masked, will match the original string length. With this format you can specify 3 different methods using the method option as follows:

    • 1 : Will randomize the first part of the email address, but maintain the original domain. For example, "j.doe@mydomain.co.uk" could mask as "HjnjK@mydomain.co.uk"

    • 2 : Will randomize the domain part of the email address, but maintain the original prefix. For example, "j.doe@mydomain.co.uk" could mask as "j.doe@GhRHMieD.com"

    • 3 : Will randomize all parts of the email address. For example, "j.doe@mydomain.co.uk" could mask as "HjnjK@GhRHMieD.com"

    The cast option can also be supplied to this option. See below for examples.

  • telno or numericstring : This will generate a random string of numbers. The string length will match the original string.

  • alphastring : This will generate a random string of Alpha characters in a mixed case. The masked string length will match the original string length.

    The cast and special options can also be supplied to this option. See below for examples.

  • alphanumericstring : This will generate a random string of AlphaNumeric characters in a mixed case. The masked string length will match the original string length.

    The cast and special options can also be supplied to this option. See below for examples.

  • setnull : This will set the value of the column to NULL

The use of cast will control how the masked string is returned, this can be set to one of the following:

  • upper : Return all UPPERCASE characters

  • lower : Return all lowercase characters

  • init : Returns all Lowercase with an Initial uppercase character

If not supplied, the generated string will be a random mix of upper and lower case.

The use of special will also control the inclusion of special characters into the generate string. Set to true to enable. Disabled by default.