Version End of Life: 31 July 2020
Release 5.3.0 is an important feature release that contains some key new functionality for replication. In particular:
JSON data type column extraction support for MySQL 5.7 and higher.
Generated column extraction support for MySQL 5.7 and higher.
DDL translation support for heterogeneous targets, initially supporting DDL translation from MySQL to MySQL, Vertica and Redshift targets.
Data concentration support for replication into a single target schema (with additional source schema information added to each table) for both HPE Vertica and Amazon Redshift targets.
Rebranded and updated support for Oracle extraction with the Oracle Redo Reader, including improvements to offboard deployment, more configuration options, and support for the deployment and installation of multiple offboard replication services within a single replicator.
This release also contains a number of important bug fixes and minor improvements to the product.
Improvements, new features and functionality
The way that information is logged has been improved so that it should be easier to identify and find errors and the causes of errors when looking at the logs. To achieve this, logging is now provided into an additional file, one for each component, and the new files contain only entries at the WARNING or ERROR levels. The new file is replicator-user.log. The original file, trepsvc.log, remains unchanged.
All log files have been updated to ensure that, where relevant, the service name for the corresponding entry is included. This should further help to identify and pinpoint issues by making it clearer which service triggered a particular logging event.
Issues: CT-30, CT-69
Support for Java 7 (JDK or JRE 1.7) has been deprecated, and will be removed in the 6.0.0 release. The software is compiled using Java 8 with Java 7 compatibility.
Issues: CT-252
Some JavaScript filters had DOS-style line breaks.
Issues: CT-376
Support for JSON datatypes and generated columns within MySQL 5.7 and greater has been added to the MySQL extraction component of the replicator.
Important
Due to a MySQL bug in the way that JSON and generated columns are represented within the MySQL binary log, it is possible for the size of the data and the reported size to differ, and this could cause data corruption. To account for this behavior and to prevent data inconsistencies, the replicator can be configured to either ignore, warn, or stop if the mismatch occurs.
This can be set by modifying the property replicator.extractor.dbms.json_length_mismatch_policy.
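As a sketch, assuming tpm's generic --property option is used to apply the setting, the policy could be set to warn (the service name alpha and the value warn are illustrative):
# 'alpha' and 'warn' are illustrative values
shell> tpm update alpha \
    --property=replicator.extractor.dbms.json_length_mismatch_policy=warn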
Until this problem is addressed within MySQL, tpm will still generate a warning about the issue, which can be ignored during installation by using the --skip-validation-check=MySQLGeneratedColumnCheck option. For more information on the effects of the bug, see MySQL Bug #88791.
Issues: CT-5, CT-468
The tpm command has been updated to correctly operate with CentOS 7 and higher. Due to an underlying change in the way IP configuration information was sourced, the extraction of the IP address information has been updated to use the ip addr command.
Issues: CT-35
The THL retention setting is now checked in more detail during installation. When --thl-log-retention is configured when extracting from MySQL, the value is compared to the binary log expiry setting in MySQL (expire_logs_days). If the value is less, a warning is produced to highlight the potential for loss of data.
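For example, the retention can be set at configuration time and compared against the MySQL expiry setting (a sketch; the service name alpha and the value 7d are illustrative):
# 'alpha' and '7d' are illustrative
shell> tpm configure alpha --thl-log-retention=7d
# check the MySQL binary log expiry for comparison
mysql> SHOW VARIABLES LIKE 'expire_logs_days';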
Issues: CT-91
A new option, --oracle-redo-temp-tablespace, has been added to configure the temporary tablespace within Oracle redo reader extractor deployments.
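As a minimal sketch (the tablespace name ORARR_TEMP and the service name alpha are illustrative):
# 'alpha' and 'ORARR_TEMP' are illustrative
shell> tpm configure alpha --oracle-redo-temp-tablespace=ORARR_TEMP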
Issues: CT-321
The sizes outputs for the thl list command, such as -sizes or -sizesdetail, now additionally output summary information for the selected THL events:
Total ROW chunks: 8 with 7 updated rows (50%)
Total STATEMENT chunks: 8 with 2552 bytes (50%)
16 events processed
A new option has also been added, -sizessummary, that only outputs the summary information.
Issues: CT-433
For more information, see thl list -sizessummary Command.
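For example, to report only the summary totals for the stored THL:
shell> thl list -sizessummary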
A new option for tpm has been added, --oracle-tns-port, which is an alias for --replication-port.
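As a sketch (1521, the standard Oracle listener port, and the service name alpha are illustrative):
# '1521' and 'alpha' are illustrative
shell> tpm configure alpha --oracle-tns-port=1521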
Issues: CT-274
The fetcher and miner ports can now be explicitly set. Previously they were fixed as ports 7901 and 7902 respectively. Use the --oracle-redo-fetcher-port and --oracle-redo-miner-port options to set them.
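For example, a sketch that sets the previous defaults explicitly (the service name alpha is illustrative):
# 'alpha' is illustrative; 7901/7902 were the previous fixed ports
shell> tpm configure alpha \
    --oracle-redo-fetcher-port=7901 \
    --oracle-redo-miner-port=7902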
Issues: CT-290
The HPE Vertica applier has been updated and expanded so that data can be concentrated from multiple source schemas into a single schema, where all the source and target schemas share a common table structure. The new functionality relies on the new rowadddbname filter, and a new batch applier script that handles the concentration.
This functionality also incorporates options to keep a long-term copy of all the CDC data generated by the replicator by copying the data to a secondary set of staging tables. Both this and the core target information are configurable during installation.
Full documentation on using this feature is in preparation and will be available shortly.
Issues: CT-95
Support has now been added for full DDL replication and translation, initially from MySQL sources through to Amazon Redshift and HPE Vertica targets. The functionality allows for schemas and tables to be created, modified, and deleted without the need to use ddlscan, and without having to worry about making changes that stop replication until the structures can be changed.
The DDL translation supports the following features:
Full replication of schema and table operations.
Configurable translation of data types, including size differences.
Automatically creates staging tables for batch-based appliers.
Support for centralized and long term schema replication.
Ability to add arbitrary columns to all replicated tables.
Ability to choose whether to apply different schema operations on specific schemas or tables. The following options can be controlled:
Creating schema
Creating table
Adding columns to existing table
Deleting columns from existing table
Modifying columns in existing table
Deleting table
Deleting schema
For each operation, the operation can be applied, ignored, applied with archiving, or can stop replication with an error. In the case of archiving, a copy of the table is kept, and changes are applied only to the active table. This enables you to retain existing data and structure so that analytics can continue on a known version of the table. The naming and format of the archived table can also be set.
For operations that add or change columns, you can choose whether the value for the new column within the existing rows of the table is set to the default value or to an explicit value.
Data is automatically flushed and committed before table changes are made to ensure that replication does not stop. This process happens automatically, so replicating data, adding a column, and replicating further data does not stop replication, even if the data would normally fail because of table differences and batch applier timings.
Existing table schemas can be extracted and replicated automatically through to a target without requiring ddlscan to create the initial tables.
Full documentation on using this feature is in preparation and will be available shortly.
Issues: CT-131, CT-132
The JavaScript files used for applying data into batch targets (Redshift, Hadoop, Vertica) have been updated and improved to ensure:
Field names are correctly escaped
Error messages now contain more information about the problem
Where relevant, the host database errors and CSV files are now kept in the event of an error to help identify the underlying problem.
These changes should make it easier to identify issues, and to prevent certain issues occurring during replication.
Issues: CT-96, CT-235
The CSV writer module which is used in all batch-related appliers (Redshift, Hadoop, Vertica) has been updated so that it provides more information about the potential problem when a CSV write is identified as invalid.
Issues: CT-236
Support for replicating into Hadoop environments where the underlying filesystem is protected by Kerberos security and authentication has been added to the Hadoop applier. A new file, hadoop_kerberos.js, has been added to the distribution, which should be edited and used in place of the normal hadoop.js batch file.
Issues: CT-266
For more information, see Replicating into Kerberos Secured HDFS.
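As a sketch, assuming the batch load template option is used to select the new script (the option name --batch-load-template and the service name alpha are assumptions for illustration):
# '--batch-load-template' and 'alpha' are assumed for illustration
shell> tpm configure alpha --batch-load-template=hadoop_kerberos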
The Amazon Redshift applier has been updated and expanded so that data can be concentrated from multiple source schemas into a single schema, where all the source and target schemas share a common table structure. The new functionality relies on the new rowadddbname filter, and a new batch applier script that handles the concentration.
Full documentation on using this feature is in preparation and will be available shortly.
Issues: CT-408
A new filter, rowadddbname, has been added to the replicator. This filter adds the incoming schema name, and an optional numeric hash value of the schema, to every row of THL row-based changes. The filter is designed to be used with heterogeneous and analytics applications where data is being concentrated into a single schema and where the source schema name will be lost during the concentration and replication process.
In particular, it is designed to work in harmony with the new Redshift and Vertica based single-schema appliers where data from multiple, identical, schemas are written into a single target schema for analysis.
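A minimal sketch of enabling the filter on the extractor (the option name --svc-extractor-filters and the service name alpha are assumptions for illustration):
# '--svc-extractor-filters' and 'alpha' are assumed for illustration
shell> tpm configure alpha --svc-extractor-filters=rowadddbname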
Issues: CT-98
A new filter has been added, rowadddbname, which adds the source database name and an optional database hash to every incoming row of data. This can be used to help identify source information when concentrating information into a single schema.
Issues: CT-407
An issue has been identified with the way certain operating systems now configure their open files limits, which can upset the checks within tpm that determine the open files limits configured for MySQL. To ensure that the open files limit has been set correctly, check the configuration of the service:
Copy the system configuration:
shell> sudo cp /lib/systemd/system/mysql.service /etc/systemd/system/
shell> sudo vim /etc/systemd/system/mysql.service
Add the following line to the end of the copied file:
LimitNOFILE=infinity
Reload the systemctl daemon:
shell> sudo systemctl daemon-reload
Restart MySQL:
shell> service mysql restart
That configures everything properly, and MySQL should now take note of the open_files_limit configuration option.
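To confirm the change has taken effect, the value visible to MySQL can be checked directly:
mysql> SHOW VARIABLES LIKE 'open_files_limit';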
Issues: CT-148
The check to determine whether triggers had been enabled within the MySQL data source would not be executed correctly, meaning that warnings about unsupported triggers would not be raised.
Issues: CT-185
When using tpm diag on a MySQL deployment, the MySQL error log would not be identified and included properly if the default datadir option was not /var/lib/mysql.
Issues: CT-359
Installation with SSL security enabled could fail intermittently because the certificates would fail to be copied to the required directory during the installation process.
Issues: CT-402
The Net::SSH libraries used by tpm have been updated to reflect the deprecation of the paranoid parameter.
Issues: CT-426
Using a complex password, particularly one with single or double quotes, when specifying a password for tpm could cause checks and the installation to raise errors or fail, although the actual configuration would work properly. The problem was limited to internal checks by tpm.
Issues: CT-440
The startall command would fail to correctly start the Oracle redo reader process.
Issues: CT-283
The tpm command would fail to remove the Oracle redo reader user when using tpm uninstall.
Issues: CT-299
The replicator stop command would not stop the Oracle redo reader process.
Issues: CT-300
Within Vertica deployments, the internal identity of the applier was set incorrectly to PostgreSQL. This would make it difficult for certain internal processes to identify the true datasource type. The setting did not affect the actual operation.
Issues: CT-452
Oracle deployments have been updated so that the replicator is always running in UTF-8 and the NLS_LANG setting is set correctly. This will primarily affect CDC and Oracle applier deployments.
Issues: CT-251
The ddlscan templates for Oracle to MySQL would incorrectly map NUMBER types into DECIMAL with an invalid size definition. This has been updated so that anything larger than a 19-digit NUMBER is mapped to a MySQL BIGINT.
Issues: CT-259
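As a usage sketch (the connection details and schema name are illustrative; ddl-oracle-mysql.vm is assumed to be the relevant template name):
# connection details and 'SALES' are illustrative; template name assumed
shell> ddlscan -user tungsten -pass secret \
    -url jdbc:oracle:thin:@//dbora1:1521/ORCL \
    -template ddl-oracle-mysql.vm -db SALES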
The Oracle redo reader component has been rebranded to Continuent, Ltd, and changed internally to be identified as simply 'oracle redo reader'. This has changed the following elements within the product:
All components and references to vmrr and vmrrd have been changed to orarr and orarrd respectively.
All tpm options that contain vmware have been replaced with oracle, including:
install-vmware-redo-reader, which is now install-oracle-redo-reader
repl-install-vmware-redo-reader, which is now repl-install-oracle-redo-reader
All internal references, including the configuration parameters for the redo reader, have been updated to use orarr.
The default username and password used with the redo reader have changed from vmrruser to orarruser, and vmrruserpwd to orarruserpwd.
The template files used to configure the redo reader have been changed from vmrr_response_file to orarr_response_file, and offboard_vmrr_response_file to offboard_orarr_response_file.
The vmrrd_wrapper has been renamed to orarrd_wrapper.
Issues: CT-19, CT-282, CT-367
When running the orarrd command to execute the console, the command would fail and report:
tungsten@dbora1 alpha$ orarrd_alpha console
orarr is already started
Issues: CT-397
The orarrd script contained incorrect environment variables for testing the validity of the installation. This could cause access to the Redo Reader console to fail.
Issues: CT-401