Tungsten Replicator 7.0.2
Release 7.0.2 contains a number of key bug fixes and improvements.
v7.0.2 was originally released on 9th Dec 2022 as build 145, and re-released on 9th Jan 2023 as build 161.
Behavior Changes (4)
The following changes may affect existing scripts and integration tools. Any scripts or environments that make use of these tools should be checked and updated for the new configuration:
Command-line Tools (3)
- Both the tungsten_get_ports and tpm report commands have been updated to use the ss OS command when the netstat OS command is unavailable or deprecated.
  Issue: CT-2007
- The check_tungsten.sh script was deprecated in release 6.1.18 and has now been removed from this release.
  Issue: CT-1939
- The various user-xxxx.log files are no longer generated.
  Issue: CT-1914
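Scripts that previously called netstat directly may want a similar fallback to the one described for tungsten_get_ports and tpm report. The following is a minimal sketch for Linux hosts, not the actual implementation used by the Tungsten tools. The function names pick_socket_tool and list_listening_tcp are hypothetical, and the sketch assumes netstat is preferred whenever it is installed.

```shell
#!/usr/bin/env bash
# Sketch only: pick_socket_tool and list_listening_tcp are hypothetical names,
# not part of the Tungsten tools.

# Print the name of the first available socket-listing tool, or "none".
pick_socket_tool() {
  if command -v netstat >/dev/null 2>&1; then
    echo "netstat"
  elif command -v ss >/dev/null 2>&1; then
    echo "ss"
  else
    echo "none"
  fi
}

# List listening TCP sockets with numeric addresses using whichever tool exists.
list_listening_tcp() {
  case "$(pick_socket_tool)" in
    netstat) netstat -ltn ;;
    ss)      ss -ltn ;;
    *)       echo "neither netstat nor ss is installed" >&2; return 1 ;;
  esac
}
```

Calling list_listening_tcp from an integration script keeps the script working on distributions that no longer ship netstat.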
Core Replicator (1)
- repl-svc-extractor-multi-frag-service-detection is now turned **ON** by default. Event shards are determined at extraction time. With fragmented events, the shard cannot be determined by reading only the first fragment; the last fragment must be checked as well. With this setting turned OFF, there is no issue for pipelines that do not need it, i.e. those with no parallel apply downstream replicas. However, because this is done at extract time, the THL contains this information, so adding or changing a replica to use parallel apply later could introduce issues.
  Note: It can be disabled if you see a performance overhead, but this should be done with caution. For Aurora<>Aurora Active/Active deployments it is essential that this property be left ON.
  Issue: CT-1959
Improvements, new features and functionality (16)
Command-line Tools (11)
- Issue: CT-1869
- Issue: CT-2012
- The tpm report command now prints the hostname and listener ports where available when using the --extra|-x option or the new --ports option.
  Issue: CT-1969
- The tpm ask command has five new variables available: dsrole and dsstate for the current datasource, trrole and trstate for the current replicator, and nodeinfo, which displays all four of the other new variables.
  Issue: CT-1944
- The tpm ask stages and tpm ask allstages commands have been added to display the Replicator stages for the current node (stages) and the stages for each role (allstages).
  Issue: CT-1943
- The tpm command calls to glob have been improved to be more strict and compliant.
  Issue: CT-1940
- The tmonitor command now accepts command-line arguments to specify the ports, and will auto-configure the ports if they have been changed via the Tungsten configuration.
  Issue: CT-1919
- Added a new log file (tungsten-replicator/log/data-drift.log) for data drift messages, i.e.:
  - an update statement was logged on the primary, but did not update any row on the replica
  - a delete statement was logged on the primary, but did not delete any row on the replica
  Issue: CT-1873
- The trepctl status command will now show the last known applied seqno and latency.
  This information is stored on disk at regular intervals (10s minimum) so as not to overload the replicator. The displayed value may therefore be slightly out of date, depending on when the status command was issued.
  By default, this feature is disabled. It can be enabled by setting the following parameter in the configuration:

      svc-applier-last-applied-write-interval=20

  This will write the current position to disk every 20 seconds. This information is also exported by the Prometheus exporter.
  If the service is online, it will display the current value (the same as appliedLastSeqno and appliedLatency):

      shell> trepctl status
      Processing status command...
      NAME                     VALUE
      ----                     -----
      appliedLastEventId     : mysql-bin.000017:0000000151329854;70
      appliedLastSeqno       : 999
      appliedLatency         : 347707.0
      ...
      lastKnownAppliedLatency: 347707.0
      lastKnownAppliedSeqno  : 999
      ...

  Issue: CT-1823
- A new -c option is now available with some trepctl commands. It can be used in conjunction with the -r option to indicate the number of times to refresh before automatically terminating. For example, the following command:

      shell> trepctl perf -r 3 -c 10

  will refresh the output every 3 seconds, 10 times.
  Issue: CT-679
- rsync is now an option in tprovision, in addition to using xtrabackup and mysqldump. To use rsync, specify -m rsync.
  By default, using rsync will provision a replica in two passes:
  - The first pass will live copy (seed) the replica from the source.
  - The second pass will quiesce the source and run the rsync again, resulting in shorter downtime than a single-pass rsync.
  Issue: CT-338
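The -r and -c semantics described above for trepctl (refresh every N seconds, stop after M refreshes) can be illustrated with a generic shell loop. This is a sketch of the behaviour only, not how trepctl implements it. The helper name refresh_n is hypothetical, and the sketch assumes no delay is needed after the final refresh.

```shell
#!/usr/bin/env bash
# Sketch only: refresh_n is a hypothetical helper, not part of trepctl.
# Usage: refresh_n INTERVAL_SECONDS COUNT COMMAND [ARGS...]
refresh_n() {
  local interval=$1 count=$2
  shift 2
  local i
  for ((i = 1; i <= count; i++)); do
    "$@"                      # run the command once per refresh
    if ((i < count)); then
      sleep "$interval"       # wait between refreshes, but not after the last
    fi
  done
}

# Example (not run here): print the date every 3 seconds, 10 times, then exit.
# refresh_n 3 10 date
```

On releases without the -c option, a wrapper like this gives monitoring scripts a bounded run instead of a refresh loop that has to be interrupted by hand.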
Core Replicator (4)
- Added a new feature that enables pausing a replicator stage for a specified amount of time.
  This will pause the given stage for 100 seconds:

      trepctl pause -stage thl-to-q -time 100

  This will pause the stage indefinitely (or until a restart, etc.). Add -y to skip the confirmation prompt.

      trepctl pause -stage thl-to-q

  For the previous two commands, running a pause command again will override the previous command.
  This will resume the paused stage (if the stage is not paused, this has no effect):

      trepctl resume -stage thl-to-q

  Note: This pause does not survive a replicator restart or a service offline/online.
  Issue: CT-1912
- Added a way to configure the maximum number of rows that can be grouped together when applying row-based events for multiple insert or delete statements.
  For these properties to take effect, you must ensure that optimize-row-events=true is either explicitly set in your configuration or not present (since it is enabled by default).
  For example, the following settings will limit the number of inserted or deleted rows applied at once to 10:

      optimize-row-events-limit-insert-rows=10
      optimize-row-events-limit-delete-rows=10

  If not specified, the default values are 50 for inserts and 100 for deletes. Note that for deletes to be optimized, the affected table MUST have a single-column PK.
  Issue: CT-1980
- A new replicator role (thl-applier) has been added to allow a replicator service to apply its locally available THL, without pulling from a remote host.
  Issue: CT-1936
- Per-service tuning of the replicator THL directory is now possible for multi-service replicator-only installs as well as for clustering. The given value should be the base directory, to which Tungsten will add the service name. For example, the following entry in the tungsten.ini:

      [alpha]
      ...
      ...
      thl-directory=/drv1/thl
      ...

  would result in the THL being placed in /drv1/thl/alpha.
  Note: Updating the THL directory is only available when tpm is called from the staging installation directory, **NOT** from the running directory.
  Issue: CT-1927
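The "base directory plus service name" rule for thl-directory can be sketched as a small shell function, which may help when sizing or pre-creating per-service volumes. This is an illustration only, not the code tpm uses. The helper name thl_dir_for_service is hypothetical, and the trailing-slash handling is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: thl_dir_for_service is a hypothetical helper that mirrors the
# "base directory + service name" rule described in the release note.
thl_dir_for_service() {
  local base=${1%/}   # drop a single trailing slash, if any (assumption)
  local service=$2
  printf '%s/%s\n' "$base" "$service"
}

thl_dir_for_service /drv1/thl alpha   # prints /drv1/thl/alpha
```

With two services, alpha and beta, both configured with the same base directory, the resulting paths would differ only in the final component.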
API (1)
- A new log file has been added for the REST API:
- service_logs/replicator-api.log
Issue: CT-1983
Bug Fixes (24)
Installation and Deployment (4)
- Fixes issues where fixed properties and filters passed to tpm in service stanzas were not being configured correctly.
  Issue: CT-1463
- The tpm install and tpm update commands now properly support the thl-port option for cross-site subservices.
  Issue: CT-1953
- The Tanuki wrapper functionality is no longer used to print the JVM version, as it was creating defunct java processes at startup. Internal code is now used instead.
  Issue: CT-1876
- Issue: CT-813
Command-line Tools (11)
- The tpm diag command now passes when the nodename defined in the tungsten.ini is the short name and DNS returns the FQDN.
  Issue: CT-1908
- Fixes an issue that prevented ddlscan from connecting to MySQL if SSL was enabled.
  Issue: CT-1808
- The tpm command checks for the existence of the mysql command-line client when installing/upgrading. The process will no longer abort with an error on non-MySQL targets such as heterogeneous replicator appliers or Active-Witness hosts.
  Note: This fix was released in Tungsten Clustering and Tungsten Replicator 7.0.2 Build 161.
  Issues: CT-1924, CT-2018
- Both TungstenAPI and tpasswd now properly update the .passwords.store.orig backup file, so that valid password changes no longer trigger a tpm update failure.
  Issue: CT-1981
- The tpm mysql command no longer aborts with an access denied error on CentOS 6.
  Issue: CT-1977
- The tpm mysql command will now gracefully handle being run on a non-database node.
  Issue: CT-1946
- Fixes an issue that prevented dsctl from reacting to MySQL if SSL was enabled.
  Issue: CT-1928
- The tpm diag command now gathers the mysql.log file when SSL is enabled in the server.
  Issue: CT-1920
- The tpm command now allows any case for section entries (e.g. [alpha_FROM_beta]) in the INI files.
  Issue: CT-1879
- The tungsten_skip_seqno command no longer fails when -i is specified, and now properly filters using --filter when there is a long error message.
  Issue: CT-1877
- Running tpm update on a standalone Replicator deployment no longer prints an uninitialized value error.
  Issue: CT-1807
Core Replicator (7)
- Fixed an issue where filtered events would trigger a needless update to the service's trep_commit_seqno table, even though the table is overwritten anyway once the last statement of the applied event is done, just prior to committing the whole block.
  Issue: CT-1931
- Fixes an issue that prevented geometry datatypes with SRID from being replicated.
  Issue: CT-1904
- Fixed a possible issue when recovering an old primary as a replica after failover with parallel apply enabled. The issue could leave the replica unable to come online and require it to be reprovisioned.
  Issue: CT-1890
- Fixes an issue that would prevent a service from going offline at a specified time (trepctl online -until-time) when parallel apply is enabled. This is a rework of CT-1243.
  Issue: CT-1684
- Fixed a parsing issue that would prevent the replicator from correctly detecting a CREATE TABLE statement with START TRANSACTION.
  Issue: CT-1987
- Fixed an issue where the replicator would hang after applying a DROP TABLE event that originally failed on the primary but was still logged in the binlog.
  Issue: CT-1973
- Fixed an issue in the applier to ensure it does not commit too often with a multi-service replicator (for example active/active, and more specifically with an AWS Aurora or unprivileged MySQL backend).
  Issue: CT-2004
Filters (1)
- Fixed an issue where the dropsqlmodes filter would fail to remove invalid SQL modes from a multi-statement event.
  Issue: CT-1993
Manager (1)
- A bug has been fixed that, in a few very rare cases, would allow replicas to continue to pull and apply THL from a failed primary whilst a failover was in the process of electing a new primary. This resulted in failovers being unable to complete fully. Whilst the new primary would be online and functioning, existing replicas in the cluster could experience errors due to THL discrepancies between the old and new primary nodes.
  Issue: CT-1986