1.7. Tungsten Clustering 7.0.0 GA (29 Mar 2022)

Version End of Life. Not Yet Set

Release 7.0.0 is a major release introducing many new features, including a fully documented API. There are a large number of bug fixes and improvements in all areas of the product, and a number of key behavior changes, the most significant being that Security is now enabled by default.

Behavior Changes

The following changes have been made to Tungsten Cluster and may affect existing scripts and integration tools. Any scripts or environments that make use of these tools should be checked and updated for the new configuration:

  • Installation and Deployment

    • An RPM update now issues --replace-release

      Issues: CT-1708

  • Command-line Tools

    • The tpm diag command now uses tar czf instead of the zip command to compress the gathered files. The zip command is no longer a pre-requisite for tpm diag.

      Issues: CT-1253
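      As an illustration of the change from zip to tar czf, the following minimal sketch builds a gzip-compressed tar archive without needing a zip binary. It is not the tpm diag implementation; the function name compress_diag and both path arguments are hypothetical.

```python
import tarfile
from pathlib import Path


def compress_diag(source_dir: str, archive_path: str) -> str:
    """Pack source_dir into a gzip-compressed tar archive, like `tar czf`."""
    source = Path(source_dir)
    # "w:gz" writes the tar stream through gzip, so no zip binary is needed.
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname keeps paths inside the archive relative to the directory name.
        archive.add(str(source), arcname=source.name)
    return archive_path
```

      The resulting file can be unpacked with tar xzf on any host, which is the practical benefit of dropping the zip prerequisite.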

    • tungsten_set_position has been deprecated and is no longer available in this release. dsctl should be used instead.

      tungsten_provision_slave has been renamed to tprovision.

      Issues: CT-1302

    • The tungsten_post_process script functionality has been merged into the tpm post-process command. The tungsten_post_process script remains as a shell wrapper for tpm post-process.

      Issues: CT-1314

    • The tungsten_find_orphaned script now creates a log file every time it runs which is stored in the configured temporary directory (/tmp/ by default, tpm query values temp_directory). This is to allow for easier troubleshooting and visibility during automatic execution.

      Issues: CT-1447

    • tpm now accepts chrony as a valid time synchronization software

      Issues: CT-1462

    • The tpm diag command now uses the ss (socket status) command in place of netstat on SUSE and other operating systems that have deprecated netstat.

      Issues: CT-1483

    • The tpm diag command now gathers the /etc/os-release file when located. It also now uses the ip command on systems where ifconfig and/or route is deprecated.

      Issues: CT-1496

    • Changed output of thl purge command when no lower and upper bounds are given from 'Deleting events where' to 'Deleting all events'.

      Issues: CT-1738

  • Backup and Restore

    • --user and --password options have been added to the following scripts:

      If manager-rest-api-authentication=true (the default if not explicitly disabled), these two new options must be supplied; otherwise the scripts will fail with the following error: "ERROR >> Manager REST API authentication needed. Please specify the user name and password."

      The internal ruby api.rb module can now also handle manager-rest-api-ssl=true (the default if not explicitly disabled) and will use https instead of http to access the REST API.

      Issues: CT-1311
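      The two behaviors described above can be sketched as follows. This is a hedged illustration in Python, not the ruby api.rb code: the function names are hypothetical, and the port is a caller-supplied placeholder because the release note does not state one. Only the error text and the http/https choice come from the note itself.

```python
from typing import Optional

AUTH_ERROR = (
    "ERROR >> Manager REST API authentication needed. "
    "Please specify the user name and password."
)


def api_base_url(host: str, port: int, ssl: bool = True) -> str:
    """Return the base URL, using https when manager-rest-api-ssl is true."""
    scheme = "https" if ssl else "http"
    return f"{scheme}://{host}:{port}"


def check_credentials(user: Optional[str], password: Optional[str],
                      authentication: bool = True) -> None:
    """Exit with the documented error if credentials are required but missing."""
    if authentication and not (user and password):
        raise SystemExit(AUTH_ERROR)
```

      Both defaults are True here to mirror the "default if not explicitly disabled" wording for the two tpm settings.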

    • From v8 of xtrabackup, the --stream=tar option was removed, meaning that backups could fail when using the newer release of the Percona tools.

      In this release, the backup is now created first and then compressed in a separate step.

      Warning

      This change will increase the required disk space for backups to allow the post-backup compression to complete.

      Issues: CT-1346

    • tungsten_provision_slave has now been renamed to tprovision.

      Issues: CT-1436

    • Additional messaging has been added to the output displayed when running tprovision.

      Issues: CT-1689

  • Core Replicator

    • After extracting a MySQL STOP_EVENT, the Replicator now checks whether a new binlog file was created, in effect handling this event as a ROTATE EVENT.

      This decreases the time needed to extract a new event after a STOP_EVENT.

      Issues: CT-1349

  • Tungsten Connector

    • Optimized transaction parsing by removing SQL transaction start recognition: in-transaction state is found in server status flags.

      Issues: CT-1698

  • Tungsten Manager

    • A new logfile, console.log is now generated in the $CONTINUENT_ROOT/tungsten/tungsten-manager/logs directory which contains all output displayed via cctrl. This file will provide Continuent Support with more valuable information when assisting to diagnose support cases.

      Issues: CT-1499

  • Other Issues

    • Log files for each component now have the same date and time stamp format.

      Issues: CT-1669

  • API

    • API calls triggering configuration changes are protected by a flag, i-am-sure=true, in order to avoid unwanted, potentially dramatic, configuration changes. This applies to:

      • configuration/module/servicesmap

      • reset

      • offline

      • online

      • onhold

      • addDataService

      • addDataSource

      Issues: CT-1317
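      The guard described above can be pictured with the following minimal sketch. It models the request parameters as a plain dictionary; how the flag is actually transmitted to the API (query string, body, or otherwise) is not stated here and is an assumption, as is the function name is_allowed.

```python
PROTECTED_CALLS = {
    "configuration/module/servicesmap",
    "reset",
    "offline",
    "online",
    "onhold",
    "addDataService",
    "addDataSource",
}


def is_allowed(call: str, params: dict) -> bool:
    """Allow a protected call only when i-am-sure=true accompanies it."""
    if call not in PROTECTED_CALLS:
        return True  # calls outside the list need no confirmation
    return str(params.get("i-am-sure", "")).lower() == "true"
```

      The point of the design is that a configuration-changing call must carry an explicit confirmation, so a mistyped or replayed request is rejected rather than applied.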

Known Issue

The following issues are known within this release but are not considered critical and do not impact the operation of Tungsten Cluster. They will be addressed in a subsequent patch release.

  • Installation and Deployment

    • After starting up Tungsten components, a defunct process for each running component can be found in the process listing.

      Whilst this does not cause any issues, it could generate unnecessary alerts in customer monitoring systems.

      The cause has been identified and affects versions 7.0.0 and 7.0.1. This will be fixed in the 7.0.2 release.

      Issues: CT-1876

  • Command-line Tools

    • The check_tungsten_online command returns a Replicator offline error on active witness hosts.

      Issues: CT-1783

    • The tpm policy command may return the incorrect policy value for composite clusters.

      As a result of this known issue, the tungsten_reset_manager command will also have issues on composite clusters because it calls tpm policy.

      Issues: CT-1787

Improvements, new features and functionality

  • Installation and Deployment

    • Two new tpm options have been included as part of the new API in this release.

      The options are used for setting the API admin user credentials and are as follows:

      Issues: CT-1327

    • Support now included for MariaDB 10.3+

      Issues: CT-1276, CT-1433

    • Support has been added for Java 17 LTS

      Issues: CT-1706

  • Command-line Tools

    • A new tpm option delete-service is now available to simplify the removal of clusters and/or replicator services.

      Issues: CT-210, CT-327, CT-1275

    • Prometheus exporters mysqld_exporter and node_exporter are now included with the distribution packages.

      A new command line tool tmonitor is now available for the management and testing of external Prometheus exporters (node and mysqld), and for the testing of internal exporters (Manager, Connector and Replicator).

      Issues: CT-960

    • A new tpm option purge-thl and a new script tungsten_purge_thl have been added to allow easier and more intelligent THL purging across all nodes in a topology.

      This allows you to purge THL files based on the following rules:

      • Gather the last applied seqno from all Replica nodes and take the lowest one

      • Find the current THL file which contains that seqno, then locate the previous one

      • Construct a thl purge command to remove THL up to and including the last seqno in the previous file

      The default behavior is to display the needed commands for the admin to execute manually.

      Issues: CT-1273
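      The three rules above can be sketched as follows. This is an illustrative model rather than the tpm purge-thl code: the ThlFile structure, the file names, and the seqno ranges are hypothetical, and the generated command assumes that thl purge accepts a -high bound.

```python
from typing import List, NamedTuple, Optional


class ThlFile(NamedTuple):
    name: str
    first_seqno: int
    last_seqno: int


def purge_command(replica_seqnos: List[int], files: List[ThlFile]) -> Optional[str]:
    """Build a thl purge command from the three rules, or None if nothing can go."""
    if not replica_seqnos or not files:
        return None
    # Rule 1: the lowest applied seqno across all replicas is the safe boundary.
    lowest = min(replica_seqnos)
    ordered = sorted(files, key=lambda f: f.first_seqno)
    # Rule 2: find the file holding that seqno, then step back one file.
    for index, thl_file in enumerate(ordered):
        if thl_file.first_seqno <= lowest <= thl_file.last_seqno:
            if index == 0:
                return None  # no earlier file exists, so nothing to purge
            previous = ordered[index - 1]
            # Rule 3: purge everything through the last seqno of the previous file.
            return f"thl purge -high {previous.last_seqno}"
    return None
```

      Like the default behavior described above, the sketch only returns the command as a string and leaves execution to the administrator.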

    • tpm diag now collects routing table information via route -n, and has two new command-line arguments: --include and --groups.

      --include specifies a comma-separated list of subroutines to include. Any gather subroutine not listed will be skipped.

      --groups specifies a comma-separated list of subroutine groups to include. Any group not listed will be skipped.

      Issues: CT-1322
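      The filtering semantics of --include and --groups might be modelled as below. This is a sketch under stated assumptions: the catalog of subroutine names and groups is invented for illustration, and applying both filters together when both options are given is an assumption, since the note does not describe that combination.

```python
from typing import Dict, List, Optional, Set


def parse_list(value: Optional[str]) -> Optional[Set[str]]:
    """Split a comma-separated argument; None means the option was not given."""
    if value is None:
        return None
    return {item.strip() for item in value.split(",") if item.strip()}


def select_subroutines(catalog: Dict[str, str], include: Optional[str] = None,
                       groups: Optional[str] = None) -> List[str]:
    """Return the subroutines left after applying --include and --groups."""
    wanted_names = parse_list(include)
    wanted_groups = parse_list(groups)
    selected = []
    for name, group in catalog.items():
        if wanted_names is not None and name not in wanted_names:
            continue  # not listed in --include, so skipped
        if wanted_groups is not None and group not in wanted_groups:
            continue  # group not listed in --groups, so skipped
        selected.append(name)
    return selected
```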

    • A new sub-command has been added, tpm generate-haproxy-for-api. This read-only action will read all available INI files and dump out corresponding haproxy.cfg entries with properly incrementing ports; the composite parent will come first, followed by the composite children in alphabetical order.

      The tungsten_generate_haproxy_for_api script functionality has been merged into the tpm generate-haproxy-for-api command. The tungsten_generate_haproxy_for_api script remains as a shell wrapper for tpm generate-haproxy-for-api.

      Issues: CT-1342

    • tungsten_send_diag now supports a new command-line argument, --cleanup, which will cause the removal of the diagnostic archive file generated using the --diag argument.

      Issues: CT-1360

    • The tungsten_reset_manager command is now able to restart the Manager process when the --start or -s argument is passed in.

      Issues: CT-1401

    • With the release of APIv2, a new CLI tool called tapi has been introduced to allow easier access.

      In addition, the vast majority of Tungsten cli tools have been updated to optionally use the APIv2 interface when desired.

      The Nagios and Zabbix checks are also available via APIv2 using the tapi tool.

      Issues: CT-1454

    • The tungsten_purge_thl command is now a wrapper for the tpm purge-thl command.

      Issues: CT-1488

    • The tmonitor command now has better help text and more options to ease usage, including --filter to allow easy viewing of the tmonitor test output.

      Issues: CT-1585

    • A new option to print the merged logs to STDOUT has been added to tungsten_merge_logs (--stdout|-O).

      The tpm command suite now properly supports the --profile argument to specify a Tungsten json configuration file in place of the installed tungsten.cfg.

      Issues: CT-1680

    • The tapi command now supports the --affinity argument which will display all Connector-specific affinity settings, along with --connectorstatus to show all.

      Issues: CT-1700

    • The cctrl.log file is now accessible from the $CONTINUENT_ROOT/service_logs directory

      Issues: CT-1727

    • A new command (error) has been added to trepctl to output a full stack trace of the last error, if any.

      shell> trepctl -service <serviceName> error
      
      Event application failed: seqno=10 fragno=0 message=Table hr.regions not found in database. Unable to generate a valid statement.
      com.continuent.tungsten.replicator.applier.ApplierException: Table hr.regions not found in database. Unable to generate a valid statement.
      at com.continuent.tungsten.replicator.applier.JdbcApplier.getTableMetadata(JdbcApplier.java:582)
      at com.continuent.tungsten.replicator.applier.JdbcApplier.fillColumnNames(JdbcApplier.java:494)
      at com.continuent.tungsten.replicator.applier.JdbcApplier.getColumnInformation(JdbcApplier.java:1236)
      at com.continuent.tungsten.replicator.applier.MySQLApplier.applyOneRowChangePrepared(MySQLApplier.java:418)
      at com.continuent.tungsten.replicator.applier.JdbcApplier.applyRowChangeData(JdbcApplier.java:1460)
      at com.continuent.tungsten.replicator.applier.JdbcApplier.apply(JdbcApplier.java:1576)
      at com.continuent.tungsten.replicator.applier.ApplierWrapper.apply(ApplierWrapper.java:100)
      at com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.apply(SingleThreadStageTask.java:871)
      at com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.runTask(SingleThreadStageTask.java:601)
      at com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.run(SingleThreadStageTask.java:185)
      at java.base/java.lang.Thread.run(Thread.java:834)

      Issues: CT-1747

  • Backup and Restore

    • Added support for mariabackup. Configurable tpm option:

      backup-method=mariabackup
      or
      backup-method=mariabackup-incremental

      Issues: CT-1100

    • Fixes an issue where tprovision could remove the SSL certs for MySQL on reprovision

      Issues: CT-1323

  • Core Replicator

    • It is now possible to compress and/or encrypt THL on disk. For more information on using these features see https://docs.continuent.com/tungsten-clustering-7.0/thl-compress-encrypt.html

      Issues: CT-630

    • The replicator will now be able to handle new SQL_MODES available in later releases of MySQL and MariaDB, these are as follows:

      • MySQL: TIME_TRUNCATE_FRACTIONAL

      • MariaDB: TIME_ROUND_FRACTIONAL, SIMULTANEOUS_ASSIGNMENT

      Issues: CT-1362

    • In-Flight THL Compression is now available.

      For full details on enabling this feature, refer to this page

      Issues: CT-1420

  • Tungsten Connector

    • Audit Logging is now available in the Connector: it allows for logging data transferred between client application and MySQL servers to a file, database or socket.

      For more details and steps to enable, see https://docs.continuent.com/tungsten-clustering-7.0/connector-advanced-audit-logging.html

      Issues: CT-78

    • Added support for TLSv1.3.

      Note that TLS 1.3 was introduced in JDK 8u272, so tpm will only set it by default for Java 8u272 and later versions. It is still possible to force the protocol to be used via the tpm flag --tls-enabled-protocols=TLSv1.xxx

      Issues: CT-1367
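      The version gate described above could look roughly like the following. It is a sketch, not the tpm logic: the function name is hypothetical, and it accepts only the version string shapes shown in the comments. Treating Java 9 and 10 as unsupported reflects the fact that TLS 1.3 first shipped in JDK 11; whether tpm makes the same distinction is an assumption.

```python
import re


def supports_tls13(java_version: str) -> bool:
    """Return True when java_version is Java 8u272 or later with TLS 1.3."""
    short = re.fullmatch(r"8u(\d+)", java_version)
    if short:
        # Short form such as "8u272".
        return int(short.group(1)) >= 272
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?(?:_(\d+))?", java_version)
    if not match:
        raise ValueError(f"unrecognised Java version: {java_version!r}")
    major = int(match.group(1))
    if major == 1:
        # Legacy scheme: "1.8.0_272" means Java 8 update 272.
        feature = int(match.group(2) or 0)
        update = int(match.group(4) or 0)
        return feature == 8 and update >= 272
    # Modern scheme ("11.0.2", "17"): TLS 1.3 has shipped since JDK 11.
    return major >= 11
```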

    • A new Dynamic Active/Active feature has been added within the Proxy.

      This new Proxy mode is for Composite Active/Active clusters only, and allows you to specifically configure writes to be directed to a single cluster by the use of the tpm connector-write-affinity flag.

      For full details on this feature, and configuring it, see [link to be added]

      Issues: CT-1540

    • connector drain [opt-timeout] has been introduced as an alias to the existing connector graceful-stop command. It prevents new connections, then shuts down the connector after an optional delay.

      The opt-timeout parameter (seconds) can be specified to limit the wait before stopping the connector. Not passing this parameter implies infinite wait.

      Issues: CT-1644
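      The drain behavior can be sketched as a simple wait loop. This is an illustrative model only: it assumes new connections have already been refused, and the active_connections callback and poll_interval parameter are hypothetical. Passing timeout=None corresponds to omitting opt-timeout.

```python
import time
from typing import Callable, Optional


def drain(active_connections: Callable[[], int], timeout: Optional[float] = None,
          poll_interval: float = 0.01) -> bool:
    """Wait for connections to finish; return True if drained before the timeout.

    timeout=None waits indefinitely, matching the omitted opt-timeout case.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while active_connections() > 0:
        if deadline is not None and time.monotonic() >= deadline:
            return False  # timeout reached with clients still connected
        time.sleep(poll_interval)
    return True
```

      In either outcome the caller would then stop the connector; the return value only records whether clients were still connected at that point.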

    • connector-reset-when-affinity-back is now available with proxy mode.

      Issues: CT-1763

  • Security

    • Tungsten can now install on default CentOS and RedHat 8 with tightened security settings thanks to support of TLSv1.3

      Issues: CT-1359

    • Tungsten software now undergoes a rigorous security scan during QA. We also check included open-source/3rd-party software.

      Issues: CT-1579

  • Monitoring

    • A number of new metrics have been added to the Prometheus exporters.

      Issues: CT-1266, CT-1615

  • Platform Specific Deployments

    • ARM 64 bit processor support added (linux aarch64)

      Note

      At the time of release, there is no xtrabackup binary available for ARM.

      Issues: CT-1619, CT-1620

  • Other Issues

    • General improvements in MySQL 8.0 support.

      Issues: CT-1346

    • IPv6 host addresses are now fully supported.

      Can be enabled with the following configuration property:

      prefer-ip-stack="6"

      By default, IPv4 is enabled, which equates to the value of "4" in the above property.

      Issues: CT-1537

Bug Fixes

  • Installation and Deployment

    • Default systemd configuration files for Tungsten components no longer specify the tungsten group to execute the command. This prevents file access issues when the tungsten user belongs to several groups.

      Issues: CT-1550

    • When services were deployed with systemd and MySQL could not start due to an error, tpm would not be able to start MySQL later. This has been fixed.

      Issues: CT-1734

  • Command-line Tools

    • In certain cases, tprovision would not be able to find the binary log position of the backup when taken from a primary. This has been fixed.

      Issues: CT-1085

    • Fixes a bug in tprovision when using xtrabackup version 8, due to changes in xtrabackup binaries.

      Issues: CT-1248

    • The tpm connector command now handles special characters in the password string.

      Issues: CT-1258

    • The tpm update command will now exit with an error if any files not owned by the configured Tungsten OS user are found in the Tungsten installation directory.

      For example, if the OS user is tungsten and the installation directory is /opt/continuent, then a file /opt/continuent/thl/archived_thl.zip owned by root would cause an error like the following, and tpm update would exit:

      Foreign-owned files found!
      
      Located files in the Tungsten installed directory /opt/continuent
      that are not owned by the Tungsten OS user (tungsten):
      
      /opt/continuent/thl/archived_thl.zip
      
      Please change the ownership of these files to OS user "tungsten"
      using the chown command as root via sudo, then rerun the `tpm update` command.
      
      For example:
      shell> sudo chown -R tungsten:tungsten /opt/continuent

      Issues: CT-1260
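      A check of this kind can be sketched as a directory walk that compares each file's owner with the expected user ID. This is not the tpm code; the function name and the choice to inspect files only (not directories) are assumptions made for illustration.

```python
import os
from typing import List


def find_foreign_files(install_dir: str, owner_uid: int) -> List[str]:
    """Return paths of files under install_dir whose owner differs from owner_uid."""
    foreign = []
    for root, _dirs, files in os.walk(install_dir):
        for name in files:
            path = os.path.join(root, name)
            # lstat inspects a symlink itself rather than its target.
            if os.lstat(path).st_uid != owner_uid:
                foreign.append(path)
    return sorted(foreign)
```

      An empty result would let the update proceed, while any returned path corresponds to an entry in the error listing shown above.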

    • The tpm diag command now behaves correctly on Connector-only nodes, where previously it would try to gather Manager and Replicator-specific items.

      Issues: CT-1284

    • Fixes a security issue within the tpm diag command.

      Issues: CT-1295

    • The number of created and active connections could be incorrect when listed from a composite data service

      Issues: CT-1312

    • tungsten_send_diag no longer prints an error about Use of uninitialized value $diagArgs in concatenation.

      Issues: CT-1354

    • The tpm command no longer prints an error when run with no other command-line arguments.

      Issues: CT-1373

    • The tpm command no longer aborts with a Use of uninitialized value error when a stray tungsten.cfg file exists under $CONTINUENT_ROOT

      Issues: CT-1394

    • Fixes a monitoring bug with users using caching_sha2_password.

      Issues: CT-1406

    • tprovision (formerly tungsten_provision_slave) could fail to provision if the MySQL data directory was not accessible to the tungsten user.

      Issues: CT-1475

    • The tpm generate-haproxy-for-api command no longer fails on CentOS 8.

      Issues: CT-1484

    • The tmonitor command no longer fails on Debian 9 and Ubuntu.

      Issues: CT-1485

    • All tpm sub-commands now handle command-line arguments more intelligently.

      Issues: CT-1487

    • The tpm purge-thl command now handles command-line arguments more intelligently.

      Issues: CT-1489

    • The tpm diag command now properly collects the system information file on Debian systems.

      Issues: CT-1492

    • Database monitoring logs are now reporting the correct error number and SQL state when database errors occur.

      Issues: CT-1497

    • The tpm update command now handles updates/upgrades more gracefully when the previous version did not have the latest tpm framework.

      Issues: CT-1506

    • The tpm update command will now properly remove composite services.

      Issues: CT-1519

    • The tungsten_find_orphaned command no longer fails with a 'Can't exec "/bin/sh": Argument list too long' error when there are too many THL files to parse.

      Issues: CT-1545

    • The tpm ask command no longer calls Data::Dumper when it is not available.

      Issues: CT-1626

    • tpm now parses the MySQL SSL-related settings correctly.

      Issues: CT-1662

    • Fixes an issue where the deployall command would create a root owned wrapper.log in the ./tools directory.

      Issues: CT-1664

    • When MySQL services were badly installed, some distributions could show a “not-found” status within systemctl, confusing tpm.

      Issues: CT-1677

    • The tpm command now communicates properly when there is no INI configuration file or staging-method deploy.cfg configuration defined.

      Issues: CT-1712

    • The tpm diag command now handles Multi-Site/Active-Active topologies better.

      Issues: CT-1718

    • tungsten_monitor.rb script no longer uses sudo to send emails if the configuration doesn't allow it.

      Issues: CT-1737

    • The tpm diag command now handles zero-length mysqld.log files gracefully.

      Issues: CT-1740

  • Backup and Restore

    • When running cluster_backup in an Active/Active environment with require_master_backup set to false, the script would still attempt to back up the Primary, because it scanned the wrong sub-service and incorrectly identified the Relay node as a candidate.

      Issues: CT-1280

    • Fixed an issue where an xtrabackup generated by the replicator would fail to be restored using trepctl restore command.

      Issues: CT-1575

  • Core Replicator

    • The replicator metadata cache will now correctly handle table names when lower_case_table_names=1 is set in the MySQL configuration.

      Issues: CT-651

    • When using parallel apply, the replicator would error with a Foreign Key constraint error if statements were issued against two or more objects that shared the same name, but with different case sensitivity, for example:

      mysql> create table testtable (id int);
      mysql> drop table testtable;
      mysql> create table TestTable (id int);

      Issues: CT-1259

    • A change in the way MySQL logs CREATE TABLE AS SELECT in the binary logs from v8.0.20 onwards meant these transactions would previously fail.

      Warning

      Whilst these statements will now replicate, it must be noted that in the event of a failure during the data load, the initial CREATE statement won't be rolled back, and therefore care must be taken when using this type of DDL.

      Note

      This only affects customers using MySQL v8.0.20+ running with ROW based replication. An alternative workaround to ensure correct rollback on failure would be to run the statement with STATEMENT based replication for the session. This will also provide better performance for larger tables.

      Issues: CT-1301

    • Fixes occurrences of NullPointerException that would occur when bringing the replicator online before MySQL was started.

      Issues: CT-1348

    • For row based events, SQL modes were not displayed in the THL output. This is now fixed.

      Issues: CT-1440

    • When connecting to a THL server, a client will now connect to the next available host in its THL uri, if the first does not have the sequence number that the client requires. The client will then fail only if none of the hosts from the uri can provide the needed sequence number.

      Issues: CT-1558
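      The host selection described above can be sketched as follows. This models the decision only, not the THL protocol: the function name is hypothetical, and representing each host's available sequence numbers as a Python range is an assumption made for illustration.

```python
from typing import Dict, List


def choose_thl_host(hosts: List[str], available: Dict[str, range],
                    needed_seqno: int) -> str:
    """Return the first host, in URI order, that holds needed_seqno."""
    for host in hosts:
        seqnos = available.get(host)
        if seqnos is not None and needed_seqno in seqnos:
            return host
    # Fail only after every host in the URI has been tried.
    raise LookupError(f"no THL host can provide seqno {needed_seqno}")
```

      Hosts are tried strictly in the order they appear in the URI, so the first host is still preferred whenever it can serve the request.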

    • Fixed an issue when using parallel apply that would show a NullPointerException if an event could not be found or was corrupted in THL. The replicator now displays the correct message: Missing or corrupted event from storage

      Issues: CT-1722

    • Fixed an issue where trepctl was leaving JMX connections opened.

      Issues: CT-1752

    • Added more debug information for detecting possible hanging connections while a THL client connects to the THL server. Also, added socket timeout for the connection initialization

      Issues: CT-1760

  • Filters

    • Includes previously missing template file to enable easy configuration of the dbrename filter.

      Issues: CT-1350

    • The BidiRemoteSlaveFilter could fail to correctly flag fragmented events in unprivileged environments (Aurora, for example). For such environments (multi-active, unprivileged database access), a new setting was introduced to force the extraction process to read ahead to the last fragment to detect the service name (false by default). It is enabled with repl_svc_extractor_multi_frag_service_detection=true

      Issues: CT-1351

  • Tungsten Connector

    • By default, the connector will no longer transparently reconnect underlying connections to database servers when the data service changes.

      This will prevent the following case: in Composite Active/Active topologies, a given connection starts to write data to a site. The site fails, and the connection gets reconnected to the other site and resumes writing. However, the data written to the first site has not reached the second site, so the data will not be consistent.

      Default is to reject reconnections that follow a write operation (RW_STRICT connection or SmartScale after a write) and to allow reconnection after a read operation (RO_RELAXED or SmartScale after a read) which translates to --connector-allow-cross-site-reconnects-for-writes=false and --connector-allow-cross-site-reconnects-for-reads=true

      It is still possible to get the previous behavior (reconnecting transparently connections cross-site) by specifying both --connector-allow-cross-site-reconnects-for-writes=true and --connector-allow-cross-site-reconnects-for-reads=true, at your own risk

      Issues: CT-1265
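      The default policy above reduces to a small decision function. The sketch below is illustrative only: the function and parameter names are hypothetical stand-ins for the two tpm options, and the last operation is simplified to either "read" or "write".

```python
def allow_cross_site_reconnect(last_operation: str,
                               allow_for_writes: bool = False,
                               allow_for_reads: bool = True) -> bool:
    """Decide whether a connection may be transparently reconnected cross-site."""
    if last_operation == "write":
        return allow_for_writes
    if last_operation == "read":
        return allow_for_reads
    raise ValueError("last_operation must be 'read' or 'write'")
```

      With the defaults, a connection that last performed a write is rejected rather than silently moved to a site that may not yet hold its data.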

    • With @direct r/w splitting in active/active configurations, the connector was not correctly redirecting connections to an active site after re-joining the cluster following a failure.

      Issues: CT-1400

    • The Connector now properly retries specific MySQL commands when possible: INIT_DB, CHANGE_USER, STATISTICS and prepared statement functions.

      Issues: CT-1480

    • Connector now forbids non-ssl connections when mysql server has require_secure_transport=ON

      Issues: CT-1666

    • Connector now mirrors the MySQL default connect_timeout by retrieving it from the primary when starting up. This timeout will apply to all connections made from the connector to MySQL servers.

      This setting can be over-ridden by using the following tpm property

      property=connectTimeout=VALUE

      If VALUE is set to autodetect, this value will mirror the MySQL connect_timeout system variable. Set to 0 for infinite timeout.

      Issues: CT-1726
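      How the three kinds of VALUE resolve can be sketched as below. This is a hedged model rather than the Connector code: the function name is hypothetical, None is used to stand for an infinite wait, and rejecting negative numbers is an assumption not stated in the note.

```python
from typing import Optional


def effective_connect_timeout(value: str, mysql_connect_timeout: int) -> Optional[int]:
    """Resolve the connectTimeout property; None stands for an infinite wait."""
    if value == "autodetect":
        # Mirror the connect_timeout retrieved from the primary.
        return mysql_connect_timeout
    number = int(value)
    if number < 0:
        raise ValueError("connectTimeout must not be negative")
    return None if number == 0 else number
```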

    • Connector no longer requests cross-site services information from manager. In addition to removing extra network traffic, this cures a problem of connections counted twice in data source connection statistics

      Issues: CT-1775

  • Tungsten Manager

    • Fixes a NullPointerException error (NPE) in the manager logs.

      Issues: CT-1132

    • Fixes an edge case bug that would allow a Composite Active/Active cluster to contain 2 Primary nodes.

      Issues: CT-1474

    • Fixes a manager-internal connection check during recovery that was not properly using SSL when required.

      Issues: CT-1661

    • Early initialisation of the REST API sometimes caused the manager to hang and fail to start up correctly on new installations.

      Issues: CT-1725

    • Fixes an issue where the manager would wait longer than the timeout for replicator purge on a failover.

      Issues: CT-1733

    • Fixes an issue where datasource <datasourcename> welcome would fail to welcome a manually failed composite datasource.

      Issues: CT-1771

  • Security

    • log4j libraries updated to v2.17.1 specifically to mitigate risk of exposure to the 0-day vulnerability detected in log4j v2.14

      Issues: CT-1703