A.5. Tungsten Clustering 7.1.0 GA (16 Aug 2023)

Version End of Life. Not Yet Set

Release 7.1.0 is the next major v7 release containing a number of important bug fixes and key new features.

Warning

Because the JGroups libraries have been updated in this release, managers running releases older than 7.1.0 cannot communicate with managers running 7.1.0 or later. When upgrading to this release, all nodes must therefore be upgraded before proper cluster communication is restored. Ensure the cluster is in MAINTENANCE before beginning the upgrade, and do NOT SHUN nodes while a mix of manager versions is running.
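
The compatibility rule in the warning above can be turned into a simple pre-flight check. The following is a minimal sketch, not a Tungsten tool: it assumes the installed release string of each host has already been collected by some means of your own, and the input mapping, host names and function names are hypothetical. It only reports whether managers on both sides of the 7.1.0 boundary are present, which is the window during which nodes should not be shunned.

```python
# Illustrative only: the host names and the input mapping are hypothetical.
JGROUPS_BOUNDARY = (7, 1, 0)  # first release with the updated JGroups libraries


def parse_version(text):
    """Turn '7.1.0' or '7.1.0-10' into a comparable tuple such as (7, 1, 0).

    A build suffix after a hyphen is an assumed format and is ignored.
    """
    core = text.strip().split("-")[0]
    return tuple(int(part) for part in core.split("."))


def has_mixed_managers(host_versions):
    """Return True if some hosts are below 7.1.0 while others are at or above it."""
    sides = {parse_version(v) >= JGROUPS_BOUNDARY for v in host_versions.values()}
    return len(sides) == 2


def hosts_still_to_upgrade(host_versions):
    """List the hosts that are still running a release older than 7.1.0."""
    return sorted(host for host, v in host_versions.items()
                  if parse_version(v) < JGROUPS_BOUNDARY)
```

While has_mixed_managers returns True, the cluster should stay in MAINTENANCE and the hosts returned by hosts_still_to_upgrade are the ones left to upgrade.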

Behavior Changes

The following changes have been made to Tungsten Cluster and may affect existing scripts and integration tools. Any scripts or environments that make use of these tools should be checked and updated for the new configuration:

  • Command-line Tools

    • The connector graceful-stop command is no longer used together with systemd for upgrades. The underlying change of binary confuses systemd scripts.

      Issues: CT-2113

    • cctrl now accepts service names containing capital letters, dots and hyphens.

      Issues: CT-2163

    • The tpm copy-keys command has been renamed to tpm copy, and a new command, tpm cert copy, has been created with the same functionality.

      Issues: CT-2186

  • Backup and Restore

    • When performing a provision via rsync, tprovision will now sleep for 2 seconds after locking tables to make sure all transactions have finished writing to disk.

      Issues: CT-2169

Improvements, new features and functionality

  • Behavior Changes

    • Additional logging will now be added to the replicator logs during switchover/failover operations to enable better debugging in the event of issues.

      Issues: CT-1448

  • Installation and Deployment

    • Running tpm uninstall will now save all of the Tungsten database tracking schemas for later use. There is also a new tpm keep command, which allows the tracking schemas to be saved to disk at any time in multiple formats (.json, .dmp and .cmd).

      Issues: CT-2131

    • Added tpm flag deploy-systemd as a more meaningful alias to install

      Issues: CT-2152

  • Command-line Tools

    • Added a new option --preserve-schema to the tpm uninstall command in order to leave the tracking schema in the database.

      Issues: CT-561

    • A new command tpm cert has been added to aid in the creation, rotation and management of certificates for all areas of Tungsten.

      Known limitations: Percona 5.6 and all versions of MariaDB do not provide the mysql_ssl_rsa_setup command required by tpm cert gen mysqlcerts.

      Issues: CT-2085

    • The tpm report command now displays the security-specific information for each channel shown, including file paths and tpm options. Security information and tpm options for each channel will also be shown when --extra is used with --list.

      Also added the tpm ask certs command with expiry and sha256 info per alias, along with the tpm ask certtpm and tpm ask certlocations commands with reference information about the security.properties file.

      Added help text for tpm ask --long which shows the key/variable name along with the value. The tmonitor -t test command now shows only the actual Tungsten-specific metrics lines.

      Added tmonitor -T test to show metrics help and headers along with actual tungsten_* metrics lines.

      Issues: CT-2088

    • tungsten show processlist could error with a disconnection message listing connections disconnected on the mysql server side.

      Issues: CT-2112

    • The tpm diag command now captures the output of cctrl> cluster topology validate

      Issues: CT-2115

    • The tpm diag command now supports the --skipsudo and --nosudo arguments to prevent operations from using the sudo command. Using this option may result in tpm diag skipping/failing various gathers due to a lack of access.

      Issues: CT-2146

    • The tungsten_send_diag command has a new argument, --all, which tells the tpm diag command to gather from all hosts with -a. This replaces the previous method for gathering all hosts with tungsten_send_diag, --args '--all'.

      Issues: CT-2150

  • Backup and Restore

    • MySQL clone can now be used as an option for recovery using tprovision.

      Issues: CT-1417

    • tungsten_get_mysql_datadir can now return additional mysql database directories.

      Issues: CT-1985

    • tprovision will now add a heartbeat after restore. This will ensure the replicator can be put back online when there is no load on the cluster.

      Issues: CT-2005

    • The tprovision script will sleep for 5 seconds by default when using the rsync method after issuing a flush logs. The sleep value is configurable as a command line option --flush-after-sleep.

      Issues: CT-2101

  • Filters

    • A new shardbyrules filter has been added that allows rule-based sharding of replication, using user-configurable rules, at table level. Previously, sharding could only be handled at schema level.

      For more information, see shardbyrules Filter Documentation

      Issues: CT-2164
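
To illustrate the difference between schema-level and table-level sharding described for the shardbyrules filter, here is a conceptual sketch. It does not use the filter's actual rule syntax, which is covered in the shardbyrules documentation; the rule list, shard names and the shard_for function are all hypothetical. Rules are matched in order against schema.table, and anything unmatched falls back to the schema name, mimicking schema-level behaviour.

```python
from fnmatch import fnmatchcase

# Hypothetical rules: (pattern on "schema.table", shard name), first match wins.
EXAMPLE_RULES = [
    ("sales.orders_*", "shard_orders"),
    ("sales.*", "shard_sales"),
]


def shard_for(schema, table, rules=EXAMPLE_RULES):
    """Pick a shard for one table.

    Table-level rules are tried first, in order. If none match, the schema
    name itself is used as the shard, which is the schema-level fallback.
    """
    qualified = f"{schema}.{table}"
    for pattern, shard in rules:
        if fnmatchcase(qualified, pattern):
            return shard
    return schema
```

With these example rules, two tables in the same schema (sales.orders_2023 and sales.customers) land in different shards, which schema-level sharding alone could not express.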

  • Tungsten Connector

    • Tungsten Connector now supports dual passwords, mirroring the MySQL v8.0.14+ functionality. When changing a user password, the previous password can be retained as long as needed in order to allow changing account passwords with no downtime.

      Issues: CT-2127

    • Added a flag to avoid generating and sending EOF packets to client applications when CLIENT_DEPRECATE_EOF is not set on both client and mysql server sides. This fixes an issue with Go-MySQL-Driver and prepared statements.

      When using Go MySQL Driver, flag --connector-generate-eof=false should be specified. Default is set to true for backwards compatibility.

      Issues: CT-2177
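
The dual-password behaviour described above for the Connector can be pictured with a toy model. This is a sketch of the general idea only, assuming a simple in-memory account object; it is not the Connector's implementation, and the class and method names are hypothetical. Passwords are held in plain text purely to keep the example short.

```python
import hmac


class DualPasswordAccount:
    """Toy model of an account with a primary and an optional retained password."""

    def __init__(self, password):
        self.primary = password
        self.secondary = None  # previous password, if it was retained

    def change_password(self, new_password, retain_current=False):
        # When retaining, the outgoing password stays valid as the secondary.
        if retain_current:
            self.secondary = self.primary
        self.primary = new_password

    def discard_old_password(self):
        # Called once every client has switched to the new password.
        self.secondary = None

    def authenticate(self, candidate):
        # Accept either password; compare_digest avoids timing differences.
        for known in (self.primary, self.secondary):
            if known is not None and hmac.compare_digest(
                    candidate.encode(), known.encode()):
                return True
        return False
```

The rollout pattern is: change the password while retaining the current one, update the applications at their own pace, then discard the old password.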

  • Core Clustering

    • Upgraded JGroups library to 4.2.22

      Issues: CT-2011

    • There is a new datasource-group-id TPM option. In a single cluster the nodes with the same datasource-group-id will form a Distributed Datasource Group (DDG).

      Issues: CT-2051

  • Monitoring

    • Prometheus exporters now provide the ssl cert expiration date as an epoch value in addition to the label.

      Issues: CT-2099

    • Added Prometheus exporter metrics for composite parent and sub-services.

      Issues: CT-2121

    • Prometheus libraries have been upgraded from version 0.8.1 to 0.16.0

      Issues: CT-2166

    • Configuring the Java Virtual Machine (JVM) settings for the manager has been made easier through the use of the manager_java_settings.conf file. For more information, see Section 8.7, “Adjusting JVM Settings for the Manager”.

      Issues: CT-2167
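
Exposing the SSL certificate expiration date as an epoch value, as noted under Monitoring, makes it straightforward to do arithmetic on it. The following is a minimal sketch, assuming the epoch value has already been read from the exporter output and is expressed in seconds; the unit, the function names and the threshold are assumptions, and no metric names are implied.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400


def days_until_expiry(expiry_epoch, now_epoch):
    """Days remaining before the certificate expires (negative if already expired)."""
    return (expiry_epoch - now_epoch) / SECONDS_PER_DAY


def expiry_as_utc(expiry_epoch):
    """Render the epoch value as an ISO 8601 timestamp in UTC."""
    return datetime.fromtimestamp(expiry_epoch, tz=timezone.utc).isoformat()


def expires_within(expiry_epoch, now_epoch, days):
    """True if the certificate expires in fewer than the given number of days."""
    return days_until_expiry(expiry_epoch, now_epoch) < days
```

In practice now_epoch would come from time.time(), and a check such as expires_within(expiry, now, 30) could feed an alert.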

Bug Fixes

  • Installation and Deployment

    • Fixed RPM package script to run tpm install instead of tpm update when installing the rpm

      Issues: CT-2130

  • Command-line Tools

    • The tpm command now handles situations when the Manager process is not running.

      Issues: CT-2103

    • tpm uninstall would sometimes print "ERROR >> db1 >> undefined method '+' for nil:NilClass"

      Issues: CT-2104

    • The tungsten_find_orphaned command now handles some edge cases more gracefully.

      Issues: CT-2107

    • The cctrl cluster topology validate command now checks, in a CAA setup, if the relay in the subservice is on the same host as the primary, and reports if there is a mismatch.

      Issues: CT-2114

    • The tpm command now searches more places to locate shell commands that are called, especially useful when $CONTINUENT_ROOT/share/env.sh is not sourced.

      Issues: CT-2182

  • Backup and Restore

    • tprovision would produce errors if the local hostname were different from the hostname used in the Tungsten install (short vs long names).

      Issues: CT-1363

    • tprovision will now print an error message and exit if the MySQL datadir does not exist.

      Issues: CT-1901

    • tprovision would accept bogus options and not produce an error. This has now been fixed.

      Issues: CT-2045

    • tprovision will now time out if ssh is blocked from the target to the source host.

      Issues: CT-2139

    • Using mysqldump for tprovision could incorrectly create a new SSL key pair.

      Issues: CT-2142

  • Core Replicator

    • Improved a query that is run by Tungsten when fetching table metadata (column names, datatypes, etc.). While it is not generally needed, the unoptimized query can perform badly, especially against old MySQL versions with a large number of databases and tables. For now, the new optimized query is not used by default, but this could change in a future version.

      The optimized query can be enabled by using the following property:

      property=replicator.datasource.global.connectionSpec.usingOptimizedMetadataQuery=true

      Issues: CT-2077

    • Fixed an issue while processing geometry data with SRID 4326 that would swap longitude and latitude. This applies only to MySQL 8, as prior MySQL versions do not allow specifying the order when applying a WKB (Well-Known Binary) value to MySQL.

      Issues: CT-2172

  • Tungsten Connector

    • Fixed NullPointerExceptions when reading packets from disconnected client connections.

      Issues: CT-2132

    • Fixed a log4j configuration issue where rotated connector-audit.log and connector-api.log files would end up in the cluster-home/bin/ directory instead of tungsten-connector/log/.

      Issues: CT-2137

    • Fixed an issue with topology validation when running tpm promote-connector on a witness host

      Issues: CT-2161

    • Fixed an issue where a failover could hang in rare cases when security and Proxy mode are enabled. SSL connections to a failed data source could enter a deadlock and block failover for up to 15 minutes.

      Note

      This issue only affects users running Java 11, and is related to https://bugs.openjdk.org/browse/JDK-8241239

      Issues: CT-2183, CT-2187

  • Core Clustering

    • In a Composite Active/Active topology, issuing SHUN/DRAIN or WELCOME on a node in cctrl would only affect the node in the main cluster service. This action will now also be applied to the same node within the x_from_y sub-service.

      Issues: CT-2145

  • Tungsten Manager

    • A set of changes to improve cctrl responsiveness

      • cctrl now properly lists responsive connectors even if some fail to return their status in a timely fashion (for example with slow networks)

      • Under the above condition, cctrl will return in 5-6 seconds rather than in up to 30 seconds when there is network congestion/partition.

      • Under normal conditions, cctrl responds significantly more quickly - up to 3x faster - due to optimized communications between cctrl and remote connectors.

      • cctrl commands are no longer slowed down/blocked by internal ping traffic to connectors.

      • Fixed an issue where individual connectors could not be addressed in 'router' commands.

      Issues: CT-1795