1.1. Tungsten Clustering 7.0.2 GA (Not Yet Released)

Version End of Life. Not Yet Set

Release 7.0.2 will contain a number of bug fixes and improvements.

Behavior Changes

The following changes have been made to Tungsten Cluster and may affect existing scripts and integration tools. Any scripts or environments that make use of these tools should be checked and updated for the new configuration:

  • Command-line Tools

    • The various user-xxxx.log files are no longer generated.

      Issues: CT-1914

    • The check_tungsten.sh script was deprecated in release 6.1.18 and has now been removed.

      Issues: CT-1939

  • Core Replicator

    • repl_svc_extractor_multi_frag_service_detection is now turned ON by default. Event shards are determined at extraction time. For a fragmented event, the shard cannot be determined by reading only the first fragment; the last fragment must be checked as well. Turning this setting OFF causes no issues for pipelines that do not need it, that is, pipelines with no downstream replicas using parallel apply. However, because the shard is determined at extraction time and stored in the THL, later adding a replica that uses parallel apply, or switching an existing replica to parallel apply, could introduce issues.

      Note

      The property can be disabled if you see a performance overhead, but this should be done with caution. For Aurora<>Aurora Active/Active deployments it is essential that this property be left ON.

      Issues: CT-1959

  • Tungsten Connector

    • Improved tungsten show processlist by running underlying commands in parallel.

      Issues: CT-1569

  • Tungsten Manager

    • A failsafe-shunned cluster (caused by a network split) will now be automatically recovered after the network connection is re-established.

      Issues: CT-241

Improvements, new features and functionality

  • Command-line Tools

    • A new -c option is now available with some trepctl commands that can be used in conjunction with the -r option to indicate the number of times to refresh before automatically terminating. For example, the following command:

      shell> trepctl perf -r 3 -c 10

      will refresh the output every 3 seconds, 10 times, and then terminate.

      Issues: CT-679

    • The tungsten_merge_logs command now supports the --before TIMESTAMP and --after TIMESTAMP filters.

      Issues: CT-1869

    • The tpm ask summary command now provides the coordinator host and the isCoordinator boolean if the Manager is running on that node.

      Also, tpm ask now supports direct calls to coordinator, {isCoordinator|iscoordinator} and {isBridgeMode|isBridge|bridge|isbridge|isbridgemode}.

      Issues: CT-1874

    • The tungsten_generate_haproxy_for_api and tpm generate-haproxy-for-api commands now support using connector hosts in the backend definitions via -c, and adding extra flags to the backend host lines via -f.

      Issues: CT-1909

    • The tungsten_generate_haproxy_for_api and tpm generate-haproxy-for-api commands no longer call the Perl Data::Dumper module.

      Issues: CT-1915

    • The tungsten_reset_manager command can now simply print the path or paths to be cleared, one per line, via the -l or --list arguments.

      Issues: CT-1917

    • The tmonitor command now accepts command-line arguments to specify the ports, and will auto-configure the ports if they have been changed via the Tungsten configuration.

      Issues: CT-1919

    • The tpm command calls to glob have been improved to be more strict and compliant.

      Issues: CT-1940

    • The tpm ask stages and tpm ask allstages commands have been added to display the Replicator stages for the current node (stages) and the stages for each role (allstages).

      Issues: CT-1943

    • The tpm ask command has five new variables available: dsrole and dsstate for the current datasource, trrole and trstate for the current replicator, and nodeinfo, which displays the other four new variables together.

      Issues: CT-1944

    • A new standalone status script has been added called tungsten_get_status that shows the datasources and replicators for all nodes in all services along with seqno and latency.

      Issues: CT-1962

  • Core Replicator

    • Added a new feature that enables pausing a replicator stage for some amount of time.

      The following command pauses the given stage for 100 seconds:

      trepctl pause -stage thl-to-q -time 100

      The following command pauses the stage indefinitely (or until a restart, etc.). Add -y to skip the confirmation prompt:

      trepctl pause -stage thl-to-q

      For both of the commands above, running a pause command again overrides the previous one.

      The following command resumes the paused stage (if the stage is not paused, it has no effect):

      trepctl resume -stage thl-to-q

      Note

      The pause does not survive a replicator restart or a service offline/online.
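
      A script that needs a stage held only while some other task runs might wrap the two commands as in the sketch below. It uses only the pause and resume invocations and the -y flag described above. The function name with_stage_paused and the example task are hypothetical, the placement of -y at the end of the command line is an assumption, and trepctl is assumed to be on the PATH.

      ```shell
      # Hypothetical wrapper (not a Tungsten command): pause a stage,
      # run a task, then resume the stage and return the task's status.
      with_stage_paused() {
        stage=$1
        shift
        # -y skips the confirmation prompt for an indefinite pause
        # (its position on the command line is an assumption).
        trepctl pause -stage "$stage" -y || return 1
        rc=0
        "$@" || rc=$?      # run the task, remembering its exit status
        trepctl resume -stage "$stage"
        return "$rc"
      }

      # Example usage (the task is a placeholder):
      #   with_stage_paused thl-to-q sleep 30
      ```

      Resuming explicitly after the task avoids leaving the stage paused indefinitely if the script is reused or the task fails.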

      Issues: CT-1912

    • Per-service tuning of the replicator THL directory is now possible for multi-service replicator-only installs as well as for clustering. The given value should be the base directory, to which Tungsten will append the service name. For example, the following entry in the tungsten.ini:

      [alpha]
      ...
      ...
      thl-directory=/drv1/thl
      ...

      would result in the THL being placed in /drv1/thl/alpha.

      Note

      Updating the THL directory is only possible when tpm is called from the staging installation directory, NOT from the running directory.
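
      As a rough illustration of the rule above, the following shell sketch derives the per-service THL path from the configured base directory. The function name thl_path_for_service is hypothetical and is not part of any Tungsten tool; it only models the "base directory plus service name" behavior described here.

      ```shell
      # Hypothetical helper (not a Tungsten command): models how the
      # per-service THL location is derived from thl-directory.
      thl_path_for_service() {
        base=${1%/}      # configured thl-directory, one trailing slash removed
        service=$2       # service name, e.g. the [alpha] section name
        printf '%s/%s\n' "$base" "$service"
      }

      thl_path_for_service /drv1/thl alpha
      ```

      With the tungsten.ini entry shown above, this prints /drv1/thl/alpha.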

      Issues: CT-1927

    • A new replicator role (thl-applier) has been added to allow a replicator service to apply its locally available THL, without pulling from a remote host.

      Issues: CT-1936

  • Tungsten Connector

    • The connector graceful-stop command now properly supports the systemd service manager. The connector stop command now takes an optional argument that makes it a graceful stop. If connector stop is run without the parameter, it stops the connector immediately. If a positive number of seconds is passed, it refuses new connections and waits, at most, that long for existing connections to disconnect, after which it force closes all connections and shuts down the connector. The behavior of connector graceful-stop is unchanged: without the parameter, the connector waits "forever" for connections to disconnect. A positive timeout in seconds can be passed to sever connections after the given delay.
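
      The combinations described above can be summarized in a small shell sketch. The function connector_stop_behavior is hypothetical and does not call the connector; it only restates the rules in this note. How the connector treats a zero, negative, or non-numeric argument is not stated here, so the sketch simply treats those cases as if no timeout were given, which is an assumption.

      ```shell
      # Hypothetical summary helper (not part of the connector): maps a
      # stop command and an optional timeout to the behavior in the text.
      connector_stop_behavior() {
        cmd=$1
        timeout=${2:-}
        case $timeout in
          ''|*[!0-9]*) positive=no ;;   # missing or not a whole number (assumption)
          *) if [ "$timeout" -gt 0 ]; then positive=yes; else positive=no; fi ;;
        esac
        if [ "$positive" = yes ]; then
          echo "wait up to ${timeout}s for connections to disconnect, then force close"
        elif [ "$cmd" = graceful-stop ]; then
          echo "wait indefinitely for connections to disconnect"
        else
          echo "stop immediately"
        fi
      }

      connector_stop_behavior stop
      connector_stop_behavior stop 30
      connector_stop_behavior graceful-stop
      ```

      The only difference between the two commands in this model is what happens when no timeout is supplied.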

      Issues: CT-1921

  • Tungsten Manager

    • Added a new tpm option, manager-replicator-offline-timeout=<timeout_in_sec>, that configures how long the manager waits for the replicator to go offline. When the parallel applier is in use, the default timeout was too low, so it is now user-configurable and can be adjusted to suit different topologies. If not supplied, the default is 180 seconds (3 minutes). This value should be sufficient in most use cases.

      Issues: CT-1892

Bug Fixes

  • Installation and Deployment

    • ddlscan, dsctl and tungsten_send_diag are now added to the aliases.sh script.

      Issues: CT-813

    • Fixes issues where fixed properties and filters passed to tpm in service stanzas were not being configured correctly.

      Issues: CT-1463

    • The JVM version is no longer printed using Tanuki wrapper functionality, which was creating defunct Java processes at startup; internal code is now used instead.

      Issues: CT-1876

    • The tpm install and tpm update commands now properly support the --thl-ports option for cross-site subservices.

      Issues: CT-1953

  • Command-line Tools

    • The tungsten_skip_seqno command no longer fails when -i is specified, and now properly filters using --filter when there is a long error message.

      Issues: CT-1877

    • The tpm command now allows any case for section entries (e.g. [alpha_FROM_beta]) in the INI files.

      Issues: CT-1879

    • The tpm diag command now passes when the nodename defined in the tungsten.ini is the shortname, and DNS returns the FQDN.

      Issues: CT-1908

    • The tpm diag command now gathers the mysql.log file when SSL is enabled in the server.

      Issues: CT-1920

    • Fixes an issue that prevented dsctl from connecting to MySQL if SSL was enabled.

      Issues: CT-1928

    • The tpm mysql command will now gracefully handle being run on a non-database node.

      Issues: CT-1946

  • Core Replicator

    • Fixes an issue that would prevent a service from going offline at a specified time (trepctl online -until-time) when parallel apply is enabled. This is a rework of CT-1243.

      Issues: CT-1684

    • Fixed a possible issue when recovering an old primary as a replica after a failover with parallel apply enabled. The issue could leave the replica unable to come online and require it to be reprovisioned.

      Issues: CT-1890

    • Fixes an issue that prevented geometry datatypes with SRID from being replicated.

      Issues: CT-1904

    • Fixed an issue where filtered events would trigger a needless update to the service trep_commit_seqno table. The update was redundant because the table is overwritten anyway once the last statement of the applied event is done, just prior to committing the whole block.

      Issues: CT-1931

    • Fixed an issue where the replicator would hang after applying a DROP TABLE event that originally failed on the primary but was still logged to the binlog.

      Issues: CT-1973

  • Tungsten Connector

    • The Connector now auto-detects the default authentication plugin by retrieving the default_authentication_plugin variable from the MySQL data source, rather than relying only on the MySQL server version.

      Issues: CT-1926

    • Fixed the connector logging configuration so that log entries show the hostname and the class that printed them.

      Issues: CT-1965

  • Tungsten Manager

    • The cctrl command datasource <ds> slave now sets the replicator role correctly. Previously, only the datasource role would change.

      Issues: CT-1882

  • API

    • Calls to /api/v2/manager/cluster/status now return properly when a peer cluster is fully offline or unreachable.

      Issues: CT-1945