Copyright © 2023 Continuent Ltd
Abstract
This manual documents Tungsten Clustering (for MySQL), a high-performance, high-availability and disaster-recovery clustering solution for MySQL.
This manual includes information for 7.1, up to and including 7.1.4.
Build date: 2024-11-28 (505168f6)
Up to date builds of this document: Tungsten Clustering (for MySQL) 7.1 Manual (Online), Tungsten Clustering (for MySQL) 7.1 Manual (PDF)
This manual documents Tungsten Cluster 7.1 up to and including 7.1.4 build 10. Differences between minor versions are highlighted with the explicit minor release version stated, such as 7.1.4.
For other versions and products, please use the appropriate manual.
The trademarks, logos, and service marks in this Document are the property of Continuent or other third parties. You are not permitted to use these Marks without the prior written consent of Continuent or such appropriate third party. Continuent, Tungsten, uni/cluster, m/cluster, p/cluster, uc/connector, and the Continuent logo are trademarks or registered trademarks of Continuent in the United States, France, Finland and other countries.
All Materials on this Document are (and shall continue to be) owned exclusively by Continuent or other respective third party owners and are protected under applicable copyrights, patents, trademarks, trade dress and/or other proprietary rights. Under no circumstances will you acquire any ownership rights or other interest in any Materials by or through your access or use of the Materials. All right, title and interest not expressly granted is reserved to Continuent.
All rights reserved.
This documentation uses a number of text and style conventions to indicate and differentiate between different types of information:
Text in this style is used to show an important element or piece of information. It may be used and combined with other text styles as appropriate to the context.
Text in this style is used to show a section heading, table heading, or particularly important emphasis of some kind.
Program or configuration options are formatted using this style. Options are also automatically linked to their respective documentation page when this is known. For example, tpm and --hosts both link automatically to the corresponding reference page.
Parameters or information explicitly used to set values to commands or options is formatted using this style. Option values, for example on the command-line, are marked up using this format: --help. Where possible, all option values are directly linked to the reference information for that option.
Commands, including sub-commands to a command-line tool are formatted using Text in this style. Commands are also automatically linked to their respective documentation page when this is known. For example, tpm links automatically to the corresponding reference page.
Text in this style indicates literal or character sequence text used to show a specific value. Filenames, directories or paths are shown like this: /etc/passwd. Filenames and paths are automatically linked to the corresponding reference page if available.
Bulleted lists are used to show lists, or detailed information for a list of items. Where this information is optional, a magnifying glass symbol enables you to expand, or collapse, the detailed instructions.
Code listings are used to show sample programs, code, configuration files and other elements. These can include both user input and replaceable values:
shell> cd /opt/continuent/software
shell> tar zxvf tungsten-clustering-7.1.4-10.tar.gz
In the above example, command-lines to be entered into a shell are prefixed using shell>. This shell is typically sh, ksh, or bash on Linux and Unix platforms.
If commands are to be executed using administrator privileges, each line will be prefixed with root-shell, for example:
root-shell> vi /etc/passwd
To make the selection of text easier for copy/pasting, ignorable text, such as shell>, is ignored during selection. This allows multi-line instructions to be copied without modification, for example:

mysql> create database test_selection;
mysql> drop database test_selection;

Lines prefixed with mysql> should be entered within the mysql command-line.
If a command-line or program listing entry contains lines that are too wide to be displayed within the documentation, they are marked using the » character:
the first line has been extended by using a » continuation line
They should be adjusted to be entered on a single line.
Text marked up with this style is information that is entered by the user (as opposed to generated by the system). Text formatted using this style should be replaced with the appropriate file, version number or other variable information according to the operation being performed.

In the HTML versions of the manual, blocks or examples that contain user input can be easily copied from the program listing. Where there are multiple entries or steps, use the 'Show copy-friendly text' link at the end of each section. This provides a copy of all the user-enterable text.
Are you planning on completing your first installation?
Do you know the Section 2.2, “Requirements”?
Have you followed the Appendix B, Prerequisites?
Have you decided which installation method you will use? INI or Staging?
Have you chosen your deployment type from Chapter 2, Deployment? Is this a Primary/Replica deployment?
Would you like to understand the different types of installation?
There are two installation methods available in tpm, INI and Staging. A comparison of the two methods is provided in Section 10.1, “Comparing Staging and INI tpm Methods”.
Do you want to upgrade to the latest version?
Are you trying to update or change the configuration of your system?
Has your system suffered a failure?
For recovery methods and instructions, see Section 6.6, “Datasource Recovery Steps”.
Would you like to perform database or operating system maintenance?
Do you need to backup or restore your system?
For backup instructions, see Section 6.10, “Creating a Backup”, and to restore a previously made backup, see Section 6.11, “Restoring a Backup”.
Tungsten Clustering™ provides a suite of tools to aid the deployment of database clusters using MySQL. A Tungsten Cluster™ consists of three primary tools:
Tungsten Replicator
Tungsten Replicator supports replication between different databases. Tungsten Replicator acts as a direct replacement for the native MySQL replication, in addition to supporting connectivity to Oracle, MongoDB, Vertica and others.
Tungsten Manager
The Tungsten Manager is responsible for monitoring and managing a Tungsten Cluster dataservice. The manager has a number of control and supervisory roles for the operation of the cluster, and acts both as a control and a central information source for the status and health of the dataservice as a whole.
Tungsten Connector (or Tungsten Proxy)
The Tungsten Connector is a service that sits between your application server and your MySQL database. The connector routes connections from your application servers to the datasources within the cluster, automatically distributing and redirecting queries to each datasource according to load balancing and availability requirements.
While there is no specific SLA because every customer's environment is different, we strive to deliver a very low RTO and an RPO of effectively zero. For example, a cluster failover normally takes around 30 seconds depending on load, so the RTO is typically under 1 minute. Additionally, because copies of the database are kept on Replica nodes, a failover happens with zero data loss under the vast majority of conditions.
Tungsten Cluster uses key terminology for different components in the system. These are used to distinguish specific elements of the overall system at the different levels of operations.
Table 1.1. Key Terminology
Continuent Term | Traditional Term | Description |
---|---|---|
composite dataservice | Multi-Site Cluster | A configured Tungsten Cluster service consisting of multiple dataservices, typically at different physical locations. |
dataservice | Cluster | The collection of machines that make up a single Tungsten Dataservice. Individual hosts within the dataservice are called datasources. Each dataservice is identified by a unique name, and multiple dataservices can be managed from one server. |
dataserver | Database | The database on a host. |
datasource | Host or Node | One member of a dataservice and the associated Tungsten components. |
staging host | - | The machine (and directory) from which Tungsten Cluster™ is installed and configured. The machine does not need to be the same as any of the existing hosts in the dataservice. |
active witness | - | A machine in the dataservice that runs the manager process but is not running a database server. This server will be used to establish quorum in the event that a datasource becomes unavailable. |
passive witness | - | A witness host is a host that can be contacted using the ping protocol to act as a network check for the other nodes of the cluster. Witness hosts should be on the same network and segment as the other nodes in the dataservice. |
coordinator | - | The datasource or active witness in a dataservice that is responsible for making decisions on the state of the dataservice. The coordinator is usually the member that has been running the longest. It will not always be the Primary. When the manager process on the coordinator is stopped, or no longer available, a new coordinator will be chosen from the remaining members. |
Tungsten Replicator is a high performance replication engine that works with a number of different source and target databases to provide high-performance and improved replication functionality over the native solution. With MySQL replication, for example, the enhanced functionality and information provided by Tungsten Replicator allows for global transaction IDs, advanced topology support such as Composite Active/Active, star, and fan-in, and enhanced latency identification.
In addition to providing enhanced functionality Tungsten Replicator is also capable of heterogeneous replication by enabling the replicated information to be transformed after it has been read from the data server to match the functionality or structure in the target server. This functionality allows for replication between MySQL and a variety of heterogeneous targets.
Understanding how Tungsten Replicator works requires looking at the overall replicator structure. There are three major components in the system that provide the core of the replication functionality:
Extractor
The extractor component reads data from a MySQL data server and writes that information into the Transaction History Log (THL). The role of the extractor is to read the information from a suitable source of change information and write it into the THL in the native or defined format, either as SQL statements or row-based information.
Information is always extracted from a source database and recorded within the THL in the form of a complete transaction. The full transaction information is recorded and logged against a single, unique, transaction ID used internally within the replicator to identify the data.
Applier
Appliers within Tungsten Replicator convert the THL information and apply it to a destination data server. The role of the applier is to read the THL information and apply that to the data server.
The applier works with a number of different target databases, and is responsible for writing the information to the database. Because the transactional data in the THL is stored either as SQL statements or row-based information, the applier has the flexibility to reformat the information to match the target data server. Row-based data can be reconstructed to match different database formats, for example, converting row-based information into an Oracle-specific table row, or a MongoDB document.
Transaction History Log (THL)
The THL contains the information extracted from a data server. Information within the THL is divided up by transactions, either implied or explicit, based on the data extracted from the data server. The THL structure, format, and content provides a significant proportion of the functionality and operational flexibility within Tungsten Replicator.
As the THL data is stored, additional information is recorded, such as the metadata and options in place when the statement or row data was extracted. Each transaction is also recorded with an incremental global transaction ID. This ID enables individual transactions within the THL to be identified, for example to retrieve their content, or to determine whether different appliers within a replication topology have written a specific transaction to a data server.
These components will be examined in more detail as different aspects of the system are described with respect to the different systems, features, and functionality that each system provides.
From this basic overview and structure of Tungsten Replicator, the replicator allows for a number of different topologies and solutions that replicate information between different services. Straightforward replication topologies, such as Primary/Replica are easy to understand with the basic concepts described above. More complex topologies use the same core components. For example, Composite Active/Active topologies make use of the global transaction ID to prevent the same statement or row data being applied to a data server multiple times. Fan-in topologies allow the data from multiple data servers to be combined into one data server.
Tungsten Replicator operates by reading information from the source database and transferring that information to the Transaction History Log (THL).
Each transaction within the THL includes the SQL statement or the row-based data written to the database. The information also includes, where possible, transaction specific options and metadata, such as character set data, SQL modes and other information that may affect how the information is written when the data is applied. The combination of the metadata and the global transaction ID also enable more complex data replication scenarios to be supported, such as Composite Active/Active, without fear of duplicating statement or row data application because the source and global transaction ID can be compared.
In addition to all this information, the THL also includes a timestamp and a record of when the information was written into the database before the change was extracted. Using a combination of the global transaction ID and this timing information provides information on the latency and how up to date a dataserver is compared to the original datasource.
Depending on the underlying storage of the data, the information can be reformatted and applied to different data servers. When dealing with row-based data, this can be applied to a different type of data server, or completely reformatted and applied to non-table based services such as MongoDB.
THL information is stored for each replicator service, and can also be exchanged over the network between different replicator instances. This enables transaction data to be exchanged between different hosts within the same network or across wide-area-networks.
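For reference, the transactions stored in the THL can be inspected on a replicator host using the thl command. The following is a minimal sketch, assuming the Tungsten environment (env.sh) has been sourced so that thl is in the PATH; the sequence number shown is a placeholder:

# Show the most recent transaction stored in the THL on this host
shell> thl list -last
# Show a specific transaction by its global sequence number (placeholder value)
shell> thl list -seqno 10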
The Tungsten Manager is responsible for monitoring and managing a Tungsten Cluster dataservice. The manager has a number of control and supervisory roles for the operation of the cluster, and acts both as a control and a central information source for the status and health of the dataservice as a whole.
Primarily, the Tungsten Manager handles the following tasks:
Monitors the replication status of each datasource (node) within the cluster.
Communicates and updates Tungsten Connector with information about the status of each datasource. In the event of a change of status, Tungsten Connectors are notified so that queries can be redirected accordingly.
Manages all the individual components of the system. Using the Java JMX system, the manager is able to directly control the different components to change status and control the replication process.
Checks to determine the availability of datasources by using either the Echo TCP/IP protocol on port 7 (default), or using the system ping protocol to determine whether a host is available. The configuration of the protocol to be used can be made by adjusting the manager properties. For more information, see Section B.2.3.3, “Host Availability Checks”.
Includes an advanced rules engine. The rule engine is used to respond to different events within the cluster and perform the necessary operations to keep the dataservice in optimal working state. During any change in status, whether user-selected or automatically triggered due to a failure, the rules are used to make decisions about whether to restart services, swap Primaries, or reconfigure connectors.
Please see the Tungsten Manager documentation section Chapter 8, Tungsten Manager for more information.
The Tungsten Connector (or Tungsten Proxy) is a service that sits between your application server and your MySQL database. The connector routes connections from your application servers to the datasources within the cluster, automatically distributing and redirecting queries to each datasource according to load balancing and availability requirements.
The primary goal of Tungsten Connector is to effectively route and redirect queries between the Primary and Replica datasources within the cluster. Client applications talk to the connector, while the connector determines where the packets should really go, depending on the scaling and availability. Using a connector in this way effectively hides the complexities of the cluster size and configuration, allowing your cluster to grow and shrink without interrupting your client application connectivity. Client applications remain connected even though the number, configuration and orientation of the Replicas within the cluster may change.
During failover or system maintenance Tungsten Connector takes information from Tungsten Manager to determine which hosts are up and available, and redirects queries only to those servers that are online within the cluster.
For load balancing, Tungsten Connector supports a number of different solutions for redirecting queries to the different datasources within the network. Solutions are either based on explicit routing, or an implied or automatic read/write splitting mode where data is automatically distributed between Primary hosts (writes) and Replica hosts (reads).
Basic read/write splitting uses packet inspection to determine whether a query is a read operation (SELECT) or a write (INSERT, UPDATE, DELETE). The actual selection mechanism can be fine-tuned using the different modes according to your application requirements.
The supported modes are:
Port Based Routing
Port based routing employs a second port on the connector host. All connections to this port are sent to an available Replica.
Direct Reads
Direct reads uses the read/write splitting model, but directs read queries to dedicated read-only connections on the Replica. No attempt is made to determine which host may have the most up to date version of the data. Connections are pooled between the connector and datasources, and this results in very fast execution.
SmartScale
With SmartScale, data is automatically distributed among the datasources using read/write splitting. Where possible, the connector selects read queries by determining how up to date the Replica is, and using a specific session model to determine which host is up to date according to the session and replication status information. Session identification can be through predefined session types or user-defined session strings.
Host Based Routing
Explicit host based routing uses different IP addresses on datasources to identify whether the operation should be directed to a Primary or a Replica. Each connector is configured with two IP addresses, connecting to one IP address triggers the connection to be routed to the current Primary, while connecting to the second IP routes queries to a Replica.
SQL Based Routing
SQL based routing employs packet inspection to identify key strings within the query to determine where the packets should be routed.
These core read/write splitting modes can also be explicitly overridden at a user or host level to allow your application maximum flexibility.
Creating a Tungsten Clustering (for MySQL) Dataservice using Tungsten Cluster requires careful preparation and configuration of the required components. This section provides guidance on these core operations, along with preparation information such as licensing and best practices that apply to all installations.
Before covering the basics of creating different dataservice types, there are some key terms that will be used throughout the setup and installation process that identify different components of the system. These are summarised in Table 2.1, “Key Terminology”.
Table 2.1. Key Terminology
Tungsten Term | Traditional Term | Description |
---|---|---|
composite dataservice | Multi-Site Cluster | A configured Tungsten Cluster service consisting of multiple dataservices, typically at different physical locations. |
dataservice | Cluster | A configured Tungsten Cluster service consisting of dataservers, datasources and connectors. |
dataserver | Database | The database on a host. Datasources include MySQL, PostgreSQL or Oracle. |
datasource | Host or Node | One member of a dataservice and the associated Tungsten components. |
staging host | - | The machine from which Tungsten Cluster is installed and configured. The machine does not need to be the same as any of the existing hosts in the cluster. |
staging directory | - | The directory where the installation files are located and the installer is executed. Further configuration and updates must be performed from this directory. |
connector | - | A connector is a routing service that provides management for connectivity between application services and the underlying dataserver. |
witness host | - | A witness host is a host that can be contacted using the ping protocol to act as a network check for the other nodes of the cluster. Witness hosts should be on the same network and segment as the other nodes in the dataservice. |
The manager plays a key role within any dataservice, communicating between the replicator, connector and datasources to understand the current status, and controlling these components to handle failures, maintenance, and service availability.
The primary role of the manager is to monitor each of the services, identify problems, and react to those problems in the most effective way to keep the dataservice active. For example, in the case of a datasource failure, the datasource is temporarily removed from the cluster, the connector is updated to route queries to another available datasource, and the replication is disabled.
These decisions are driven by a rule-based system, which checks current status values, and performs different operations to achieve the correct result and return the dataservice to operational status.
In terms of control and management, the manager is capable of performing backup and restore operations, automatically recovering from failure (including re-provisioning from backups), and individually controlling configuration, service startup and shutdown, and the overall control of the system.
Within a typical Tungsten Cluster deployment there are multiple managers and these keep in constant contact with each other, and the other services. When a failure occurs, multiple managers are involved in decisions. For example, if a host is no longer visible to one manager, it does not make the decision to disable the service on its own; only when a majority of managers identify the same result is the decision made. For this reason, there should be an odd number of managers (to prevent deadlock), or managers can be augmented through the use of witness hosts.
One manager is automatically installed for each configured datasource; that is, in a three-node system with a Primary and two Replicas, three managers will be installed.
Checks to determine the availability of hosts are performed by using either the system ping protocol or the Echo TCP/IP protocol on port 7 to determine whether a host is available. The configuration of the protocol to be used can be made by adjusting the manager properties. For more information, see Section B.2.3.3, “Host Availability Checks”.
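A quick manual check of both mechanisms between two nodes can be made with standard operating system tools. This is a sketch only; the hostname is a placeholder, and the TCP Echo service on port 7 is disabled by default on many distributions:

# Check basic ICMP (ping) reachability of another cluster member
shell> ping -c 3 db2
# Check whether the TCP Echo service on port 7 responds
shell> nc -vz db2 7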
Connectors (known as routers within the dataservice) provide a routing mechanism between client applications and the dataservice. The Tungsten Connector component automatically routes database operations to the Primary or Replica, and takes account of the current cluster status as communicated to it by the Tungsten Manager. This functionality solves three primary issues that might normally need to be handled by the client application layer:
Datasource role redirection (i.e. Primary and Replica). This includes read/write splitting, and the ability to read data from a Replica that is up to date with a corresponding write.
Datasource failure (high-availability), including the ability to redirect client requests in the event of a failure or failover. This includes maintenance operations.
Dataservice topology changes, for example when expanding the number of datasources within a dataservice.
The primary role of the connector is to act as the connection point for applications that can remain open and active, while simultaneously supporting connectivity to the datasources. This allows for changes to the topology and active role of individual datasources without interrupting the client application. Because the operation is through one or more static connectors, the application also does not need to be modified or changed when the number of datasources is expanded or altered.
Depending on the deployment environment and client application requirements, the connector can be installed either on the client application servers, the database servers, or independent hosts. For more information, see Section 7.3, “Clients and Deployment”.
Connectors can also be installed independently on specific hosts. The list of enabled connectors is defined by the --connectors option to tpm. A Tungsten Cluster dataservice can be installed with more connector servers than datasources or managers.
Tungsten Replicator provides the core replication of information between datasources and, in composite deployment, between dataservices. The replicator operates by extracting data from the 'Primary' datasource (for example, using the MySQL binary log), and then applies the data to one or more target datasources.
Different deployments use different replicators and configurations, but in a typical Tungsten Cluster deployment a Primary/Replica or active/active deployment model is used. For Tungsten Cluster deployments there will be one replicator instance installed on each datasource host.
Within the dataservice, the manager controls each replicator service and is able to alter the replicator operation and role, for example by switching between Primary and Replica roles. The replicator also provides information to the manager about the latency of the replication operation, and uses this with the connectors to control client connectivity into the dataservice.
Replication within Tungsten Cluster is supported by Tungsten Replicator™, which supports a wide range of additional deployment topologies, and heterogeneous deployments including MongoDB, Vertica, and Oracle. Replication to and from a dataservice is supported. For more information on replicating out of an existing dataservice, see:
Replicators are automatically configured according to the datasources and topology specified when the dataservice is created.
Tungsten Cluster operates through the rules built into the manager that make decisions about different configuration and status settings for all the services within the cluster. In the event of a communication failure within the system it is vital for the manager, in automatic policy mode, to perform a switch from a failed or unavailable Primary.
Within the network, the managers communicate with each other, in addition to the connectors and dataservers to determine their availability. The managers compare states and network connectivity. In the event of an issue, managers 'vote' on whether a failover or switch should occur.
The rules are designed to prevent unnecessary switches and failovers. Managers vote, and an odd number of managers helps to prevent split-brain scenarios in which invalid failover decisions could be made.
Active Witness — an active witness is an instance of Tungsten Manager running on a host that is otherwise not part of the dataservice. An active witness has full voting rights within the managers and can therefore make informed decisions about the dataservice state in the event of a failure. Active witnesses can only be a member of one cluster at a time.
All managers are active witnesses, and active witnesses are the recommended solution for deployments where network availability is less certain (i.e. cloud environments), and where you have two-node deployments.
Tungsten Cluster Quorum Requirements
There should be at least three managers (including any active witnesses)
There should be, in total, an odd number of managers and witnesses, to prevent deadlocks.
If the dataservice contains only two hosts, at least one active witness must be installed.
These rules apply for all Tungsten Cluster installations and must be adhered to. Deployment will fail if these conditions are not met.
The rules for witness selection are as follows:
Active witnesses can be located beyond or across network segments, but all active witnesses must have a clear communication channel to each other and to the other managers. Difficulties in contacting other managers and services in the network could cause unwanted failover or shunning of datasources.
To enable active witnesses, the --enable-active-witnesses=true option must be specified and the hosts that will act as active witnesses must be added to the list of hosts provided to --members. This enables all specified witnesses to be enabled as active witnesses:

shell> ./tools/tpm install alpha --enable-active-witnesses=true \
    --witnesses=hostC \
    --members=hostA,hostB,hostC
...
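The equivalent configuration expressed using the INI install method would look roughly like the following fragment. This is a sketch only; the service name and host names mirror the staging example above, and the remaining options required for the service are omitted:

# /etc/tungsten/tungsten.ini (fragment)
[alpha]
enable-active-witnesses=true
members=hostA,hostB,hostC
witnesses=hostC
...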
The following operating systems are supported for installation of the various Tungsten components and are part of our regular QA testing processes. Other variants of Linux may work at your own risk, but their use in production should be avoided, and any issues that arise may not be supported; if in doubt, we recommend that you contact Continuent Support for clarification. Windows and macOS are NOT supported; however, appropriate virtual environments running any of the supported distributions listed are supported, although only recommended for Development/Testing environments.
The list below also includes EOL dates published by the providers and should be taken into consideration when configuring your deployment.
Table 2.2. Tungsten OS Support
Distribution | Published EOL | Notes |
---|---|---|
Amazon Linux 2 | 30th June 2024 | |
Amazon Linux 2023 | | |
CentOS 7 | 30th June 2024 | |
Debian GNU/Linux 10 (Buster) | June 2024 | |
Debian GNU/Linux 11 (Bullseye) | June 2026 | |
Debian 12 | June 2026 | |
Oracle Linux 8.4 | July 2029 | |
Oracle Linux 9 | July 2029 | Version 7.1.3 onwards |
RHEL 7 | 30th June 2024 | |
RHEL 8.4.0 | 31st May 2029 | |
RHEL 9 | | |
Rocky Linux 8 | 31st May 2029 | |
Rocky Linux 9 | 31st May 2032 | |
SUSE Linux Enterprise Server 15 | 21st June 2028 | |
Ubuntu 20.04 LTS (Focal Fossa) | April 2025 | |
Ubuntu 22.04 LTS (Jammy Jellyfish) | April 2027 | |
Unless stated, MySQL refers to the following variants:
Table 2.3. MySQL/Tungsten Version Support
Database | MySQL Version | Tungsten Version | Notes |
---|---|---|---|
MySQL | 5.7 | All non-EOL Versions | Full Support |
MySQL | 8.0.0-8.0.34 | 6.1.0-6.1.3 | Supported, but does not support Partitioned Tables or the use of binlog-transaction-compression=ON introduced in 8.0.20 |
MySQL | 8.0.0-8.0.34 | 6.1.4 onwards | Fully Supported |
MariaDB | 10.0, 10.1 | All non-EOL Versions | Full Support |
MariaDB | 10.2, 10.3 | 6.1.13-6.1.20 | Partial Support. See note below. |
MariaDB | Up to, and including, 10.11 | 7.x | Full Support |
Known Issue affecting use of MySQL 8.0.21
In MySQL release 8.0.21 the behavior of CREATE TABLE ... AS SELECT ... has changed, resulting in the transactions being logged differently in the binary log. This change in behavior will cause the replicators to fail. Until a fix is implemented within the replicator, the workaround is to split the action into a separate CREATE TABLE ... statement followed by an INSERT INTO ... SELECT FROM ... statement, as illustrated in the example below. If this is not possible, then you will need to manually create the table on all nodes, and then skip the resulting error in the replicator, allowing the subsequent loading of the data to continue.
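As an illustration of the workaround, a hypothetical statement such as CREATE TABLE sales_2021 AS SELECT * FROM sales WHERE year = 2021 could be rewritten as two separate statements; the table and column names here are illustrative only, and using LIKE is just one way of creating the target structure:

mysql> CREATE TABLE sales_2021 LIKE sales;
mysql> INSERT INTO sales_2021 SELECT * FROM sales WHERE year = 2021;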
MariaDB 10.3+ Support
Full support for MariaDB version 10.3 has been certified in v7.0.0 onwards of the Tungsten products.
Version 6.1.13 onwards of Tungsten will also work, however should you choose to deploy these versions, you do so at your own risk. There are a number of issues noted below that are all resolved from v7.0.0 onwards, therefore if you choose to use an earlier release, you should do so with the following limitations acknowledged:
tungsten_find_orphaned may fail, introducing the risk of data loss (Fixed in v6.1.13 onwards)
SSL from Tungsten Components TO the MariaDB is not supported.
Geometry data type is not supported.
Tungsten backup tools will fail as they rely on xtrabackup, which will not work with newer releases of MariaDB.
tpm might fail to find correct mysql configuration file. (Fixed in 6.1.13 onwards)
MariaDB specific event types trigger lots of warnings in the replicator log file.
In 2023, Oracle announced a new MySQL version schema that introduced "Innovation" releases. From this point on, patch releases would only contain bug fixes (these being, for example, the various 8.0.x releases), whereas new features would only be introduced in the "Innovation" releases, such as 8.1, 8.2, etc. (along with bug fixes).

"Innovation" releases will be released quarterly, and Oracle aims to make an LTS release every two years which will bundle all of the new features, behavior changes and bug fixes from all the previous "Innovation" releases.

Oracle does not advise the use of the "Innovation" releases in a production environment where a known behavior is expected to ensure system stability. We have chosen to follow this advice and as such we do not certify any release of Tungsten against "Innovation" releases for use in Production. We will naturally test against these releases in our QA environment so that we are able to certify and support the LTS release as soon as is practical. Any modifications needed to support an LTS release will not be backported to older Tungsten releases.

For more information on Oracle's release policy, please read their blog post here
RAM requirements are dependent on the workload being used and applied, but the following provide some guidance on the basic RAM requirements:
Tungsten Replicator requires 2GB of VM space for the Java execution, including the shared libraries, with approximately 1GB of Java VM heapspace. This can be adjusted as required (see the sketch after this list), for example, to handle larger transactions or bigger commit blocks and large packets.
Performance can be improved within the Tungsten Replicator if there is 2-3GB available in the OS Page Cache. Replicators work best when pages written to replicator log files remain memory-resident for a period of time, so that there is no file system I/O required to read that data back within the replicator. This is the biggest potential point of contention between replicators and DBMS servers.
Tungsten Manager requires approximately 500MB of VM space for execution.
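If the replicator heap does need adjusting, this is normally done through the tpm configuration rather than by editing files by hand. The following is a minimal sketch, assuming the INI install method and the repl-java-mem-size option (value in MB); verify the option name against the tpm reference for your release:

# /etc/tungsten/tungsten.ini (fragment)
[defaults]
# Increase the replicator Java heap to 4GB (value is in MB)
repl-java-mem-size=4096

After changing the value, run tpm update (or tools/tpm update from the staging directory) so that the new heap size takes effect.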
Disk space usage is based on the space used by the core application, the staging directory used for installation, and the space used for the THL files:
The staging directory containing the core installation is approximately 150MB. When performing a staging-directory based installation, this space requirement will be used once. When using an INI-file based deployment, this space will be required on each server. For more information on the different methods, see Section 10.1, “Comparing Staging and INI tpm Methods”.
Deployment of a live installation also requires approximately 150MB.
The THL files required for installation are based on the size of the binary logs generated by MySQL. THL size is typically twice the size of the binary log. This space will be required on each machine in the cluster. The retention times and rotation of THL data can be controlled; see Section D.1.5, “The thl Directory” for more information, including how to change the retention time and move files during operation.
A dedicated partition for THL and/or Tungsten Software is recommended to ensure that a full disk does not impact your OS or DBMS. Local disk, SAN, iSCSI and AWS EBS are suitable for storing THL. NFS is NOT recommended.
Because the replicator reads and writes information using buffered I/O in a serial fashion, there is no random-access or seeking.
All components of Tungsten are certified with Java using the following versions:
Oracle JRE 8
Oracle JRE 11 (From release 6.1.2 only)
OpenJDK 8
OpenJDK 11 (From release 6.1.2 only)
Java 9, 10 and 13 have been tested and validated but certification and support will only cover Long Term releases.
There are a number of known issues in earlier Java revisions that may cause performance degradation, high CPU, and/or component hangs, specifically when SSL is enabled. It is strongly advised that you ensure your Java version is one of the following MINIMUM releases:
All versions from 8u265, excluding version 13 onwards, contain a bug that can trigger unusually high CPU and/or system timeouts and hangs within the SSL protocol. To avoid this, you should add the following entry to the wrapper.conf file for all relevant components. This will be included by default from version 6.1.15 onwards of all Tungsten products.

wrapper.conf can be found in the following path: {INSTALLDIR}/tungsten/tungsten-component/conf, for example: /opt/continuent/tungsten/tungsten-manager/conf

wrapper.java.additional.{next available number}=-Djdk.tls.acknowledgeCloseNotify=true

For example:

wrapper.java.additional.16=-Djdk.tls.acknowledgeCloseNotify=true
After editing the file, each component will need to be restarted.
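A sketch of that restart step, assuming the Tungsten environment (env.sh) has been sourced so the component control scripts are in the PATH; only restart the components actually installed on each host:

shell> replicator restart
shell> manager restart
shell> connector restart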
If your original installation was performed with Java 8 installed, and you wish to upgrade to Java 11, you will need to issue tools/tpm update --replace-release on all nodes from within the software staging path.
This is to allow the components to detect the newer Java version and adjust to avoid calls to functions that were deprecated/renamed between version 8 and version 11.
Cloud deployments require a different set of considerations over and above the general requirements. The following is a guide only, and where specific cloud environment requirements are known, they are explicitly included:
Instance Types/Configuration
Attribute | Guidance | Amazon Example |
---|---|---|
Instance Type | Instance sizes and types are dependent on the workload, but larger instances are recommended for transactional databases. | m4.xlarge or better |
Instance Boot Volume | Use block, not ephemeral storage. | EBS |
Instance Deployment | Use standard Linux distributions and bases. For ease of deployment and configuration, the use of Ansible, Puppet or other script-based solutions could be used. | Amazon Linux AMIs |
Development/QA nodes should always match the expected production environment.
AWS/EC2 Deployments
Use Virtual Private Cloud (VPC) deployments, as these provide consistent IP address support.
When using Active Witnesses, a micro instance can be used for a single cluster. For composite clusters, an instance size larger than micro must be used.
Multiple EBS-optimized volumes for data, using Provisioned IOPS for the EBS volumes depending on workload:
Parameter | tpm Option | tpm Value | MySQL my.cnf Option | MySQL Value |
---|---|---|---|---|
/ (root) | | | | |
MySQL Data | datasource-mysql-data-directory | /volumes/mysql/data | datadir | /volumes/mysql/data |
MySQL Binary Logs | datasource-log-directory | /volumes/mysql/binlogs | log-bin | /volumes/mysql/binlogs/mysql-bin |
Transaction History Logs (THL) | thl-directory | /volumes/mysql/thl | | |
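A sketch of how those locations might be expressed, assuming the INI install method; the paths simply mirror the table above and should be adapted to your own volume layout:

# /etc/tungsten/tungsten.ini (fragment)
[defaults]
datasource-mysql-data-directory=/volumes/mysql/data
datasource-log-directory=/volumes/mysql/binlogs
thl-directory=/volumes/mysql/thl

# Matching my.cnf (fragment)
[mysqld]
datadir=/volumes/mysql/data
log-bin=/volumes/mysql/binlogs/mysql-bin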
Recommended Replication Formats

MIXED is recommended for MySQL Primary/Replica topologies (e.g., either single clusters or primary/data-recovery setups).

ROW is strongly recommended for Composite Active/Active setups. Without ROW, data drift is a possible problem when using MIXED or STATEMENT. Even with ROW there are still cases where drift is possible, but the window is far smaller.
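For reference, the binary log format is set in my.cnf on each MySQL server; a minimal sketch for a Composite Active/Active member:

# my.cnf (fragment)
[mysqld]
# ROW is strongly recommended for Composite Active/Active topologies
binlog_format=ROW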
Continuent has traditionally had a relaxed policy about Linux platform support for customers using our products.
While it is possible to install and run Continuent Tungsten products (i.e. Clustering/Replicator/etc.) inside Docker containers, there are many reasons why this is not a good idea.
As background, every database node in a Tungsten Cluster runs at least three (3) layers or services:
MySQL Server (i.e. MySQL Community or Enterprise, MariaDB or Percona Server)
Tungsten Manager, which handles health-checking, signaling and failover decisions (Java-based)
Tungsten Replicator, which handles the movement of events from the MySQL Primary server binary logs to the Replica databases nodes (Java-based)
Optionally, a fourth service, the Tungsten Connector (Java-based), may be installed as well, and often is.
As such, this means that the Docker container would also need to support these 3 or 4 layers and all the resources needed to run them.
This is not what containers were designed to do. In a proper containerized architecture, each container would contain one single layer of the operation, so there would be 3-4 containers per “node”. This sort of architecture is best managed by some underlying technology like Swarm, Kubernetes, or Mesos.
More reasons to avoid using Docker containers with Continuent Tungsten solutions:
Our product is designed to run on a full Linux OS. By design Docker does not have a full init system like SystemD, SysV init, Upstart, etc. This means that if we have a process (Replicator, Manager, Connector, etc.) that process will run as PID 1. If this process dies the container will die. There are some solutions that let a Docker container have a 'full init' system so the container can start more processes like ssh, replicator, manager, etc. all at once. However this is almost a heavyweight VM kind of behavior, and Docker wasn't designed this way.
Requires a mutable container – to use Tungsten Clustering inside a Docker container, the Docker container must be launched as a mutable Linux instance, which is not the classic, nor proper way to use containers.
Our services are not designed as “serverless”. Serverless containers are totally stateless. Tungsten Cluster and Tungsten Replicator do not support this type of operation.
Until we make the necessary changes to our software, using Docker as a cluster node results in a minimum 1.2GB docker image.
Once Tungsten Cluster and Tungsten Replicator have been refactored using a microservices-based architecture, it will be much easier to scale our solution using containers.
A Docker container would need to allow for updates in order for the Tungsten Cluster and Tungsten Replicator software to be re-configured as needed. Otherwise, a new Docker container would need to be launched every time a config change was required.
There are known I/O and resource constraints for Docker containers, and containers must therefore be carefully deployed to avoid those pitfalls.
We test on CentOS-derived Linux platforms.
Tungsten Cluster is available in a number of different distribution types, and the methods for configuration available for these different packages differ. See Section 10.1, “Comparing Staging and INI tpm Methods” for more information on the available installation methods.
Deployment Type/Package | TAR/GZip | RPM |
---|---|---|
Staging Installation | Yes | No |
INI File Configuration | Yes | Yes |
Deploy Entire Cluster | Yes | No |
Deploy Per Machine | Yes | Yes |
Two primary deployment sources are available:
Using the TAR/GZip package creates a local directory that enables you to perform installs and updates from the extracted 'staging' directory, or use the INI file format.
Using the RPM package format is more suited to using the INI file format, as hosts can be installed and upgraded to the latest RPM package independently of each other.
All packages are named according to the product, version number, build release and extension. For example:
tungsten-clustering-7.1.4-10.tar.gz
The version number is 7.1.4 and the build number is 10. Build numbers indicate which build a particular release version is based on, and may be useful when installing patches provided by support.
To use the TAR/GZipped packages, download the files to your machine and unpack them:
shell> cd /opt/continuent/software
shell> tar zxf tungsten-clustering-7.1.4-10.tar.gz
This will create a directory matching the downloaded package name, version, and build number from which you can perform an install using either the INI file or command-line configuration. To use it, you will need to use the tpm command within the tools directory of the extracted package:

shell> cd tungsten-clustering-7.1.4-10
The RPM packages can be used for installation, but are primarily designed to be in combination with the INI configuration file.
Installation
Installing the RPM package will do the following:
Create the tungsten system user if it doesn't exist

Make the tungsten system user part of the mysql group if it exists

Create the /opt/continuent/software directory

Unpack the software into /opt/continuent/software

Define the $CONTINUENT_PROFILES and $REPLICATOR_PROFILES environment variables

Update the profile script to include the /opt/continuent/share/env.sh script

Create the /etc/tungsten directory

Run tpm install if the /etc/tungsten.ini or /etc/tungsten/tungsten.ini file exists
Although the RPM packages complete a number of the pre-requisite steps required to configure your cluster, there are additional steps, such as configuring ssh, that you still need to complete. For more information, see Appendix B, Prerequisites.
By using the package files you are able to set up a new server by creating the /etc/tungsten.ini file and then installing the package. Any output from the tpm command will go to /opt/continuent/service_logs/rpm.output.
If you download the package files directly, you may need to add the signing key to your environment before the package will load properly.
For yum platforms (RHEL/CentOS/Amazon Linux), the rpm command is used :
root-shell> rpm --import http://www.continuent.com/RPM-GPG-KEY-continuent
For Ubuntu/Debian platforms, the gpg command is used :
root-shell> gpg --keyserver keyserver.ubuntu.com --recv-key 7206c924
Upgrades
If you upgrade to a new version of the RPM package it will do the following:
Unpack the software into /opt/continuent/software

Run tpm update if the /etc/tungsten.ini or /etc/tungsten/tungsten.ini file exists
The tpm update will restart all Continuent Tungsten services so you do not need to do anything after upgrading the package file.
There are a variety of tpm options that can be used to alter some aspect of the deployment during configuration. Although they might not be provided within the example deployments, they may be used or required for different installation environments. These include options such as altering the ports used by different components, or the commands and utilities used to monitor or manage the installation once deployment has been completed. Some of the most common options are included within this section.
Changes to the configuration should be made with tpm update. This continues the procedure of using tpm install during installation. See Section 10.5.27, “tpm update Command” for more information on using tpm update.
--datasource-systemctl-service
On some platforms and environments the command used to manage and control the MySQL or MariaDB service is handled by a tool other than the services or /etc/init.d/mysql commands.
Depending on the system or environment, other commands using the same basic structure may be used. For example, within CentOS 7, the command is systemctl. You can explicitly set the command to be used by using the --datasource-systemctl-service option to specify the name of the tool.
The format of the corresponding command that will be used is expected to follow the same format as previous commands, for example, to stop the database service:

shell> systemctl mysql stop
Different commands must follow the same basic structure: the command configured by --datasource-systemctl-service, the service name, and the status (i.e. stop).
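A sketch of setting this option, assuming the INI install method (it can equally be passed on the tpm command line during configuration):

# /etc/tungsten/tungsten.ini (fragment)
[defaults]
# Use systemctl rather than the default service/init.d style command
datasource-systemctl-service=systemctl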
A successful deployment depends on being mindful during deployment, operations and ongoing maintenance.
Identify the best deployment method for your environment and use that in production and testing. See Section 10.1, “Comparing Staging and INI tpm Methods”.
Standardize the OS and database prerequisites. There are Ansible modules available for immediate use within AWS, or as a template for modifications.
More information on the Ansible method is available in this blog article.
Ensure that the output of the `hostname` command and the nodename entries in the Tungsten configuration match exactly prior to installing Tungsten.
The configuration keys that define nodenames are: --slaves, --dataservice-slaves, --members, --master, --dataservice-master-host, --masters and --relay. A quick check is sketched in the example below.
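A sketch of that pre-install check, assuming the INI install method; the hostnames shown are placeholders only:

# The OS hostname...
shell> hostname
db1
# ...should exactly match the node names used in the Tungsten configuration
shell> grep -E 'members|master|slaves|relay' /etc/tungsten/tungsten.ini
members=db1,db2,db3
master=db1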
For security purposes you should ensure that you secure the following areas of your deployment:
Ensure that you create a unique installation and deployment user, such as tungsten, and set the correct file permissions on installed directories. See Section B.2.4, “Directory Locations and Configuration”.
When using ssh and/or SSL, ensure that the ssh key or certificates are suitably protected. See Section B.2.3.2, “SSH Configuration”.
Use a firewall, such as iptables to protect the network ports that you need to use. The best solution is to ensure that only known hosts can connect to the required ports for Tungsten Cluster. For more information on the network ports required for Tungsten Cluster operation, see Section B.2.3.1, “Network Ports”.
If possible, use authentication and SSL connectivity between hosts to protect your data and authorisation for the tools used in your deployment.
See Chapter 5, Deployment: Security for more information.
Choose your topology from the deployment section and verify the configuration matches the basic settings. Additional settings may be included for custom features but the basics are needed to ensure proper operation. If your configuration is not listed or does not match our documented settings, we cannot guarantee correct operation.
If there are an even number of database servers in the cluster, configure the cluster with a witness host. An active witness is preferred but a passive one will ensure stability. See Section 2.1.4, “Active Witness Hosts” for an explanation of the differences and how to configure them.
If you are using ROW replication, any triggers that run additional INSERT/UPDATE/DELETE operations must be updated so they do not run on the Replica servers; one possible approach is sketched after this list.
Make sure you know the structure of the Tungsten Cluster home directory and how to initialize your environment for administration. See Section 6.1, “The Home Directory” and Section 6.2, “Establishing the Shell Environment”.
Prior to migrating applications to Tungsten Cluster, test failover and recovery procedures from Chapter 6, Operations Guide. Be sure to try recovering a failed Primary and reprovisioning failed Replicas.
When deciding on the Service Name for your configurations, keep them simple and short and only use alphanumerics (Aa-Zz,0-9) and underscores (_).
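As referenced in the ROW replication item above, one possible approach is to guard the trigger body so that it only runs where the server is writable, since Replicas in a cluster are normally kept read-only. This is a sketch only; the trigger, table, and column names are hypothetical, and the approach should be validated against your own workload:

-- Run inside the mysql client on the Primary
DELIMITER //
CREATE TRIGGER orders_audit AFTER INSERT ON orders
FOR EACH ROW
BEGIN
  -- Only execute the trigger body on a writable (Primary) server;
  -- Replica servers are expected to have read_only enabled
  IF @@global.read_only = 0 THEN
    INSERT INTO orders_log (order_id, created_at) VALUES (NEW.id, NOW());
  END IF;
END//
DELIMITER ;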
In this section we identify the best practices for performing a Tungsten Software upgrade.
Identify the deployment method chosen for your environment, Staging or INI. See Section 10.1, “Comparing Staging and INI tpm Methods”.
The best practice for Tungsten software is to upgrade All-at-Once, performing zero Primary switches.
The Staging deployment method automatically does an All-at-Once upgrade - this is the basic design of the Staging method.
For an INI upgrade, there are two possible ways, One-at-a-Time (with at least one Primary switch), and All-at-Once (no switches at all).
See Section 10.4.3, “Upgrades with an INI File” for more information.
Here is the sequence of events for a proper Tungsten upgrade on a 3-node cluster with the INI deployment method:
Login to the Customer Downloads Portal and get the latest version of the software.
Copy the file (i.e. tungsten-clustering-7.0.2-161.tar.gz) to each host that runs a Tungsten component.
Set the cluster to policy MAINTENANCE
On every host:
Extract the tarball under /opt/continuent/software/ (i.e. create /opt/continuent/software/tungsten-clustering-7.0.2-161)
cd to the newly extracted directory
Run the Tungsten Package Manager tool, tools/tpm update --replace-release
For example, here are the steps in order:
On ONE database node:
shell> cctrl
cctrl> set policy maintenance
cctrl> exit

On EVERY Tungsten host at the same time:
shell> cd /opt/continuent/software
shell> tar xvzf tungsten-clustering-7.0.2-161.tar.gz
shell> cd tungsten-clustering-7.0.2-161

To perform the upgrade and restart the Connectors gracefully at the same time:
shell> tools/tpm update --replace-release

To perform the upgrade and delay the restart of the Connectors to a later time:
shell> tools/tpm update --replace-release --no-connectors

When it is time for the Connector to be promoted to the new version, perhaps after taking it out of the load balancer:
shell> tpm promote-connector

When all nodes are done, on ONE database node:
shell> cctrl
cctrl> set policy automatic
cctrl> exit
WHY is it ok to upgrade and restart everything all at once?
Let’s look at each component to examine what happens during the upgrade, starting with the Manager layer.
Once the cluster is in Maintenance mode, the Managers cease to make changes to the cluster, and therefore Connectors will not reroute traffic either.
Since Manager control of the cluster is passive in Maintenance mode, it is safe to stop and restart all Managers - there will be zero impact to the cluster operations.
The Replicators function independently of client MySQL requests (which come through the Connectors and go to the MySQL database server), so even if the Replicators are stopped and restarted, there should be only a small window of delay while the replicas catch up with the Primary once upgraded. If the Connectors are reading from the Replicas, they may briefly get stale data if not using SmartScale.
Finally, when the Connectors are upgraded they must be restarted so the new version can take over. As discussed in this blog post, Zero-Downtime Upgrades, the Tungsten Cluster software upgrade process will do two key things to help keep traffic flowing during the Connector upgrade promote step:
Execute `connector graceful-stop 30` to gracefully drain existing connections and prevent new connections.
Using the new software version, initiate the start/retry feature which launches a new connector process while another one is still bound to the server socket. The new Connector process will wait for the socket to become available by retrying binding every 200ms by default (which is tunable), drastically reducing the window for application connection failures.
Setup proper monitoring for all servers as described in Section 6.17, “Monitoring Tungsten Cluster”.
Configure the Tungsten Cluster services to startup and shutdown along with the server. See Section 4.4, “Configuring Startup on Boot”.
Schedule the cluster_backup tool (see Section 9.8, “The cluster_backup Command”) to run on each database server at least once per night. The script will take a backup of at least one server. Skip this step if you have another scheduled backup method that takes consistent snapshots of your servers.
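For example (a sketch only; the path below assumes the default /opt/continuent installation and an arbitrary 02:00 schedule, so adjust both to your environment), a crontab entry on each database server could be:
shell> crontab -e
00 02 * * * /opt/continuent/tungsten/cluster-home/bin/cluster_backup >> /opt/continuent/service_logs/cluster_backup.log 2>&1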
Your license allows for a testing cluster. Deploy a cluster that matches your production cluster and test all operational and maintenance procedures there.
Schedule regular tests for local and DR failover. This should at least include switching the Primary server to another host in the local cluster. If possible, the DR cluster should be tested once per quarter.
Disable any automatic operating system patching processes. The use of automatic patching will cause issues when all database servers automatically restart without coordination. See Section 6.15.3, “Performing Maintenance on an Entire Dataservice”.
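As an illustrative sketch only (service and package names vary by distribution and version; these are assumptions), automatic patching can be disabled so that restarts are instead performed manually and in a coordinated fashion:
On Debian/Ubuntu:
shell> sudo systemctl disable --now unattended-upgrades
On RedHat/CentOS:
shell> sudo systemctl disable --now dnf-automatic.timer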
Regularly check for maintenance releases and upgrade your environment. Every version includes stability and usability fixes to ease the administrative process.
Table of Contents
A Tungsten Clustering (for MySQL) dataservice created using Tungsten Cluster combines a number of different components, systems, and functions to support a running database dataservice that is capable of handling database failures, complex replication topologies, and management of the client/database connection for both load balancing and failover scenarios.
How you choose to deploy depends on your requirements and environment. All deployments operate through the tpm command. tpm operates in two different modes:
tpm staging configuration — a tpm configuration is created by defining the command-line arguments that define the deployment type, structure and any additional parameters. tpm then installs all the software on all the required hosts by using ssh to distribute Tungsten Cluster and the configuration, and optionally automatically starts the services on each host. tpm manages the entire deployment, configuration and upgrade procedure.
tpm INI configuration — tpm uses an INI file to configure the service on the local host. The INI file must be created on each host that will be part of the cluster. tpm only manages the services on the local host; in a multi-host deployment, upgrades, updates, and configuration must be handled separately on each host.
The following sections provide guidance and instructions for creating a number of different deployment scenarios using Tungsten Cluster.
Within a Primary/Replica service, there is a single Primary which replicates data to the Replicas. The Tungsten Connector handles connectivity by the application and distributes the load to the datasources in the dataservice.
Before continuing with deployment you will need the following:
The name to use for the cluster.
The list of datasources in the cluster. These are the servers which will be running MySQL.
The list of servers that will run the connector.
The username and password of the MySQL replication user.
The username and password of the first application user. You may add more users after installation.
All servers must be prepared with the proper prerequisites. See Appendix B, Prerequisites for additional details.
Install the Tungsten Cluster package or download the Tungsten Cluster tarball, and unpack it:
shell> cd /opt/continuent/software
shell> tar zxf tungsten-clustering-7.1.4-10.tar.gz
Change to the Tungsten Cluster directory:
shell> cd tungsten-clustering-7.1.4-10
Run tpm to perform the installation, using either the Staging method or the INI method. Review Section 10.1, “Comparing Staging and INI tpm Methods” for more details on these two methods.
shell> ./tools/tpm configure defaults \
    --reset \
    --user=tungsten \
    --install-directory=/opt/continuent \
    --profile-script=~/.bash_profile \
    --replication-user=tungsten \
    --replication-password=password \
    --replication-port=13306 \
    --application-user=app_user \
    --application-password=secret \
    --application-port=3306 \
    --rest-api-admin-user=apiuser \
    --rest-api-admin-pass=secret
shell> ./tools/tpm configure alpha \
    --topology=clustered \
    --master=host1 \
    --members=host1,host2,host3 \
    --connectors=host4
shell> vi /etc/tungsten/tungsten.ini
[defaults]
user=tungsten
install-directory=/opt/continuent
profile-script=~/.bash_profile
replication-user=tungsten
replication-password=password
replication-port=13306
application-user=app_user
application-password=secret
application-port=3306
rest-api-admin-user=apiuser
rest-api-admin-pass=secret
[alpha]
topology=clustered
master=host1
members=host1,host2,host3
connectors=host4
Configuration group defaults
The description of each of the options is shown below:
For staging configurations, deletes all pre-existing configuration information before applying the new configuration values.
System User
--install-directory=/opt/continuent
install-directory=/opt/continuent
Path to the directory where the active deployment will be installed. The configured directory will contain the software, THL and relay log information unless configured otherwise.
--profile-script=~/.bash_profile
profile-script=~/.bash_profile
Append commands to include env.sh in this profile script
For databases that require authentication, the username to use when connecting to the database using the corresponding connection method (native, JDBC, etc.).
--replication-password=password
The password to be used when connecting to the database using the corresponding --replication-user.
The network port used to connect to the database server. The default port used depends on the database being configured.
Database username for the connector
Database password for the connector
Port for the connector to listen on
Configuration group alpha
The description of each of the options is shown below:
Replication topology for the dataservice.
The hostname of the primary (extractor) within the current service.
Hostnames for the dataservice members
Hostnames for the dataservice connectors
If you plan to make full use of the REST API (which is enabled by default) you will need to also configure a username and password for API Access. This must be done by specifying the following options in your configuration:
rest-api-admin-user=tungsten rest-api-admin-pass=secret
For more information on using and configuring the REST API, see Section 11.1, “Getting Started with Tungsten REST API”
Run tpm to install the software with the configuration.
shell > ./tools/tpm install
During the startup and installation, tpm will notify you of any problems that need to be fixed before the service can be correctly installed and started. If the service starts correctly, you should see the configuration and current status of the service.
Initialize your PATH and environment.
shell> source /opt/continuent/share/env.sh
Do not include start-and-report if you are taking over for MySQL native replication. See Section 3.11.1, “Migrating from MySQL Native Replication 'In-Place'” for next steps after completing installation.
Follow the guidelines in Section 2.5, “Best Practices”.
Tungsten Cluster supports the creation of composite clusters. This includes multiple active/passive dataservices tied together. One of the dataservices is identified as the active one, containing the Primary node, and all other (passive) dataservices replicate from it.
Before continuing with deployment you will need the following:
The cluster name for each Active/Passive Cluster and a Composite cluster name to group them.
The list of datasources in each cluster. These are the servers which will be running MySQL.
The list of servers that will run the connector. Each connector will be associated with a preferred cluster but will have access to the Primary regardless of location.
The username and password of the MySQL replication user.
The username and password of the first application user. You may add more users after installation.
All servers must be prepared with the proper prerequisites. See Appendix B, Prerequisites for additional details.
Install the Tungsten Cluster package or download the Tungsten Cluster tarball, and unpack it:
shell> cd /opt/continuent/software
shell> tar zxf tungsten-clustering-7.1.4-10.tar.gz
Change to the Tungsten Cluster directory:
shell> cd tungsten-clustering-7.1.4-10
Run tpm to perform the installation. This method assumes you are using the Section 10.3, “tpm Staging Configuration” method:
shell> ./tools/tpm configure defaults \
    --reset \
    --user=tungsten \
    --install-directory=/opt/continuent \
    --profile-script=~/.bash_profile \
    --replication-user=tungsten \
    --replication-password=secret \
    --replication-port=13306 \
    --application-user=app_user \
    --application-password=secret \
    --application-port=3306 \
    --rest-api-admin-user=apiuser \
    --rest-api-admin-pass=secret
shell> ./tools/tpm configure alpha \
    --topology=clustered \
    --master=host1.alpha \
    --members=host1.alpha,host2.alpha,host3.alpha \
    --connectors=host1.alpha,host2.alpha,host3.alpha
shell> ./tools/tpm configure beta \
    --topology=clustered \
    --relay=host1.beta \
    --members=host1.beta,host2.beta,host3.beta \
    --connectors=host1.beta,host2.beta,host3.beta \
    --relay-source=alpha
shell> ./tools/tpm configure gamma \
    --composite-datasources=alpha,beta
shell> vi /etc/tungsten/tungsten.ini
[defaults]
user=tungsten
install-directory=/opt/continuent
profile-script=~/.bash_profile
replication-user=tungsten
replication-password=secret
replication-port=13306
application-user=app_user
application-password=secret
application-port=3306
rest-api-admin-user=apiuser
rest-api-admin-pass=secret
[alpha]
topology=clustered
master=host1.alpha
members=host1.alpha,host2.alpha,host3.alpha
connectors=host1.alpha,host2.alpha,host3.alpha
[beta]
topology=clustered
relay=host1.beta
members=host1.beta,host2.beta,host3.beta
connectors=host1.beta,host2.beta,host3.beta
relay-source=alpha
[gamma]
composite-datasources=alpha,beta
Configuration group defaults
The description of each of the options is shown below:
For staging configurations, deletes all pre-existing configuration information before applying the new configuration values.
System User
--install-directory=/opt/continuent
install-directory=/opt/continuent
Path to the directory where the active deployment will be installed. The configured directory will contain the software, THL and relay log information unless configured otherwise.
--profile-script=~/.bash_profile
profile-script=~/.bash_profile
Append commands to include env.sh in this profile script
For databases that require authentication, the username to use when connecting to the database using the corresponding connection method (native, JDBC, etc.).
The password to be used when connecting to the database using the corresponding --replication-user.
The network port used to connect to the database server. The default port used depends on the database being configured.
Database username for the connector
Database password for the connector
Port for the connector to listen on
Configuration group alpha
The description of each of the options is shown below:
Replication topology for the dataservice.
The hostname of the primary (extractor) within the current service.
--members=host1.alpha,host2.alpha,host3.alpha
members=host1.alpha,host2.alpha,host3.alpha
Hostnames for the dataservice members
--connectors=host1.alpha,host2.alpha,host3.alpha
connectors=host1.alpha,host2.alpha,host3.alpha
Hostnames for the dataservice connectors
Configuration group beta
The description of each of the options is shown below:
Replication topology for the dataservice.
The hostname of the primary (extractor) within the current service.
--members=host1.beta,host2.beta,host3.beta
members=host1.beta,host2.beta,host3.beta
Hostnames for the dataservice members
--connectors=host1.beta,host2.beta,host3.beta
connectors=host1.beta,host2.beta,host3.beta
Hostnames for the dataservice connectors
Dataservice name to use as a relay source
Configuration group gamma
The description of each of the options is shown below:
--composite-datasources=alpha,beta
composite-datasources=alpha,beta
Data services that should be added to this composite data service
If you plan to make full use of the REST API (which is enabled by default) you will need to also configure a username and password for API Access. This must be done by specifying the following options in your configuration:
rest-api-admin-user=tungsten rest-api-admin-pass=secret
Run tpm to install the software with the configuration.
shell > ./tools/tpm install
During the startup and installation, tpm will notify you of any problems that need to be fixed before the service can be correctly installed and started. If the service starts correctly, you should see the configuration and current status of the service.
Initialize your PATH
and environment.
shell > source /opt/continuent/share/env.sh
The Composite Active/Passive Cluster should be installed and ready to use.
Follow the guidelines in Section 2.5, “Best Practices”.
Adding an entirely new cluster provides a significant increase in availability and capacity. The new nodes that form the cluster will be fully aware of the original cluster(s) and will communicate with the existing managers and datasources within the cluster.
The following steps guide you through updating the configuration to include the new hosts and services you are adding.
On the new host(s), ensure the Appendix B, Prerequisites have been followed.
Let's assume that we have a composite cluster dataservice called global with two clusters, east and west, each with three nodes. In this worked example, we show how to add an additional cluster service called north with three new nodes.
Set the cluster to maintenance mode using cctrl:
shell>cctrl
[LOGICAL] / >use global
[LOGICAL] /global >set policy maintenance
Using the following as an example, update the configuration to include the new cluster and update the additional composite service block. If using an INI installation, copy the INI file to all the new nodes in the new cluster.
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging USER is `tpm query staging| cut -d: -f1 | cut -d@ -f1`
The staging USER is tungsten
shell> echo The staging HOST is `tpm query staging| cut -d: -f1 | cut -d@ -f2`
The staging HOST is db1
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> ssh {STAGING_USER}@{STAGING_HOST}
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm configure north \
    --connectors=db7,db8,db9 \
    --relay-source=east \
    --relay=db7 \
    --slaves=db8,db9 \
    --topology=clustered
shell> ./tools/tpm configure global \
    --composite-datasources=east,west,north
Run the tpm command to update the software with the Staging-based configuration:
shell> ./tools/tpm update --no-connectors --replace-release
For information about making updates when using a Staging-method deployment, please see Section 10.3.7, “Configuration Changes from a Staging Directory”.
shell> vi /etc/tungsten/tungsten.ini
[north]
...
connectors=db7,db8,db9
relay-source=east
relay=db7
slaves=db8,db9
topology=clustered
[global]
...
composite-datasources=east,west,north
Run the tpm command to update the software with the INI-based configuration:
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> cd {STAGING_DIRECTORY}
shell>./tools/tpm update --no-connectors --replace-release
For information about making updates when using an INI file, please see Section 10.4.4, “Configuration Changes with an INI file”.
Using the --no-connectors option updates the current deployment without restarting the existing connectors.
If installed via INI, on all nodes in the new cluster, download and unpack the software, and install:
shell> cd /opt/continuent/software
shell> tar zxvf tungsten-clustering-7.1.4-10.tar.gz
shell> cd /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> tools/tpm install
On every node in the original clusters, make sure all replicators are online:
shell> trepctl online; trepctl services
On all nodes in the new cluster, start the software:
shell> startall
The next steps will involve provisioning the new cluster nodes. An alternative approach to using this method would be to take a backup of a Replica from the existing cluster, and manually restoring it to ALL nodes in the new cluster PRIOR to issuing the install step above. If you take this approach then you can skip the next two re-provision steps.
Go to the relay (Primary) node of the new cluster (e.g. db7) and provision it from any Replica in the original cluster (e.g. db2):
shell> tungsten_provision_slave --source db2
Go to a Replica node of the new cluster (e.g. db8) and provision it from the relay node of the new cluster (e.g. db7):
shell> tungsten_provision_slave --source db7
Repeat the process for the remaining Replica nodes in the new cluster.
Set the composite cluster to automatic mode using cctrl:
shell>cctrl
[LOGICAL] / >use global
[LOGICAL] /global >set policy automatic
During a period when it is safe to restart the connectors:
shell> ./tools/tpm promote-connector
The procedures in this section are designed for the Multi-Site/Active-Active topology ONLY. Do NOT use these procedures for Composite Active/Active Clustering using v6 onwards.
For version 6.x onwards, Composite Active/Active Clustering, please refer to Section 3.4, “Deploying Composite Active/Active Clusters”
A Multi-Site/Active-Active topology provides all the benefits of a typical dataservice at a single location, but with the benefit of also replicating the information to another site. The underlying configuration within Tungsten Cluster uses the Tungsten Replicator which enables operation between the two sites.
The configuration is in two separate parts:
Tungsten Cluster dataservice that operates the main dataservice service within each site.
Tungsten Replicator dataservice that provides replication between the two sites; one to replicate from site1 to site2, and one for site2 to site1.
A sample display of how this operates is provided in Figure 3.3, “Topologies: Multi-Site/Active-Active Clusters”.
The service can be described as follows:
Tungsten Cluster Service: east
Replicates data between east1, east2 and east3 (not shown).
Tungsten Cluster Service: west
Replicates data between west1, west2 and west3 (not shown).
Tungsten Replicator Service: east
Defines the replication of data within east as a replicator service using Tungsten Replicator. This service reads from all the hosts within the Tungsten Cluster service east and writes to west1, west2, and west3. The service name is the same to ensure that we do not duplicate writes from the clustered service already running.
Data is read from the east Tungsten Cluster and replicated to the west Tungsten Cluster dataservice.
The configuration allows for changes in the Tungsten Cluster dataservice (such as a switch or failover) without upsetting the site-to-site replication.
Tungsten Replicator Service: west
Defines the replication of data within west as a replicator service using Tungsten Replicator. This service reads from all the hosts within the Tungsten Cluster service west and writes to east1, east2, and east3. The service name is the same to ensure that we do not duplicate writes from the clustered service already running.
Data is read from the west Tungsten Cluster and replicated to the east Tungsten Cluster dataservice.
The configuration allows for changes in the Tungsten Cluster dataservice (such as a switch or failover) without upsetting the site-to-site replication.
Tungsten Replicator Service: east_west
Replicates data from East to West, using Tungsten Replicator. This is a service alias that defines the reading from the dataservice (acting as an extractor) to other servers within the destination cluster.
Tungsten Replicator Service: west_east
Replicates data from West to East, using Tungsten Replicator. This is a service alias that defines the reading from the dataservice (acting as an extractor) to other servers within the destination cluster.
Requirements. Recommended releases for Multi-Site/Active-Active deployments are Tungsten Cluster 5.4.x and Tungsten Replicator 5.4.x; however, this topology can also be installed with the later v6+ releases.
Some considerations must be taken into account for any active/active scenario:
For tables that use auto-increment, collisions are possible if two hosts select the same auto-increment number. You can reduce the effects by configuring each MySQL host with different auto-increment settings, changing the offset and the increment values. For example, add the following lines to your my.cnf file:
auto-increment-offset = 1
auto-increment-increment = 4
In this way, the increments can be staggered on each machine and collisions are unlikely to occur.
Use row-based replication. Update your configuration file to explicitly use row-based replication by adding the following to your my.cnf file:
binlog-format = row
Beware of triggers. Triggers can cause problems during replication because if they are applied on the Replica as well as the Primary you can get data corruption and invalid data. Tungsten Cluster cannot prevent triggers from executing on a Replica, and in an active/active topology there is no sensible way to disable triggers. Instead, check at the trigger level whether you are executing on a Primary or Replica. For more information, see Section C.4.1, “Triggers”.
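One common approach (shown here as a sketch only and not as the method documented in Section C.4.1; it assumes the replicator applies changes as a MySQL user named tungsten, and the orders/orders_audit tables are hypothetical) is to have the trigger body check the connected user and do nothing when the change is being applied by the replicator:
-- Hypothetical audit trigger; skip the body when applied by the Tungsten user.
DELIMITER //
CREATE TRIGGER orders_audit_ai AFTER INSERT ON orders
FOR EACH ROW
BEGIN
  -- USER() returns the connected client account (user@host), so this is the
  -- application user on the Primary and the tungsten user on a Replica apply.
  IF USER() NOT LIKE 'tungsten@%' THEN
    INSERT INTO orders_audit (order_id, created_at) VALUES (NEW.id, NOW());
  END IF;
END//
DELIMITER ;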
The procedures in this section are designed for the Multi-Site/Active-Active topology ONLY. Do NOT use these procedures for Composite Active/Active Clustering using v6 onwards.
For version 6.x onwards, Composite Active/Active Clustering, please refer to Section 3.4, “Deploying Composite Active/Active Clusters”
Creating the configuration requires two distinct steps, the first to create the two Tungsten Cluster deployments, and a second that creates the Tungsten Replicator configurations on different network ports, and different install directories.
Install the Tungsten Cluster and Tungsten Replicator packages or download the tarballs, and unpack them:
shell> cd /opt/continuent/software
shell> tar zxf tungsten-clustering-7.1.4-10.tar.gz
shell> tar zxf tungsten-replicator-7.1.4-10.tar.gz
Change to the Tungsten Cluster directory:
shell> cd tungsten-clustering-7.1.4-10
Run tpm to configure the installation. This method assumes you are using the Section 10.3, “tpm Staging Configuration” method:
For an INI install, the INI file contains all the configuration for both the cluster deployment and the replicator deployment.
For a staging install, you first use the cluster configuration shown below and then configure the replicator as a separate process. These additional steps are outlined below.
shell> ./tools/tpm configure defaults \
    --reset \
    --user=tungsten \
    --install-directory=/opt/continuent \
    --replication-user=tungsten \
    --replication-password=secret \
    --replication-port=3306 \
    --profile-script=~/.bashrc \
    --application-user=app_user \
    --application-password=secret \
    --skip-validation-check=MySQLPermissionsCheck \
    --rest-api-admin-user=apiuser \
    --rest-api-admin-pass=secret
shell> ./tools/tpm configure east \
    --topology=clustered \
    --connectors=east1,east2,east3 \
    --master=east1 \
    --members=east1,east2,east3
shell> ./tools/tpm configure west \
    --topology=clustered \
    --connectors=west1,west2,west3 \
    --master=west1 \
    --members=west1,west2,west3
shell> vi /etc/tungsten/tungsten.ini
[defaults]
user=tungsten
install-directory=/opt/continuent
replication-user=tungsten
replication-password=secret
replication-port=3306
profile-script=~/.bashrc
application-user=app_user
application-password=secret
skip-validation-check=MySQLPermissionsCheck
rest-api-admin-user=apiuser
rest-api-admin-pass=secret
[defaults.replicator]
home-directory=/opt/replicator
rmi-port=10002
executable-prefix=mm
[east]
topology=clustered
connectors=east1,east2,east3
master=east1
members=east1,east2,east3
[west]
topology=clustered
connectors=west1,west2,west3
master=west1
members=west1,west2,west3
[east_west]
topology=cluster-slave
master-dataservice=east
slave-dataservice=west
thl-port=2113
[west_east]
topology=cluster-slave
master-dataservice=west
slave-dataservice=east
thl-port=2115
Configuration group defaults
The description of each of the options is shown below:
For staging configurations, deletes all pre-existing configuration information before applying the new configuration values.
System User
--install-directory=/opt/continuent
install-directory=/opt/continuent
Path to the directory where the active deployment will be installed. The configured directory will contain the software, THL and relay log information unless configured otherwise.
For databases that require authentication, the username to use when connecting to the database using the corresponding connection method (native, JDBC, etc.).
The password to be used when connecting to the database using the corresponding --replication-user.
The network port used to connect to the database server. The default port used depends on the database being configured.
Append commands to include env.sh in this profile script
Database username for the connector
Database password for the connector
--skip-validation-check=MySQLPermissionsCheck
skip-validation-check=MySQLPermissionsCheck
The --skip-validation-check option disables a given validation check. If any validation check fails, the installation, validation or configuration will automatically stop.
Using this option enables you to bypass the specified check, although skipping a check may lead to an invalid or non-working configuration.
You can identify a given check if an error or warning has been raised during configuration. For example, the default table type check:
... ERROR >> centos >> The datasource root@centos:3306 (WITH PASSWORD) » uses MyISAM as the default storage engine (MySQLDefaultTableTypeCheck) ...
The check in this case is MySQLDefaultTableTypeCheck, and could be ignored using --skip-validation-check=MySQLDefaultTableTypeCheck.
Setting both --skip-validation-check and --enable-validation-check is equivalent to explicitly disabling the specified check.
Configuration group defaults.replicator
The description of each of the options is shown below:
--home-directory=/opt/replicator
home-directory=/opt/replicator
Path to the directory where the active deployment will be installed. The configured directory will contain the software, THL and relay log information unless configured otherwise.
Replication RMI listen port
When enabled, the supplied prefix is added to each command alias that is generated for a given installation. This enables multiple installations to co-exist and be accessible through a unique alias. For example, if the executable prefix is configured as east, then an alias for the installation to trepctl will be created as east_trepctl.
Alias information for executable prefix data is stored within the $CONTINUENT_ROOT/share/aliases.sh file for each installation.
Configuration group east
The description of each of the options is shown below:
Replication topology for the dataservice.
--connectors=east1,east2,east3
Hostnames for the dataservice connectors
The hostname of the primary (extractor) within the current service.
Hostnames for the dataservice members
Configuration group west
The description of each of the options is shown below:
Replication topology for the dataservice.
--connectors=west1,west2,west3
Hostnames for the dataservice connectors
The hostname of the primary (extractor) within the current service.
Hostnames for the dataservice members
Configuration group east_west
The description of each of the options is shown below:
Replication topology for the dataservice.
Dataservice name to use as a relay source
Dataservice to use to determine the value of host configuration
Port to use for THL Operations
Configuration group west_east
The description of each of the options is shown below:
Replication topology for the dataservice.
Dataservice name to use as a relay source
Dataservice to use to determine the value of host configuration
Port to use for THL Operations
If you plan to make full use of the REST API (which is enabled by default) you will need to also configure a username and password for API Access. This must be done by specifying the following options in your configuration:
rest-api-admin-user=tungsten rest-api-admin-pass=secret
Run tpm to install the software with the configuration.
shell > ./tools/tpm install
During the startup and installation, tpm will notify you of any problems that need to be fixed before the service can be correctly installed and started. If the service starts correctly, you should see the configuration and current status of the service.
Change to the Tungsten Replicator directory:
shell> cd tungsten-replicator-7.1.4-10
Run tpm to configure the installation. This method assumes you are using the Section 10.3, “tpm Staging Configuration” method:
If you are running a staging install, first configure the replicator using the following example. If configuring using an INI file, skip straight to the install step below.
shell> ./tools/tpm configure defaults \
    --reset \
    --user=tungsten \
    --install-directory=/opt/replicator \
    --replication-user=tungsten \
    --replication-password=secret \
    --replication-port=3306 \
    --profile-script=~/.bashrc \
    --application-user=app_user \
    --application-password=secret \
    --skip-validation-check=MySQLPermissionsCheck \
    --rmi-port=10002 \
    --executable-prefix=mm \
    --thl-port=2113 \
    --rest-api-admin-user=apiuser \
    --rest-api-admin-pass=secret
shell> ./tools/tpm configure east \
    --topology=clustered \
    --connectors=east1,east2,east3 \
    --master=east1 \
    --members=east1,east2,east3
shell> ./tools/tpm configure west \
    --topology=clustered \
    --connectors=west1,west2,west3 \
    --master=west1 \
    --members=west1,west2,west3
shell> ./tools/tpm configure east_west \
    --topology=cluster-slave \
    --master-dataservice=east \
    --slave-dataservice=west \
    --thl-port=2113
shell> ./tools/tpm configure west_east \
    --topology=cluster-slave \
    --master-dataservice=west \
    --slave-dataservice=east \
    --thl-port=2115
Run tpm to install the software with the configuration.
shell > ./tools/tpm install
During the startup and installation, tpm will notify you of any problems that need to be fixed before the service can be correctly installed and started. If the service starts correctly, you should see the configuration and current status of the service.
Initialize your PATH and environment.
shell> source /opt/continuent/share/env.sh
shell> source /opt/replicator/share/env.sh
The Multi-Site/Active-Active clustering should be installed and ready to use.
The procedures in this section are designed for the Multi-Site/Active-Active topology ONLY. Do NOT use these procedures for Composite Active/Active Clustering using v6 onwards.
For version 6.x onwards, Composite Active/Active Clustering, please refer to Section 3.4, “Deploying Composite Active/Active Clusters”
In addition to this information, follow the guidelines in Section 2.5, “Best Practices”.
Running a Multi-Site/Active-Active service uses many different components to keep data updated on all servers. Monitoring the dataservice is divided into monitoring the two different clusters. Be mindful when using commands that you have the correct path. You should either use the full path to the command under /opt/continuent and /opt/replicator, or use the aliases created by setting the --executable-prefix=mm option. Calling trepctl would become mm_trepctl.
Configure your database servers with distinct auto_increment_increment and auto_increment_offset settings. Each location that may accept writes should have a unique offset value.
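For example (a sketch, assuming two writable sites and an increment chosen to leave room for growth), the my.cnf on each site could be staggered as follows:
# All writable hosts at site east
auto-increment-offset    = 1
auto-increment-increment = 4
# All writable hosts at site west
auto-increment-offset    = 2
auto-increment-increment = 4
With these settings, east generates the values 1, 5, 9, ... and west generates 2, 6, 10, ..., so the two sites can never allocate the same auto-increment value.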
Using cctrl gives you the dataservice status individually for the east and west dataservices. For example, the east dataservice is shown below:
Continuent Tungsten 7.1.4 build 10
east: session established
[LOGICAL] /east > ls
COORDINATOR[east1:AUTOMATIC:ONLINE]
ROUTERS:
+----------------------------------------------------------------------------+
|connector@east1[17951](ONLINE, created=0, active=0) |
|connector@east2[17939](ONLINE, created=0, active=0) |
|connector@east3[17961](ONLINE, created=0, active=0) |
+----------------------------------------------------------------------------+
DATASOURCES:
+----------------------------------------------------------------------------+
|east1(master:ONLINE, progress=29, THL latency=0.739) |
|STATUS [OK] [2013/11/25 11:24:35 AM GMT] |
+----------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=master, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+----------------------------------------------------------------------------+
+----------------------------------------------------------------------------+
|east2(slave:ONLINE, progress=29, latency=0.721) |
|STATUS [OK] [2013/11/25 11:24:39 AM GMT] |
+----------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=slave, master=east1, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+----------------------------------------------------------------------------+
+----------------------------------------------------------------------------+
|east3(slave:ONLINE, progress=29, latency=1.143) |
|STATUS [OK] [2013/11/25 11:24:38 AM GMT] |
+----------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=slave, master=east1, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+----------------------------------------------------------------------------+
When checking the current status, it is important to compare the sequence numbers from each service correctly. There are four services to monitor: the Tungsten Cluster service east, and a Tungsten Replicator service east that reads data from the west Tungsten Cluster service, plus a corresponding west Tungsten Cluster service and west Tungsten Replicator service.
When data is inserted on the Primary within the east Tungsten Cluster, use cctrl to determine the cluster status. Sequence numbers within the Tungsten Cluster east should match, and latency between hosts in the Tungsten Cluster service is relative to each other.
When data is inserted on east, the sequence number of the east Tungsten Cluster service and east Tungsten Replicator service (on west{1,2,3}) should be compared.
When data is inserted on the Primary within the west Tungsten Cluster, use cctrl to determine the cluster status. Sequence numbers within the Tungsten Cluster west should match, and latency between hosts in the Tungsten Cluster service is relative to each other.
When data is inserted on west, the sequence number of the west Tungsten Cluster service and west Tungsten Replicator service (on east{1,2,3}) should be compared.
Operation                  | Tungsten Cluster Service Seqno    | Tungsten Replicator Service Seqno
                           | east            | west            | east            | west
Insert/update data on east | Seqno Increment |                 | Seqno Increment |
Insert/update data on west |                 | Seqno Increment |                 | Seqno Increment
Within each cluster, cctrl can be used to monitor the current status. For more information on checking the status and controlling operations, see Section 6.3, “Checking Dataservice Status”.
For convenience, the shell PATH can be updated with the tools and configuration. With two separate services, both environments must be updated. To update the shell with the Tungsten Cluster service and tools:
shell> source /opt/continuent/share/env.sh
To update the shell with the Tungsten Replicator service and tools:
shell> source /opt/replicator/share/env.sh
To monitor all services and the current status, you can also use the multi_trepctl command (part of the Tungsten Replicator installation). This generates a unified status report for all the hosts and services configured:
shell> multi_trepctl --by-service
| host | servicename | role | state | appliedlastseqno | appliedlatency |
| east1 | east | master | ONLINE | 53 | 120.161 |
| east3 | east | master | ONLINE | 44 | 0.697 |
| east2 | east | slave | ONLINE | 53 | 119.961 |
| west1 | east | slave | ONLINE | 53 | 119.834 |
| west2 | east | slave | ONLINE | 53 | 181.128 |
| west3 | east | slave | ONLINE | 53 | 204.790 |
| west1 | west | master | ONLINE | 294327 | 0.285 |
| west2 | west | master | ONLINE | 231595 | 0.316 |
| east1 | west | slave | ONLINE | 294327 | 0.879 |
| east2 | west | slave | ONLINE | 294327 | 0.567 |
| east3 | west | slave | ONLINE | 294327 | 1.046 |
| west3 | west | slave | ONLINE | 231595 | 22.895 |
In the above example, it can be seen that the west services have a much higher applied last sequence number than the east services; this is because all the writes have been applied within the west cluster.
To monitor individual servers and/or services, use trepctl, using the correct port number and servicename. For example, on east1, to check the status of the replicator within the Tungsten Cluster service:
shell> trepctl status
To check the Tungsten Replicator service, explicitly specify the port and service:
shell> mm_trepctl -service west status
The procedures in this section are designed for the Multi-Site/Active-Active topology ONLY. Do NOT use these procedures for Composite Active/Active Clustering using v6 onwards.
For version 6.x onwards, Composite Active/Active Clustering, please refer to Section 3.4, “Deploying Composite Active/Active Clusters”
Because there are two different Continuent services running, each must be individually configured to startup on boot:
For the Tungsten Cluster service, use Section 4.4, “Configuring Startup on Boot”.
For the Tungsten Replicator service, a custom startup script must be created, otherwise the replicator will be unable to start as it has been configured in a different directory.
Create a link from the Tungsten Replicator service startup script in the operating system startup directory (/etc/init.d):
shell> sudo ln -s /opt/replicator/tungsten/tungsten-replicator/bin/replicator /etc/init.d/mmreplicator
Modify the APP_NAME variable within the startup script (/etc/init.d/mmreplicator) to mmreplicator:
APP_NAME="mmreplicator"
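For example (a sketch only; verify the result before relying on the script), the variable can be changed in place with sed:
shell> sudo sed -i 's/^APP_NAME=.*/APP_NAME="mmreplicator"/' /etc/init.d/mmreplicator
shell> grep ^APP_NAME /etc/init.d/mmreplicator
APP_NAME="mmreplicator"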
Update the operating system startup configuration to use the updated script.
On Debian/Ubuntu:
shell> sudo update-rc.d mmreplicator defaults
On RedHat/CentOS:
shell> sudo chkconfig --add mmreplicator
The procedures in this section are designed for the Multi-Site/Active-Active topology ONLY. Do NOT use these procedures for Composite Active/Active Clustering using v6 onwards.
For version 6.x onwards, Composite Active/Active Clustering, please refer to Section 3.4, “Deploying Composite Active/Active Clusters”
Under certain conditions, dataservices in an active/active configuration may drift and/or become inconsistent with the data in another dataservice. If this occurs, you may need to re-provision the data on one or more of the dataservices after first determining the definitive source of the information.
In the following example the west service has been determined to be the definitive copy of the data. To fix the issue, all the datasources in the east service will be reprovisioned from one of the datasources in the west service.
The following is a guide to the steps that should be followed. In the example procedure it is the east service that has failed:
Put the dataservice into MAINTENANCE mode. This ensures that Tungsten Cluster will not attempt to automatically recover the service.
cctrl [east]> set policy maintenance
On the failed east Tungsten Cluster service, put each Tungsten Connector offline:
cctrl [east]> router * offline
Reset the failed Tungsten Replicator service on all servers connected to the failed Tungsten Cluster service. For example, on west{1,2,3}, reset the east Tungsten Replicator service:
shell west>/opt/replicator/tungsten/tungsten-replicator/bin/trepctl -service east offline
shell west>/opt/replicator/tungsten/tungsten-replicator/bin/trepctl -service east reset -all -y
Reset the Tungsten Cluster service on each server within the failed region (east{1,2,3}):
shell east>/opt/continuent/tungsten/tungsten-replicator/bin/replicator stop
shell east>/opt/continuent/tungsten/tools/tpm reset east
shell east>/opt/continuent/tungsten/tungsten-replicator/bin/replicator start
Restore a backup on each host (east{1,2,3}) in the failed east service from a host in the west service:
shell east> /opt/continuent/tungsten/tungsten-replicator/scripts/tungsten_provision_slave \
--direct --source=west1
Place all the Tungsten Replicator services on west{1,2,3} back online:
shell west> /opt/replicator/tungsten/tungsten-replicator/bin/trepctl -service east online
On the failed east Tungsten Cluster service, put each Tungsten Connector online:
cctrl [east]> router * online
Set the policy back to automatic:
cctrl> set policy automatic
The procedures in this section are designed for the Multi-Site/Active-Active topology ONLY. Do NOT use these procedures for Composite Active/Active Clustering using v6 onwards.
For version 6.x onwards, Composite Active/Active Clustering, please refer to Section 3.4, “Deploying Composite Active/Active Clusters”
To reset all of the dataservices and restart the Tungsten Cluster and Tungsten Replicator services:
On all hosts (e.g. east{1,2,3} and west{1,2,3}):
shell>/opt/replicator/tungsten/tungsten-replicator/bin/replicator stop
shell>/opt/replicator/tungsten/tools/tpm reset
shell>/opt/continuent/tungsten/tools/tpm reset
shell>/opt/replicator/tungsten/tungsten-replicator/bin/replicator start
The procedures in this section are designed for the Multi-Site/Active-Active topology ONLY. Do NOT use these procedures for Composite Active/Active Clustering using v6 onwards.
For version 6.x onwards, Composite Active/Active Clustering, please refer to Section 3.4, “Deploying Composite Active/Active Clusters”
In the event of a failure within one host in the service where you need to reprovision the host from another running Replica:
Identify the servers that have failed. All servers that are not the Primary for their region can be re-provisioned using a backup/restore of the Primary (see Section 6.10, “Creating a Backup”) or by using the tungsten_provision_slave script.
To re-provision an entire region, follow the steps below. The east region is used in the example statements below:
To prevent application servers from reading and writing to the failed service, place the Tungsten Connector offline within the failed region:
cctrl [east]> router * offline
On all servers in other regions (west{1,2,3}):
shell>/opt/replicator/tungsten/tungsten-replicator/bin/trepctl -service east offline
shell>/opt/replicator/tungsten/tungsten-replicator/bin/trepctl -service east reset -all -y
On all servers in the failed region (east{1,2,3}):
shell>/opt/replicator/tungsten/tungsten-replicator/bin/replicator stop
shell>/opt/replicator/tungsten/tools/tpm reset
shell>/opt/continuent/tungsten/tungsten-replicator/scripts/tungsten_provision_slave \ --direct --source=west1
Check that Tungsten Cluster is working correctly and all hosts are up to date:
cctrl [east]> ls
Restart the Tungsten Replicator service:
shell> /opt/replicator/tungsten/tungsten-replicator/bin/replicator start
On all servers in other regions (west{1,2,3}):
shell> /opt/replicator/tungsten/tungsten-replicator/bin/trepctl -service east online
The procedures in this section are designed for the Multi-Site/Active-Active topology ONLY. Do NOT use these procedures for Composite Active/Active Clustering using v6 onwards.
For version 6.x onwards, Composite Active/Active Clustering, please refer to Section 3.4, “Deploying Composite Active/Active Clusters”
To add an entirely new cluster (dataservice) to the mesh, follow the simple procedure below.
There is no need to set the Replicator starting points, and no downtime/maintenance window is required!
Choose a cluster to take a node backup from:
Choose a cluster and Replica node to take a backup from.
Enable maintenance mode for the cluster:
shell>cctrl
cctrl>set policy maintenance
Shun the selected Replica node and stop both local and cross-site replicator services:
shell>cctrl
cctrl>datasource {replica_hostname_here} shun
replica shell>trepctl offline
replica shell>replicator stop
replica shell>mm_trepctl offline
replica shell>mm_replicator stop
Take a backup of the shunned node, then copy to/restore on all nodes in the new cluster.
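As one possible approach (a sketch only; any method that produces a consistent snapshot is acceptable, and the hostname db7 and file path are assumptions), a logical dump can be taken from the shunned Replica and loaded on each node of the new cluster:
replica shell> mysqldump --all-databases --single-transaction --triggers --routines > /tmp/seed.sql
replica shell> scp /tmp/seed.sql tungsten@db7:/tmp/
new-node shell> mysql < /tmp/seed.sql
Repeat the copy and restore for every node in the new cluster before continuing.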
Recover the Replica node and put cluster back into automatic mode:
replica shell>replicator start
replica shell>trepctl online
replica shell>mm_replicator start
replica shell>mm_trepctl online
shell>cctrl
cctrl>datasource {replica_hostname_here} online
cctrl>set policy automatic
On ALL nodes in all three (3) clusters, ensure the /etc/tungsten/tungsten.ini has all three clusters defined and all the correct cross-site combinations.
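As an illustrative sketch only (the service names east, west and north, the hostnames, and the THL ports shown are assumptions carried over from the earlier examples), the INI contains one cluster section per service plus one cross-site [source_target] section for each pair:
[east]
topology=clustered
master=east1
members=east1,east2,east3
[west]
topology=clustered
master=west1
members=west1,west2,west3
[north]
topology=clustered
master=north1
members=north1,north2,north3
[east_west]
topology=cluster-slave
master-dataservice=east
slave-dataservice=west
thl-port=2113
[west_east]
topology=cluster-slave
master-dataservice=west
slave-dataservice=east
thl-port=2115
The remaining pairs ([east_north], [north_east], [west_north], [north_west]) follow the same pattern, each with its own thl-port.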
Install the Tungsten Clustering software on new cluster nodes to create a single standalone cluster and check the cctrl command to be sure the new cluster is fully online.
Install the Tungsten Replicator software on all new cluster nodes and start it.
Replication will now be flowing INTO the new cluster from the original two.
On the original two clusters, run tools/tpm update from the cross-site replicator staging software path:
shell>mm_tpm query staging
shell>cd {replicator_staging_directory}
shell>tools/tpm update --replace-release
shell>mm_trepctl online
shell>mm_trepctl services
Check the output from the mm_trepctl services command output above to confirm the new service appears and is online.
There is no need to set the cross-site replicators at a starting position because:
Replicator feeds from the new cluster to the old clusters start at seqno 0.
The tungsten_olda and tungsten_oldb database schemas will contain the correct starting points for the INBOUND feed into the new cluster, so when the cross-site replicators are started and brought online they will read from the tracking table and carry on correctly from the stored position.
The procedures in this section are designed for the Multi-Site/Active-Active topology ONLY. Do NOT use these procedures for Composite Active/Active Clustering using v6 onwards.
For version 6.x onwards, Composite Active/Active Clustering, please refer to Section 3.4, “Deploying Composite Active/Active Clusters”
It is possible to enable secure communications for just the Replicator layer in a Multi-Site/Active-Active topology. This would include both the Cluster Replicators and the Cross-Site Replicators because they cannot be SSL-enabled independently.
Create a certificate and load it into a java keystore, and then load it into a truststore and place all files into the /etc/tungsten/ directory. For detailed instructions, see Chapter 5, Deployment: Security.
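A minimal sketch of the certificate and store creation using the Java keytool (the alias, distinguished name, passwords and validity below are assumptions; Chapter 5 covers the full procedure):
shell> keytool -genkeypair -alias replserver -keyalg RSA -keysize 2048 -validity 365 \
    -dname "CN=tungsten" -keystore /etc/tungsten/keystore.jks -storepass secret -keypass secret
shell> keytool -exportcert -alias replserver -keystore /etc/tungsten/keystore.jks \
    -storepass secret -file /etc/tungsten/replserver.cer
shell> keytool -importcert -alias replserver -trustcacerts -noprompt \
    -keystore /etc/tungsten/truststore.ts -storepass secret -file /etc/tungsten/replserver.cer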
Update /etc/tungsten/tungsten.ini to include these additional lines in both the defaults section and the defaults.replicator section:
[defaults]
...
java-keystore-path=/etc/tungsten/keystore.jks
java-keystore-password=secret
java-truststore-path=/etc/tungsten/truststore.ts
java-truststore-password=secret
thl-ssl=true
[defaults.replicator]
...
java-keystore-path=/etc/tungsten/keystore.jks
java-keystore-password=secret
java-truststore-path=/etc/tungsten/truststore.ts
java-truststore-password=secret
thl-ssl=true
Put all clusters into maintenance mode.
shell>cctrl
cctrl>set policy maintenance
On all hosts, update the cluster configuration:
shell>tpm query staging
shell>cd {cluster_staging_directory}
shell>tools/tpm update
shell>trepctl online
shell>trepctl status | grep thl
On all hosts, update the cross-site replicator configuration:
shell>mm_tpm query staging
shell>cd {replicator_staging_directory}
shell>tools/tpm update
shell>mm_trepctl online
shell>mm_trepctl status | grep thl
Please note that all replication will effectively be down until all nodes/services are SSL-enabled and online.
Once all the updates are done and the Replicators are back up and running, use the various commands to check that secure communications have been enabled.
Each datasource will show [SSL] when enabled:
shell>cctrl
cctrl>ls
DATASOURCES:
+----------------------------------------------------------------------------+
|db1(master:ONLINE, progress=208950063, THL latency=0.895)                    |
|STATUS [OK] [2018/04/10 11:47:57 AM UTC][SSL]                                |
+----------------------------------------------------------------------------+
|  MANAGER(state=ONLINE)                                                      |
|  REPLICATOR(role=master, state=ONLINE)                                      |
|  DATASERVER(state=ONLINE)                                                   |
|  CONNECTIONS(created=15307, active=2)                                       |
+----------------------------------------------------------------------------+
+----------------------------------------------------------------------------+
|db2(slave:ONLINE, progress=208950061, latency=0.920)                         |
|STATUS [OK] [2018/04/19 11:18:21 PM UTC][SSL]                                |
+----------------------------------------------------------------------------+
|  MANAGER(state=ONLINE)                                                      |
|  REPLICATOR(role=slave, master=db1, state=ONLINE)                           |
|  DATASERVER(state=ONLINE)                                                   |
|  CONNECTIONS(created=0, active=0)                                           |
+----------------------------------------------------------------------------+
+----------------------------------------------------------------------------+
|db3(slave:ONLINE, progress=208950063, latency=0.939)                         |
|STATUS [OK] [2018/04/25 12:17:20 PM UTC][SSL]                                |
+----------------------------------------------------------------------------+
|  MANAGER(state=ONLINE)                                                      |
|  REPLICATOR(role=slave, master=db1, state=ONLINE)                           |
|  DATASERVER(state=ONLINE)                                                   |
|  CONNECTIONS(created=0, active=0)                                           |
+----------------------------------------------------------------------------+
Both the local cluster replicator status command trepctl status and the cross-site replicator status command mm_trepctl status will show thls instead of thl in the values for masterConnectUri, masterListenUri and pipelineSource.
shell> trepctl status | grep thl
masterConnectUri : thls://db1:2112/
masterListenUri : thls://db5:2112/
pipelineSource : thls://db1:2112/
The procedures in this section are designed for the Multi-Site/Active-Active topology ONLY. Do NOT use these procedures for Composite Active/Active Clustering using v6 onwards.
For version 6.x onwards, Composite Active/Active Clustering, please refer to Section 3.4, “Deploying Composite Active/Active Clusters”
Performing maintenance on the dataservice, for example updating the MySQL configuration file, can be achieved in a similar sequence to that shown in Section 6.15, “Performing Database or OS Maintenance”, except that you must also restart the corresponding Tungsten Replicator service after the main Tungsten Cluster service has been placed back online.
For example, to perform maintenance on the east service:
Put the dataservice into MAINTENANCE mode. This ensures that Tungsten Cluster will not attempt to automatically recover the service.
cctrl [east]> set policy maintenance
Shun the first Replica datasource so that maintenance can be performed on the host.
cctrl [east]> datasource east1 shun
Perform the updates, such as updating my.cnf, changing schemas, or performing other maintenance.
If MySQL configuration has been modified, restart the MySQL service:
cctrl [east]> service host/mysql restart
Bring the host back into the dataservice:
cctrl [east]> datasource host recover
Perform a switch so that the Primary becomes a Replica and can then be shunned and have the necessary maintenance performed:
cctrl [east]> switch
Repeat the previous steps to shun the host, perform maintenance, and then switch again until all the hosts have been updated.
Set the policy back to automatic:
cctrl> set policy automatic
On each host in the other region, manually restart the Tungsten Replicator service, which will have gone offline when MySQL was restarted:
shell> /opt/replicator/tungsten/tungsten-replicator/bin/trepctl -host host -service east online
In the event of a replication fault, the standard cctrl, trepctl and other utility commands in Chapter 9, Command-line Tools can be used to bring the dataservice back into operation. All the tools are safe to use.
If you have to perform any updates or modifications to the stored MySQL data, first ensure binary logging has been disabled for the session:
mysql> SET SESSION SQL_LOG_BIN=0;
This prevents the statements and operations from reaching the binary log, so that they will not be replicated to other hosts.
A Composite Active/Active (CAA) Cluster topology provides all the benefits of a typical dataservice at a single location, but with the benefit of also replicating the information to another site. The underlying configuration within Tungsten Cluster uses two services within each node; one provides the replication within the cluster, and the second provides replication from the remote cluster. Both are managed by the Tungsten Manager.
Composite Active/Active Clusters were previously referred to as Multi-Site/Active-Active (MSAA) clusters. The name has been updated to reflect the nature of these clusters as part of an overall active/active deployment using clusters, where the individual clusters could be in the same or different locations.
Whilst the older Multi-Site/Active-Active topology is still valid and supported, it is recommended that this newer Composite Active/Active topology is adopted from version 6 of Tungsten Cluster onwards. For details on the older topology, see Section 3.3, “Deploying Multi-Site/Active-Active Clustering”
The configuration is handled with a single configuration and deployment that configures the core cluster services and additional cross-cluster services.
A sample display of how this operates is provided in Figure 3.4, “Topologies: Composite Active/Active Clusters”.
The service can be described as follows:
Tungsten Cluster Service: east
Replicates data between east1, east2 and east3.
Tungsten Cluster Service: west
Replicates data between west1, west2 and west3.
Tungsten Cluster Service: west_from_east
Defines the replication service using a secondary sub-service within the cluster. This service reads THL FROM east and writes to the relay node in west; subsequently, the replica nodes within west are then replicated to from there.
Tungsten Replicator Service: east_from_west
Defines the replication service using a secondary sub-service within the cluster. This service reads THL FROM west and writes to the relay node in east; subsequently, the replica nodes within east are then replicated to from there.
A new Composite Dynamic Active/Active topology was introduced from version 7.0.0 of Tungsten Cluster.
Composite Dynamic Active/Active builds on the foundation of the Composite Active/Active topology and the cluster continues to operate and be configured in the same way.
The difference is, with Composite Dynamic Active/Active, the cluster instructs the Proxy layer to behave like a Composite Active/Passive cluster.
For more information on this topology and how to enable it, see Section 3.5, “Deploying Composite Dynamic Active/Active”.
Some considerations must be taken into account for any active/active scenarios:
For tables that use auto-increment, collisions are possible if two hosts select the same auto-increment number. You can reduce the effects by configuring each MySQL host with different auto-increment settings, changing the offset and the increment values. For example, add the following lines to your my.cnf file:
auto-increment-offset = 1
auto-increment-increment = 4
In this way, the increments can be staggered on each machine and collisions are unlikely to occur (see the illustration after this list of considerations).
Use row-based replication. Update your configuration file to explicitly use row-based replication by adding the following to your my.cnf file:
binlog-format = row
Beware of triggers. Triggers can cause problems during replication because if they are applied on the Replica as well as the Primary you can get data corruption and invalid data. Tungsten Cluster cannot prevent triggers from executing on a Replica, and in an active/active topology there is no sensible way to disable triggers. Instead, check at the trigger level whether you are executing on a Primary or Replica. For more information, see Section C.4.1, “Triggers”.
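As an illustration of the staggered auto-increment settings described above, the following hypothetical assignment gives each writable host its own, non-overlapping sequence of generated values (the host names and the increment of 4 are examples only):
# east1: auto-increment-offset=1, auto-increment-increment=4  ->  1, 5, 9, 13, ...
# east2: auto-increment-offset=2, auto-increment-increment=4  ->  2, 6, 10, 14, ...
# west1: auto-increment-offset=3, auto-increment-increment=4  ->  3, 7, 11, 15, ...
# west2: auto-increment-offset=4, auto-increment-increment=4  ->  4, 8, 12, 16, ...
The increment should be at least as large as the number of hosts that may accept writes; otherwise the generated sequences can still overlap.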
Deployment of Composite Active/Active clusters is only supported using the INI method of deployment.
Configuration and deployment of the cluster works as follows:
Creates two basic Primary/Replica clusters.
Creates a composite service that includes the Primary/Replica clusters within the definition.
The resulting configuration within the example builds the following deployment:
One cluster, east
, with three
hosts.
One cluster, west
, with three
hosts.
All six hosts in the two clusters will have a manager, replicator and connector installed.
Each replicator has two replication services, one service that replicates the data within the cluster. The second service, replicates data from the other cluster to this host.
Creating the full topology requires a single install step. This creates the Tungsten Cluster dataservices, and creates the Composite dataservices on different network ports to allow the cross-cluster replication to operate.
Create the combined configuration file /etc/tungsten/tungsten.ini on all cluster hosts:
shell> vi /etc/tungsten/tungsten.ini
[defaults]
user=tungsten
install-directory=/opt/continuent
profile-script=~/.bash_profile
replication-user=tungsten
replication-password=secret
replication-port=13306
application-user=app_user
application-password=secret
application-port=3306
rest-api-admin-user=apiuser
rest-api-admin-pass=secret

[east]
topology=clustered
master=east1
members=east1,east2,east3
connectors=east1,east2,east3

[west]
topology=clustered
master=west1
members=west1,west2,west3
connectors=west1,west2,west3

[usa]
topology=composite-multi-master
composite-datasources=east,west
Configuration group defaults
The description of each of the options is shown below:
user=tungsten
System User
install-directory=/opt/continuent
Path to the directory where the active deployment will be installed. The configured directory will contain the software, THL and relay log information unless configured otherwise.
profile-script=~/.bash_profile
Append commands to include env.sh in this profile script
replication-user=tungsten
For databases that require authentication, the username to use when connecting to the database using the corresponding connection method (native, JDBC, etc.).
replication-password=secret
The password to be used when connecting to the database using the corresponding --replication-user.
replication-port=13306
The network port used to connect to the database server. The default port used depends on the database being configured.
application-user=app_user
Database username for the connector
application-password=secret
Database password for the connector
application-port=3306
Port for the connector to listen on
Configuration group east
The description of each of the options is shown below:
topology=clustered
Replication topology for the dataservice.
master=east1
The hostname of the primary (extractor) within the current service.
members=east1,east2,east3
Hostnames for the dataservice members
connectors=east1,east2,east3
Hostnames for the dataservice connectors
Configuration group west
The description of each of the options is shown below:
topology=clustered
Replication topology for the dataservice.
master=west1
The hostname of the primary (extractor) within the current service.
members=west1,west2,west3
Hostnames for the dataservice members
connectors=west1,west2,west3
Hostnames for the dataservice connectors
Configuration group usa
The description of each of the options is shown below:
topology=composite-multi-master
Replication topology for the dataservice.
composite-datasources=east,west
Data services that should be added to this composite data service
The configuration above defines two clusters, east and west, which are both part of a composite cluster service, usa. The configuration is divided up into the four sections shown: defaults, east, west and usa.
If you plan to make full use of the REST API (which is enabled by default) you will need to also configure a username and password for API Access. This must be done by specifying the following options in your configuration:
rest-api-admin-user=tungsten
rest-api-admin-pass=secret
Service names should not contain the keyword from within a Composite Active/Active deployment. This keyword is used (with the underscore separator, for example, east_from_west) to denote cross-site replicators within the cluster. To avoid confusion, avoid using from in service names so that it is easy to distinguish between replication pipelines.
When configuring this service, tpm will automatically imply the following into the configuration:
A parent composite service, usa in this example, with child services as listed, east and west.
Replication services between each child service, using the service name a_from_b, for example, east_from_west and west_from_east.
More child services will create more automatic replication services. For example, with three clusters, alpha, beta, and gamma, tpm would configure alpha_from_beta and alpha_from_gamma on the alpha cluster, beta_from_alpha and beta_from_gamma on the beta cluster, and so on.
For each additional service, the port number is automatically configured from the base port number for the first service. For example, using the default port 2112, the east_from_west service would have THL port 2113.
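As a rough illustration only (the exact assignment should be confirmed with trepctl status on your own installation), the three-cluster example above might end up with port assignments along these lines, assuming the default base THL port of 2112 and sequential allocation:
# alpha            -> THL port 2112 (base port)
# alpha_from_beta  -> THL port 2113 (as described above)
# alpha_from_gamma -> THL port 2114 (assumption: the next port in sequence)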
Execute the installation on each host within the entire composite cluster. For example, on all six hosts provided in the sample configuration above.
Install the Tungsten Cluster package (.rpm), or download the compressed tarball and unpack it:
shell> cd /opt/continuent/software
shell> tar zxf tungsten-clustering-7.1.4-10.tar.gz
Change to the Tungsten Cluster staging directory:
shell> cd tungsten-clustering-7.1.4-10
Run tpm to install the Clustering software:
shell > ./tools/tpm install
During the installation and startup, tpm will notify you of any problems that need to be fixed before the service can be correctly installed and started. If the service starts correctly, you should see the configuration and current status of the service.
Initialize your PATH and environment:
shell> source /opt/continuent/share/env.sh
The Composite Active/Active clustering should be installed and ready to use.
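As a quick post-installation check, the same tools used later in this chapter can be used to confirm that both the local and the cross-cluster services are running; a sketch only (the output will vary by deployment):
shell> source /opt/continuent/share/env.sh
shell> trepctl services
shell> cctrl -multi
Within cctrl, ls at the top level should list the composite service and its member clusters, as shown in the monitoring examples that follow.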
In addition to this information, follow the guidelines in Section 2.5, “Best Practices”.
Running a Composite Active/Active service uses many different components to keep data updated on all servers. Monitoring the dataservice is divided into monitoring the two individual clusters and the cluster sub-services responsible for replication to/from the remote clusters.
Configure your database servers with distinct auto_increment_increment and auto_increment_offset settings. Each location that may accept writes should have a unique offset value.
Using cctrl gives you the dataservice status. By default, cctrl will connect you to the cluster associated with the node that you issue the command from. To start at the top level, issue cctrl -multi instead.
At the top level, the composite cluster output shows the composite service, composite cluster members and replication services:
Tungsten Clustering 7.1.4 build 10
east: session established, encryption=false, authentication=false
[LOGICAL] / > ls
usa
east
east_from_west
west
To examine the overall composite cluster status, change to the composite cluster and use ls:
[LOGICAL] / > use usa
[LOGICAL] /usa > ls
COORDINATOR[west3:AUTOMATIC:ONLINE]
east:COORDINATOR[east3:AUTOMATIC:ONLINE]
west:COORDINATOR[west3:AUTOMATIC:ONLINE]
ROUTERS:
+---------------------------------------------------------------------------------+
|connector@east1[10583](ONLINE, created=0, active=0) |
|connector@east2[10548](ONLINE, created=0, active=0) |
|connector@east3[10540](ONLINE, created=0, active=0) |
|connector@west1[10589](ONLINE, created=0, active=0) |
|connector@west2[10541](ONLINE, created=0, active=0) |
|connector@west3[10547](ONLINE, created=0, active=0) |
+---------------------------------------------------------------------------------+
DATASOURCES:
+---------------------------------------------------------------------------------+
|east(composite master:ONLINE, global progress=1, max latency=3.489) |
|STATUS [OK] [2019/12/24 10:21:08 AM UTC] |
+---------------------------------------------------------------------------------+
| east(master:ONLINE, progress=1, max latency=1.483) |
| east_from_west(relay:ONLINE, progress=1, max latency=3.489) |
+---------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------+
|west(composite master:ONLINE, global progress=1, max latency=0.909) |
|STATUS [OK] [2019/12/24 10:21:08 AM UTC] |
+---------------------------------------------------------------------------------+
| west(master:ONLINE, progress=1, max latency=0.909) |
| west_from_east(relay:ONLINE, progress=1, max latency=0.903) |
+---------------------------------------------------------------------------------+
For each cluster within the composite cluster, four lines of information are provided:
|east(composite master:ONLINE, global progress=1, max latency=3.489) |
This line indicates:
The name and type of the composite cluster, and whether the Primary in the cluster is online.
The global progress. This is a counter that combines the local progress of the cluster with the replication of data between this cluster and the remote clusters in the composite; the breakdown below shows how inserts on each cluster affect these counters.
The maximum latency within the cluster.
|STATUS [OK] [2019/12/24 10:21:08 AM UTC] |
The status and date within the Primary of the cluster.
| east(master:ONLINE, progress=1, max latency=1.483) |
The status and progress of the cluster.
| east_from_west(relay:ONLINE, progress=1, max latency=3.489) |
The status and progress of remote replication from the cluster.
The global progress and the progress work together to provide an indication of the overall replication status within the composite cluster:
Inserting data into the Primary on east will:
Increment the progress within the east cluster.
Increment the global progress within the east cluster.
Inserting data into the Primary on west will:
Increment the progress within the west cluster.
Increment the global progress within the west cluster.
Looking at the individual cluster shows only the cluster status, not the cross-cluster status:
[LOGICAL] /east > ls
COORDINATOR[east3:AUTOMATIC:ONLINE]
ROUTERS:
+---------------------------------------------------------------------------------+
|connector@east1[10583](ONLINE, created=0, active=0) |
|connector@east2[10548](ONLINE, created=0, active=0) |
|connector@east3[10540](ONLINE, created=0, active=0) |
|connector@west1[10589](ONLINE, created=0, active=0) |
|connector@west2[10541](ONLINE, created=0, active=0) |
|connector@west3[10547](ONLINE, created=0, active=0) |
+---------------------------------------------------------------------------------+
DATASOURCES:
+---------------------------------------------------------------------------------+
|east1(master:ONLINE, progress=1, THL latency=0.765) |
|STATUS [OK] [2019/12/24 10:21:12 AM UTC] |
+---------------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=master, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+---------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------+
|east2(slave:ONLINE, progress=1, latency=0.826) |
|STATUS [OK] [2019/12/24 10:21:13 AM UTC] |
+---------------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=slave, master=east1, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+---------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------+
|east3(slave:ONLINE, progress=1, latency=0.842) |
|STATUS [OK] [2019/12/24 10:21:12 AM UTC] |
+---------------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=slave, master=east1, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+---------------------------------------------------------------------------------+
Within each cluster, cctrl can be used to monitor the current status. For more information on checking the status and controlling operations, see Section 6.3, “Checking Dataservice Status”.
To monitor all services and the current status, you can also use the multi_trepctl command (part of the Tungsten Replicator installation). This generates a unified status report for all the hosts and services configured:
shell> multi_trepctl --by-service
| host | servicename | role | state | appliedlastseqno | appliedlatency |
| east1 | east | master | ONLINE | 5 | 0.440 |
| east2 | east | slave | ONLINE | 5 | 0.538 |
| east3 | east | slave | ONLINE | 5 | 0.517 |
| east1 | east_from_west | relay | ONLINE | 23 | 0.074 |
| east2 | east_from_west | slave | ONLINE | 23 | 0.131 |
| east3 | east_from_west | slave | ONLINE | 23 | 0.111 |
| west1 | west | master | ONLINE | 23 | 0.021 |
| west2 | west | slave | ONLINE | 23 | 0.059 |
| west3 | west | slave | ONLINE | 23 | 0.089 |
| west1 | west_from_east | relay | ONLINE | 5 | 0.583 |
| west2 | west_from_east | slave | ONLINE | 5 | 0.562 |
| west3 | west_from_east | slave | ONLINE | 5 | 0.592 |
In the above example, it can be seen that the west services have a higher applied last sequence number than the east services; this is because all the writes have been applied within the west cluster.
To monitor individual servers and/or services, use trepctl with the correct service name. For example, on east1, to check the status of the replicator within the Tungsten Cluster service, use the trepctl services command to get the status of both the local and cross-cluster services:
shell> trepctl services
Processing services command...
NAME VALUE
---- -----
appliedLastSeqno: 6
appliedLatency : 0.43
role : master
serviceName : east
serviceType : local
started : true
state : ONLINE
NAME VALUE
---- -----
appliedLastSeqno: 4
appliedLatency : 1837.999
role : relay
serviceName : east_from_west
serviceType : local
started : true
state : ONLINE
Finished services command...
To get a more detailed status, you must explicitly specify the service
shell> trepctl -service east_from_west status
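When watching for a particular value, the detailed status output can be filtered with standard shell tools; a convenience sketch only (the field names are those shown in the output above):
shell> trepctl -service east_from_west status | grep -E 'state|appliedLatency|appliedLastSeqno'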
For the Tungsten Cluster service, use Section 4.4, “Configuring Startup on Boot”.
Under certain conditions, dataservices in an active/active configuration may drift and/or become inconsistent with the data in another dataservice. If this occurs, you may need to re-provision the data on one or more of the dataservices after first determining the definitive source of the information.
In the following example the west service has been determined to be the definitive copy of the data. To fix the issue, all the datasources in the east service will be reprovisioned from one of the datasources in the west service.
The following is a guide to the steps that should be followed. In the example procedure it is the east service that has failed:
Put the dataservice into MAINTENANCE mode. This ensures that Tungsten Cluster will not attempt to automatically recover the service.
cctrl [east]> set policy maintenance
On the failed east Tungsten Cluster service, put each Tungsten Connector offline:
cctrl [east]> router * offline
Reset the local failed service on all servers connected to the remote failed service. For example, on west{1,2,3} reset the west_from_east service:
shell west> trepctl -service west_from_east offline
shell west> trepctl -service west_from_east reset -all -y
Reset the local service on each server within the failed region (east{1,2,3}):
shell east> trepctl -service east offline
shell east> trepctl -service east reset -all -y
Restore a backup on each host (east{1,2,3}) in the failed east service from a host in the west service:
shell east> tungsten_provision_slave \
    --direct --source=west1
Place all the services on west{1,2,3} back online:
shell west> trepctl -service west_from_east online
On the failed east Tungsten Cluster service, put each Tungsten Connector online:
cctrl [east]> router * online
Set the policy back to automatic:
cctrl> set policy automatic
To reset all of the dataservices and restart the Tungsten Cluster services:
On all hosts (e.g. east{1,2,3} and west{1,2,3}):
shell> replicator stop
shell> tpm reset
shell> replicator start
Performing maintenance on the dataservice, for example updating the MySQL configuration file, can be achieved in a similar sequence to that shown in Section 6.15, “Performing Database or OS Maintenance”, except that you must also restart the corresponding Tungsten Replicator service after the main Tungsten Cluster service has been placed back online.
For example, to perform maintenance on the east service:
Put the dataservice into MAINTENANCE mode. This ensures that Tungsten Cluster will not attempt to automatically recover the service.
cctrl [east]> set policy maintenance
Shun the first Replica datasource so that maintenance can be performed on the host.
cctrl [east]> datasource east1 shun
Perform the updates, such as updating my.cnf, changing schemas, or performing other maintenance.
If MySQL configuration has been modified, restart the MySQL service:
cctrl [east]> service host/mysql restart
Bring the host back into the dataservice:
cctrl [east]> datasource host recover
Perform a switch so that the Primary becomes a Replica and can then be shunned and have the necessary maintenance performed:
cctrl [east]> switch
Repeat the previous steps to shun the host, perform maintenance, and then switch again until all the hosts have been updated.
Set the policy back to automatic:
cctrl> set policy automatic
On each host in the other region, manually restart the Tungsten Replicator service, which will have gone offline when MySQL was restarted:
shell> /opt/replicator/tungsten/tungsten-replicator/bin/trepctl -host host -service east online
In the event of a replication fault, the standard cctrl, trepctl and other utility commands in Chapter 9, Command-line Tools can be used to bring the dataservice back into operation. All the tools are safe to use.
If you have to perform any updates or modifications to the stored MySQL data, ensure binary logging has been disabled before running any commands, using:
mysql> SET SESSION SQL_LOG_BIN=0;
This prevents statements and operations from reaching the binary log, so that the operations will not be replicated to other hosts.
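As a minimal session sketch (the UPDATE is a hypothetical local-only change on a hypothetical table; re-enable binary logging afterwards only if the session will be reused for replicated work):
mysql> SET SESSION SQL_LOG_BIN=0;
mysql> UPDATE mydb.mytable SET col = 'value' WHERE id = 1;
mysql> SET SESSION SQL_LOG_BIN=1;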
In a Composite Active/Active topology, a switch or a failover not only promotes a Replica to be a new Primary, but also requires the ability to reconfigure cross-site communications. This process therefore assumes that cross-site communication is online and working. In some situations, it may be possible that cross-site communication is down, or for some reason cross-site replication is in an OFFLINE:ERROR state - for example a DDL or DML statement that worked in the local cluster may have failed to apply in the remote.
If a switch or failover occurs and the process is unable to reconfigure the cross-site replicators, the local switch will still succeed, however the associated cross-site services will be placed into a SHUNNED(SUBSERVICE-SWITCH-FAILED) state.
This guide explains how to recover from this situation.
The examples are based on a 2-cluster topology, named NYC and LONDON, with the composite dataservice named GLOBAL.
The cluster is configured with the following dataservers:
NYC: db1 (Primary), db2 (Replica), db3 (Replica)
LONDON: db4 (Primary), db5 (Replica), db6 (Replica)
The cross-site replicators in both clusters are in an OFFLINE:ERROR state due to failing DDL.
A switch was then issued, promoting db3 as the new Primary in NYC and db5 as the new Primary in LONDON.
When the cluster enters a state where the cross-site services are in an error, output from cctrl will look like the following:
shell> cctrl -expert -multi
[LOGICAL:EXPERT] / > use london_from_nyc
london_from_nyc: session established, encryption=false, authentication=false
[LOGICAL:EXPERT] /london_from_nyc > ls
COORDINATOR[db6:AUTOMATIC:ONLINE]
ROUTERS:
+---------------------------------------------------------------------------------+
|connector@db1[26248](ONLINE, created=0, active=0) |
|connector@db2[14906](ONLINE, created=0, active=0) |
|connector@db3[15035](ONLINE, created=0, active=0) |
|connector@db4[27813](ONLINE, created=0, active=0) |
|connector@db5[4379](ONLINE, created=0, active=0) |
|connector@db6[2098](ONLINE, created=0, active=0) |
+---------------------------------------------------------------------------------+
DATASOURCES:
+---------------------------------------------------------------------------------+
|db5(relay:SHUNNED(SUBSERVICE-SWITCH-FAILED), progress=6, latency=0.219) |
|STATUS [SHUNNED] [2018/03/15 10:27:24 AM UTC] |
+---------------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=relay, master=db3, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+---------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------+
|db4(slave:SHUNNED(SUBSERVICE-SWITCH-FAILED), progress=6, latency=0.252) |
|STATUS [SHUNNED] [2018/03/15 10:27:25 AM UTC] |
+---------------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=slave, master=db5, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+---------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------+
|db6(slave:SHUNNED(SUBSERVICE-SWITCH-FAILED), progress=6, latency=0.279) |
|STATUS [SHUNNED] [2018/03/15 10:27:25 AM UTC] |
+---------------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=slave, master=db4, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+---------------------------------------------------------------------------------+
In the above example, you can see that all services are in the SHUNNED(SUBSERVICE-SWITCH-FAILED) state, and partial reconfiguration has happened.
The Replicators for db4 and db6 should be Replicas of db5; db5 has been correctly reconfigured to point to the new Primary in nyc, db3. The actual state of the cluster in each scenario may be different, depending upon the cause of the loss of cross-site communication. Using the steps below, apply the necessary actions that relate to your own cluster state; if in any doubt, always contact Continuent Support for assistance.
The first step is to ensure the initial replication errors have been resolved and that the replicators are in an online state. The steps to resolve the replicators will depend on the reason for the error; for further guidance on resolving these issues, see Chapter 6, Operations Guide.
From one node, connect into cctrl at the expert level:
shell> cctrl -expert -multi
Next, connect to the cross-site subservice, in this example london_from_nyc:
cctrl> use london_from_nyc
Next, place the service into Maintenance Mode
cctrl> set policy maintenance
Enable override of commands issued
cctrl> set force true
Bring the relay datasource online
cctrl> datasource db5 online
If you need to change the source for the relay replicator to point to the correct, new Primary in the remote cluster, take the replicator offline. If the relay source is already correct, move on to the cluster welcome step below.
cctrl> replicator db5 offline
Change the source of the relay replicator
cctrl> replicator db5 relay nyc/db3
Bring the replicator online
cctrl> replicator db5 online
For each datasource that requires the replicator altering, issue the following commands:
cctrl> replicator {datasource} offline
cctrl> replicator {datasource} slave db5
cctrl> replicator {datasource} online
For example:
cctrl> replicator db4 offline
cctrl> replicator db4 slave db5
cctrl> replicator db4 online
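In the example ls output shown earlier, db6 is still configured with db4 as its master, so it would need the same treatment:
cctrl> replicator db6 offline
cctrl> replicator db6 slave db5
cctrl> replicator db6 online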
Once all replicators are using the correct source, we can then bring the cluster back
cctrl> cluster welcome
Some of the datasources may still be in the SHUNNED state, so for each of those, you can then issue the following:
cctrl> datasource {datasource} online
For example:
cctrl> datasource db4 online
Once all nodes are online, we can then return the cluster to automatic
cctrl> set policy automatic
Repeat this process for the other cross-site subservice if required.
This procedure explains how to add additional clusters to an existing v6.x (or newer) Composite Active/Active configuration.
The example in this procedure adds a new 3-node cluster consisting of nodes db7, db8 and db9 within a service called Tokyo. The existing cluster contains two dataservices, NYC and London, made up of nodes db1, db2, db3 and db4, db5, db6 respectively.
Ensure the new nodes have all the necessary pre-requisites in place, specifically paying attention to the following:
MySQL auto_increment parameters set appropriately on existing and new clusters
All new nodes have full connectivity to the existing nodes and the hosts file contains correct hostnames
All existing nodes have full connectivity to the new nodes and hosts file contains correct hostnames
We need to provision all the new nodes in the new cluster with a backup taken from one node in any of the existing clusters. In this example we are using db6 in the London dataservice as the source for the backup.
Shun and stop the services on the node used for the backup:
db6-shell> cctrl
cctrl> datasource db6 shun
cctrl> replicator db6 offline
cctrl> exit
db6-shell> stopall
db6-shell> sudo service mysqld stop
Next, use whichever method you wish to copy the mysql datafiles from db6 to all the nodes in the new cluster (scp, rsync, xtrabackup etc). Ensure ALL database files are copied.
Once the backup has been copied across, restart the services on db6:
db6-shell> sudo service mysqld start
db6-shell> startall
db6-shell> cctrl
cctrl> datasource db6 recover
cctrl> exit
Ensure all files copied to the target nodes have the correct file ownership.
Start MySQL on the new nodes.
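For example, on each of the new nodes (a sketch only; the data directory, ownership and service name are assumptions that must match your own MySQL installation):
db7-shell> sudo chown -R mysql:mysql /var/lib/mysql
db7-shell> sudo service mysqld start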
Next we need to change the configuration on the existing hosts to include the configuration of the new cluster.
You need to add a new service block that includes the new nodes and append the new service to the composite-datasources parameter in the composite dataservice, all within /etc/tungsten/tungsten.ini.
Example of a new service block and composite-datasources change added to the existing hosts' configuration:
[tokyo]
topology=clustered
master=db7
members=db7,db8,db9
connectors=db7,db8,db9

[global]
topology=composite-multi-master
composite-datasources=nyc,london,tokyo
To avoid any differences in configuration, once the changes have been made to the tungsten.ini on the existing hosts, copy this file from one of the nodes to all the nodes in the new cluster.
Ensure start-and-report is false or not set in the config.
On the 3 new nodes, validate the software:
shell> cd /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> tools/tpm validate
This may produce Warnings that the tracking schemas for the existing cluster already exist - this is OK and they can be ignored. Assuming no other unexpected errors are reported, then go ahead and install the software:
shell> tools/tpm install
Before we start the new cluster, we now need to update the existing clusters.
Put the entire cluster into MAINTENANCE:
shell> cctrl
cctrl> use {composite-dataservice}
cctrl> set policy maintenance
cctrl> ls
COORDINATOR[db3:MAINTENANCE:ONLINE]
london:COORDINATOR[db4:MAINTENANCE:ONLINE]
nyc:COORDINATOR[db3:MAINTENANCE:ONLINE]
cctrl> exit
Update the software on each node. This needs to be executed from the software staging directory using the replace-release option, as this will ensure the new cross-site dataservices are set up correctly. Update the Primaries first, followed by the Replicas, cluster by cluster:
shell> cd /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> tools/tpm update --replace-release
On all the nodes in the new cluster, start the software:
shell> startall
Using cctrl, check that the new cluster appears and that all services are correctly showing online; it may take a few moments for the cluster to settle down and start everything.
shell> cctrl
cctrl> use {composite-dataservice}
cctrl> ls
cctrl> exit
Check the output of trepctl and ensure all replicators are online and that the new cross-site services appear in the pre-existing clusters (see the note after this procedure):
shell> trepctl -service {service} status
shell> trepctl services
Place the entire cluster back into AUTOMATIC:
shell> cctrl
cctrl> use {composite-dataservice}
cctrl> set policy automatic
cctrl> ls
COORDINATOR[db2:AUTOMATIC:ONLINE]
london:COORDINATOR[db5:AUTOMATIC:ONLINE]
nyc:COORDINATOR[db2:AUTOMATIC:ONLINE]
cctrl> exit
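Regarding the trepctl checks in the procedure above: based on the a_from_b naming rule described earlier in this chapter, after the update you would expect the new cross-site sub-services to appear roughly as follows (a sketch of the expected service names, not captured output):
shell> trepctl services
# nyc nodes    : nyc, nyc_from_london, nyc_from_tokyo
# london nodes : london, london_from_nyc, london_from_tokyo
# tokyo nodes  : tokyo, tokyo_from_nyc, tokyo_from_london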
Composite Dynamic Active/Active builds on the foundation of the Composite Active/Active topology and the cluster continues to operate and be configured in the same way.
The difference is, with Composite Dynamic Active/Active, the cluster instructs the Proxy layer to behave like a Composite Active/Passive cluster.
Within your configuration you specify write affinity to a single cluster, meaning that all reads will continue to balance between local replicas, but all writes will be directed to only one cluster.
The diagram below shows how a Composite Dynamic Active/Active would behave in a typical 2-cluster configuration.
The benefit of a Composite Dynamic Active/Active cluster is that, by directing writes to only one cluster, it avoids all the inherent risks of a true Active/Active deployment, such as conflicts when the same row is altered in both clusters.
This is especially useful for deployments that do not have the ability to avoid potential conflicts programmatically.
The additional benefit this topology offers is instant failover of writes in the event of a cluster failure. In Composite Dynamic Active/Active if the cluster with write affinity fails, writes instantly failover to the other cluster, and because that cluster is open for writes, applications will continue uninterrupted. This differs from a Composite Active/Passive where in the event of a cluster failure there needs to be a manual failover process to re-route write operations.
To use Composite Dynamic Active/Active you need to have a Composite Active/Active cluster deployed, then it is simply a case of specifying the required affinity within the connectors.
For the purpose of this example we will assume we have two clusters, alpha and beta. Each cluster will have two connectors and it is desired that the alpha cluster be the primary write destination.
Within the configuration for the connectors, add the following:
=> On alpha nodes:
connector-write-affinity=alpha
connector-read-affinity=alpha
=> On beta nodes:
connector-write-affinity=alpha
connector-read-affinity=beta
This will have the effect of setting the write affinity to the alpha cluster primarily on both the alpha and beta clusters, as follows:
The alpha cluster will get both read and write affinity to alpha.
The beta cluster will get write affinity to alpha, but maintain read affinity to beta.
After recovering a failed site
As outlined above, if the site that has write affinity fails, read-write traffic will failover to another site based on the affinity rules configured. Following recovery of the site that is configured as the primary write site, new connections will follow the write affinity rules, whereas existing connections will remain on the site that was promoted after failover.
To maintain data-integrity and to ensure writes continue to only be directed to a single site, it is therefore essential to also enable the following tpm property:
--connector-reset-when-affinity-back=true
With this enabled, following recovery of the primary write site, all connections (new and old) will revert to the original, intended, cluster configured with primary write affinity.
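For an INI-based installation this is simply the corresponding key in the configuration file, followed by the usual update; a sketch only (placement within [defaults] is an assumption, adjust to your own layout):
shell> vi /etc/tungsten/tungsten.ini
[defaults]
...
connector-reset-when-affinity-back=true

shell> tpm update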
In the case of the alpha cluster failing, the writes will failover and redirect to the beta cluster.
Testing DAA in Bridge Mode
When using Bridge mode (the default at install), all requests are routed to the Primary by default. To test query routing, run the following query when connected through the Connector:
Route to the Primary:
mysql> select @@hostname;
In Bridge mode, the only way to verify that reads are being directed to replicas is to establish a read-only port and execute queries through it to force the QoS RO_RELAXED.
First, ensure that your INI file has the following option, then run tpm update
connector-readonly-listen-port=3307
To test, ensure you connect to the specified read-only port:
Route to a Replica:
shell> mysql -h... -P3307
mysql> select @@hostname;
Testing DAA in Proxy Mode with No R/W Splitting Enabled
To test Connector query routing in Proxy mode, you may use the URL-based R/W splitting to test query routing:
Route to the Primary:
shell> mysql -h... -Dtest@qos=RW_STRICT -e "select @@hostname;"
Route to a Replica:
shell> mysql -h... -Dtest@qos=RO_RELAXED -e "select @@hostname;"
Testing DAA in Proxy Mode with R/W Splitting Enabled (SmartScale or @direct)
To test Connector query routing in Proxy mode when either SmartScale or @direct read/write splitting has been enabled, you may use the following:
Route to the Primary:
mysql> select @@hostname for update;
Route to a Replica:
mysql> select @@hostname;
Manual Site-Level Switch
For DAA to work properly, all writes must go to one cluster or another, no exceptions. When you want to move all writes to another site/cluster (like you would in a Composite Active/Passive cluster using the switch command at the composite level), there is no switch command available in Dynamic Active/Active.
As of version 7.0.2, we strongly recommend that you use the cctrl command datasource SERVICE drain [optional timeout in seconds] at the composite level to shun the currently selected Active cluster. This will allow the Connector to finish (drain) all in-flight queries, shun the composite dataservice once fully drained, and then move all writes to another cluster.
Please note that this is different than using the cctrl command datasource SERVICE shun (available prior to version 7.0.2) at the composite level to shun the currently selected Active cluster. Using shun instead of drain will force the Connector to immediately sever/terminate all in-flight queries, then move all writes to another cluster.
shell> cctrl -multi
Tungsten Clustering 7.0.2 build 145
beta: session established, encryption=true, authentication=true
jgroups: encrypted, database: encrypted
[LOGICAL] / > use world
[LOGICAL] /world > datasource alpha drain 30
WARNING: This is an expert-level command:
Incorrect use may cause data corruption or make the cluster unavailable.
Do you want to continue? (y/n)> y
composite data source 'alpha' is now SHUNNED
[LOGICAL] /world > exit
Exiting...
When you are ready to resume writes to the originally-configured site, use the composite-level cctrl command datasource SERVICE welcome. If you have set --connector-reset-when-affinity-back=true, then writes will move back to the original site. If set to false, the writes will stay where they are.
shell> cctrl -multi
Tungsten Clustering 7.0.2 build 145
beta: session established, encryption=true, authentication=true
jgroups: encrypted, database: encrypted
[LOGICAL] / > use world
[LOGICAL] / > datasource alpha welcome
WARNING: This is an expert-level command:
Incorrect use may cause data corruption or make the cluster unavailable.
Do you want to continue? (y/n)> y
composite data source 'alpha' is now ONLINE
[LOGICAL] /world > exit
Exiting...
For more information about the datasource shun command, please visit: Section 9.1.3.5.9, “cctrl datasource shun Command”
For more information about the datasource drain command, please visit: Section 9.1.3.5.3, “cctrl datasource drain Command”
An independent Tungsten Connector installation can be useful when you want to create a connector service that provides HA and load balancing, but which operates independently of the main cluster. Specifically, this solution is used within disaster recovery and multi-site operations where the connector may be operating across site-boundaries independently of the dataservice at each site.
The independent nature is in terms of the configuration of the overall service through tpm; an independent connector configured to communicate with existing cluster hosts will be managed by the managers of the cluster. But, the connector will not be updated when performing a tpm update operation within the configured cluster. This allows the connector to work through upgrade procedures to minimize downtime.
To create an independent connector, tpm is used to create a definition for a cluster including the datasources, and specifying only a single connector host, then installing Tungsten Cluster on only the connector host. If you do not configure it in this way, tpm will install a full Tungsten Cluster service across all the implied members of the cluster.
Install the Tungsten Cluster package or download the Tungsten Cluster tarball, and unpack it:
shell> cd /opt/continuent/software
shell> tar zxf tungsten-clustering-7.1.4-10.tar.gz
Change to the Tungsten Cluster directory:
shell> cd tungsten-clustering-7.1.4-10
Run tpm to perform the installation, using either the staging method or the INI method. Review Section 10.1, “Comparing Staging and INI tpm Methods” for more details on these two methods.
Both methods are shown in the examples below.
shell> ./tools/tpm configure defaults \
    --reset \
    --user=tungsten \
    --profile-script=~/.bashrc \
    --application-user=app_user \
    --application-password=secret \
    --application-port=3306 \
    --replication-port=13306 \
    --install-directory=/opt/continuent
shell> ./tools/tpm configure alpha \
    --connectors=connectorhost1 \
    --master=host1 \
    --members=host1,host2,host3
shell> vi /etc/tungsten/tungsten.ini
[defaults]
user=tungsten
profile-script=~/.bashrc
application-user=app_user
application-password=secret
application-port=3306
replication-port=13306
install-directory=/opt/continuent

[alpha]
connectors=connectorhost1
master=host1
members=host1,host2,host3
Configuration group defaults
The description of each of the options is shown below:
For staging configurations, deletes all pre-existing configuration information between updating with the new configuration values.
System User
Append commands to include env.sh in this profile script
Database username for the connector
Database password for the connector
Port for the connector to listen on
The network port used to connect to the database server. The default port used depends on the database being configured.
--install-directory=/opt/continuent
install-directory=/opt/continuent
Path to the directory where the active deployment will be installed. The configured directory will contain the software, THL and relay log information unless configured otherwise.
Configuration group alpha
The description of each of the options is shown below:
Hostnames for the dataservice connectors
The hostname of the primary (extractor) within the current service.
Hostnames for the dataservice members
The above creates a configuration specifying the datasources, host{1,2,3}, and a single connector host based on the hostname of the installation host. Note that the application and datasource port configuration are the same as required by a typical Tungsten Cluster configuration. The values above are identical to those used in the Section 3.1, “Deploying Standalone HA Clusters” deployment.
Run tpm to install the software with the configuration.
shell > ./tools/tpm install
During the startup and installation, tpm will notify you of any problems that need to be fixed before the service can be correctly installed and started. If the service starts correctly, you should see the configuration and current status of the service.
Initialize your PATH and environment:
shell> source /opt/continuent/share/env.sh
Start the connector service:
shell> connector start
Once started:
The connector will appear in the output of, and be managed by, any manager host using the cctrl tool. For example:
[LOGICAL] /dsone > ls
COORDINATOR[host1:AUTOMATIC:ONLINE]
ROUTERS:
+----------------------------------------------------------------------------+
|connector@connector2[16019](ONLINE, created=0, active=0) |
|connector@host1[18450](ONLINE, created=19638, active=0) |
|connector@host2[1995](ONLINE, created=0, active=0) |
|connector@host3[8895](ONLINE, created=0, active=0) |
+----------------------------------------------------------------------------+
...
The active status of the connector can be monitored using cctrl as normal.
Updates to the main cluster will not update the Tungsten Cluster software of the standalone connector. The standalone connector must be updated independently of the remainder of the Tungsten Cluster dataservice.
Connector can be accessed using the connector host and specified port:
shell> mysql -utungsten -p -hconnector -P3306
The user.map authorization file must be created and managed separately on standalone connectors. For more information, see Section 7.6, “User Authentication”.
Ensure the new host that is being added has been configured following the Appendix B, Prerequisites.
Update the configuration using tpm, adding the new host to the list of --members, --hosts, and --connectors, if applicable.
If using the staging method of deployment, you can use +=, which appends the host to the existing deployment as shown in the example below; an INI-based example follows it.
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging USER is `tpm query staging| cut -d: -f1 | cut -d@ -f1`
The staging USER is tungsten
shell> echo The staging HOST is `tpm query staging| cut -d: -f1 | cut -d@ -f2`
The staging HOST is db1
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> ssh {STAGING_USER}@{STAGING_HOST}
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm configure alpha \
--members+=host4 \
--hosts+=host4 \
--connectors+=host4 \
Run the tpm command to update the software with the Staging-based configuration:
shell> ./tools/tpm update --no-connectors
For information about making updates when using a Staging-method deployment, please see Section 10.3.7, “Configuration Changes from a Staging Directory”.
shell> vi /etc/tungsten/tungsten.ini
[alpha]
...
members=host1,host2,host3,host4
hosts=host1,host2,host3,host4
connectors=host1,host2,host3,host4
Run the tpm command to update the software with the INI-based configuration:
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm update --no-connectors
For information about making updates when using an INI file, please see Section 10.4.4, “Configuration Changes with an INI file”.
Using the --no-connectors option updates the current deployment without restarting the existing connectors.
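When --no-connectors is used, remember to restart each connector at a convenient time afterwards, for example:
shell> connector restart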
Initially, the newly added host will attempt to read the information from the existing THL. If the full THL is not available from the Primary, the new Replica will need to be reprovisioned:
Log into the new host.
Execute tprovision to read the information from an existing Replica and overwrite the data within the new host:
shell> tprovision --source=host2
NOTE >>Put alpha replication service offline
NOTE >>Create a mysqldump backup of host2 in /opt/continuent/backups/provision_mysqldump_2019-01-17_17-27_96
NOTE >>host2>>Create mysqldump in /opt/continuent/backups/provision_mysqldump_2019-01-17_17-27_96/provision.sql.gz
NOTE >>Load the mysqldump file
NOTE >>Put the alpha replication service online
NOTE >>Clear THL and relay logs for the alpha replication service
Once the new host has been added and re-provisioned, check the status in cctrl:
[LOGICAL] /alpha > ls
COORDINATOR[host1:AUTOMATIC:ONLINE]
ROUTERS:
+----------------------------------------------------------------------------+
|connector@host1[11401](ONLINE, created=0, active=0) |
|connector@host2[8756](ONLINE, created=0, active=0) |
|connector@host3[21673](ONLINE, created=0, active=0) |
+----------------------------------------------------------------------------+
DATASOURCES:
+----------------------------------------------------------------------------+
|host1(master:ONLINE, progress=219, THL latency=1.047) |
|STATUS [OK] [2018/12/13 04:16:17 PM GMT] |
+----------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=master, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+----------------------------------------------------------------------------+
+----------------------------------------------------------------------------+
|host2(slave:ONLINE, progress=219, latency=1.588) |
|STATUS [OK] [2018/12/13 04:16:17 PM GMT] |
+----------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=slave, master=host1, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+----------------------------------------------------------------------------+
+----------------------------------------------------------------------------+
|host3(slave:ONLINE, progress=219, latency=2.021) |
|STATUS [OK] [2018/12/13 04:16:18 PM GMT] |
+----------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=slave, master=host1, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+----------------------------------------------------------------------------+
+----------------------------------------------------------------------------+
|host4(slave:ONLINE, progress=219, latency=1.000) |
|STATUS [OK] [2019/01/17 05:28:54 PM GMT] |
+----------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=slave, master=host1, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+----------------------------------------------------------------------------+
If the host has not come up, or the progress does not match the Primary, check Section 6.6, “Datasource Recovery Steps” for more information on determining the exact status and what steps to take to enable the host operation.
To add active witnesses to an Existing Deployment, use tpm to update the configuration, adding the list of active witnesses and the list of all members within the updated dataservice configuration.
Active Witness hosts must have been prepared using the notes provided in Appendix B, Prerequisites. Active witnesses must be able to resolve the hostnames of the other managers and hosts in the dataservice. Installation will fail if prerequisites and host availability and stability cannot be confirmed.
Update the configuration using tpm, adding the new host to the list of members.
If using the staging method of deployment, you can use +=, which appends the host to the existing deployment as shown in the example below; an INI-based example follows it.
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging USER is `tpm query staging| cut -d: -f1 | cut -d@ -f1`
The staging USER is tungsten
shell> echo The staging HOST is `tpm query staging| cut -d: -f1 | cut -d@ -f2`
The staging HOST is db1
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> ssh {STAGING_USER}@{STAGING_HOST}
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm configure alpha \
--members+=host4 \
--witnesses=host4 \
--enable-active-witnesses=true \
Run the tpm command to update the software with the Staging-based configuration:
shell> ./tools/tpm update --no-connectors
For information about making updates when using a Staging-method deployment, please see Section 10.3.7, “Configuration Changes from a Staging Directory”.
shell> vi /etc/tungsten/tungsten.ini
[alpha]
...
members=host1,host2,host3,host4
witnesses=host4
enable-active-witnesses=true
Run the tpm command to update the software with the INI-based configuration:
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm update --no-connectors
For information about making updates when using an INI file, please see Section 10.4.4, “Configuration Changes with an INI file”.
Using the --no-connectors option updates the current deployment without restarting the existing connectors.
Once installation has completed successfully, and the manager service has started on each configured active witness, the status can be determined using ls within cctrl:
[LOGICAL] /alpha > ls
COORDINATOR[host1:AUTOMATIC:ONLINE]
ROUTERS:
+----------------------------------------------------------------------------+
|connector@host1[20446](ONLINE, created=0, active=0) |
|connector@host2[21698](ONLINE, created=0, active=0) |
|connector@host3[30354](ONLINE, created=0, active=0) |
+----------------------------------------------------------------------------+
DATASOURCES:
+----------------------------------------------------------------------------+
|host1(slave:ONLINE, progress=8946, latency=0.000) |
|STATUS [OK] [2018/12/05 04:27:47 PM GMT] |
+----------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=slave, master=host3, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+----------------------------------------------------------------------------+
+----------------------------------------------------------------------------+
|host2(slave:ONLINE, progress=8946, latency=0.334) |
|STATUS [OK] [2018/12/05 04:06:59 PM GMT] |
+----------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=slave, master=host3, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+----------------------------------------------------------------------------+
+----------------------------------------------------------------------------+
|host3(master:ONLINE, progress=8946, THL latency=0.331) |
|STATUS [OK] [2018/11/20 05:39:14 PM GMT] |
+----------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=master, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+----------------------------------------------------------------------------+
WITNESSES:
+----------------------------------------------------------------------------+
|host4(witness:ONLINE) |
+----------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
+----------------------------------------------------------------------------+
Validation of the cluster with the new witnesses can be verified by using the cluster validate command within cctrl.
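For example, from cctrl on one of the database nodes (output will vary by deployment):
shell> cctrl
cctrl> cluster validate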
This section explains the simple process for converting an Active Witness into a full cluster node. This process can be used to either convert the existing node or replace the witness with a new node.
First, place the cluster into MAINTENANCE mode.
shell> cctrl
cctrl> set policy maintenance
Stop the software on the existing Witness node
shell> stopall
Whether you are converting this host, or adding a new host, ensure any additional pre-requisites that are needed for a full cluster node are in place, for example that MySQL has been installed.
INI Install
If you are using an ini file for configuration, update the ini on all nodes (including connectors) removing the witness properties and placing the new host as part of the cluster configuration, example below. Skip to Staging Install further down for Staging steps.
Before:
[defaults]
user=tungsten
home-directory=/opt/continuent
application-user=app_user
application-password=secret
application-port=3306
profile-script=~/.bash_profile
replication-user=tungsten
replication-password=secret
replication-port=13306
mysql-allow-intensive-checks=true

[nyc]
enable-active-witnesses=true
topology=clustered
master=db1
members=db1,db2,db3
witnesses=db3
connectors=db1,db2,db3
After:
[defaults]
user=tungsten
home-directory=/opt/continuent
application-user=app_user
application-password=secret
application-port=3306
profile-script=~/.bash_profile
replication-user=tungsten
replication-password=secret
replication-port=13306
mysql-allow-intensive-checks=true

[nyc]
topology=clustered
master=db1
members=db1,db2,db3
connectors=db1,db2,db3
Update the software on the existing cluster nodes and connector nodes (if separate). Include --no-connectors if you want to manually restart the connectors when convenient.
shell> cd /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> tools/tpm update --replace-release
Either install on the new host or update on the previous Witness host:
shell> cd /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> tools/tpm install
or:
shell> cd /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> tools/tpm update --replace-release -f
Staging Install
If you are using a staging configuration, update the configuration from the staging host, example below:
shell> cd {STAGING_DIRECTORY}
./tools/tpm configure defaults \
--reset \
--user=tungsten \
--home-directory=/opt/continuent \
--application-user=app_user \
--application-password=secret \
--application-port=3306 \
--profile-script=~/.bash_profile \
--replication-user=tungsten \
--replication-password=secret \
--replication-port=13306 \
--mysql-allow-intensive-checks=true
./tools/tpm configure nyc \
--topology=clustered \
--master=db1 \
--members=db1,db2,db3 \
--connectors=db1,db2,db3
Update the software on the existing cluster nodes. Include --no-connectors if connectors co-exist on database nodes and you want to manually restart them when convenient.
shell> cd {STAGING_DIRECTORY}
shell> tools/tpm update --replace-release --hosts=db1,db2
Either install on the new host or update on the previous Witness host:
shell> cd /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> tools/tpm install --hosts=db3
or:
shell> cd /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> tools/tpm update --replace-release -f --hosts=db3
Once the software has been installed, you need to restore a backup of the database onto the node, or provision the database using the provided scripts. Either restore an existing backup, create and restore a new backup, or use tprovision to provision the database on the host.
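For example, a minimal sketch of provisioning the node with tprovision, assuming db2 is a healthy Replica to copy from (hostnames are illustrative):
shell> tprovision -s db2   # -s selects the source node to copy from; run on the node being provisioned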
Start the software on the new node/old witness node
shell> startall
If you issued --no-connectors
during the update, restart the connectors when convenient
shell> connector restart
Using cctrl from one of the existing database nodes, check that the status returns the expected output. If it does, return the cluster to AUTOMATIC and the process is complete. If the output is not correct, this is usually due to metadata files not updating; in that case, issue the following on every node:
shell> tungsten_reset_manager
This will clean the metadata files and stop the manager process. Once the script has completed on all nodes, restart the manager process on each node, one-by-one, starting with the Primary node first, followed by the Replicas:
shell> manager start
Finally, return the cluster to AUTOMATIC. If the reset process above was performed, it may take a minute or two for the ls output of cctrl to update whilst the metadata files are refreshed.
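For reference, the cluster is returned to AUTOMATIC from within cctrl, for example:
shell> cctrl
cctrl> set policy automatic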
This section explains the simple process for converting a full cluster node into an Active Witness.
First, place the cluster into MAINTENANCE
mode.
shell>cctrl
cctrl>set policy maintenance
Stop the software on the existing cluster node
shell> stopall
Stop MySQL on the existing cluster node (Syntax is an example and may differ in your environment)
shell> systemctl stop mysqld
INI Install
If you are using an ini file for configuration, update the ini on all nodes (including connectors), changing the reference to the node to be a witness node, as per the example below. Skip to Staging Install further down for Staging steps.
Before:
[defaults]
user=tungsten
home-directory=/opt/continuent
application-user=app_user
application-password=secret
application-port=3306
profile-script=~/.bash_profile
replication-user=tungsten
replication-password=secret
replication-port=13306
mysql-allow-intensive-checks=true

[nyc]
topology=clustered
master=db1
members=db1,db2,db3
connectors=db1,db2,db3
After:
[defaults]
user=tungsten
home-directory=/opt/continuent
application-user=app_user
application-password=secret
application-port=3306
profile-script=~/.bash_profile
replication-user=tungsten
replication-password=secret
replication-port=13306
mysql-allow-intensive-checks=true

[nyc]
enable-active-witnesses=true
topology=clustered
master=db1
members=db1,db2,db3
witnesses=db3
connectors=db1,db2,db3
Update the software on the existing cluster nodes and connector nodes (if separate). Include --no-connectors if connectors co-exist on the database nodes and you want to manually restart them when convenient.
shell> cd /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> tools/tpm update --replace-release
Update on the host you are converting:
shell> cd /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> tools/tpm update --replace-release -f
Staging Install
If you are using a staging configuration, update the configuration from the staging host, example below:
shell> cd {STAGING_DIRECTORY}
./tools/tpm configure defaults \
--reset \
--user=tungsten \
--home-directory=/opt/continuent \
--application-user=app_user \
--application-password=secret \
--application-port=3306 \
--profile-script=~/.bash_profile \
--replication-user=tungsten \
--replication-password=secret \
--replication-port=13306 \
--mysql-allow-intensive-checks=true
./tools/tpm configure nyc \
--enable-active-witnesses=true \
--topology=clustered \
--master=db1 \
--members=db1,db2,db3 \
--witnesses=db3 \
--connectors=db1,db2,db3
Update the software on the existing cluster nodes. Include --no-connectors
if connectors
co-exist on database nodes and you want to manually restart them when convenient.
shell>cd {STAGING_DIRECTORY}
shell>tools/tpm update --replace-release --hosts=db1,db2
Update on the host you are converting:
shell> cd /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> tools/tpm update --replace-release -f --hosts=db3
Once the updates have been completed, you should then run the tungsten_reset_manager command on each node in the entire cluster. This will ensure the metadata is clean and that the reference to the node reflects that it is now a witness, rather than a full cluster node. On each node, simply execute the command and follow the on-screen prompts:
shell> tungsten_reset_manager
Restart the managers on the nodes you have not converted:
shell> manager start
Start the software on the node that you converted:
shell> startall
If you issued --no-connectors
during the update, restart the connectors when convenient
shell> connector restart
Using cctrl from one of the existing database nodes, check that the status returns the expected output, then return the cluster to AUTOMATIC and the process is complete.
Adding more connectors to an existing installation allows for increased routing capacity. The new connectors will form part of the cluster and be fully aware and communicate with existing managers and datasources within the cluster.
To add more connectors to an existing deployment:
On the new host, ensure the Appendix B, Prerequisites have been followed.
Update the configuration using tpm, adding the new
host to the list of connectors
If using the staging method of deployment, you can use
+=
, which appends the host to
the existing deployment as shown in the example below. Click the link to switch
between staging and ini type deployment examples.
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10

shell> echo The staging USER is `tpm query staging| cut -d: -f1 | cut -d@ -f1`
The staging USER is tungsten

shell> echo The staging HOST is `tpm query staging| cut -d: -f1 | cut -d@ -f2`
The staging HOST is db1

shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10

shell> ssh {STAGING_USER}@{STAGING_HOST}
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm configure alpha \
--connectors+=host4 \
Run the tpm command to update the software with the Staging-based configuration:
shell> ./tools/tpm update --no-connectors
For information about making updates when using a Staging-method deployment, please see Section 10.3.7, “Configuration Changes from a Staging Directory”.
shell> vi /etc/tungsten/tungsten.ini
[alpha]
...
connectors=host4
Run the tpm command to update the software with the INI-based configuration:
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10

shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10

shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm update --no-connectors
For information about making updates when using an INI file, please see Section 10.4.4, “Configuration Changes with an INI file”.
Using the --no-connectors
option
updates the current deployment without restarting the existing
connectors.
During a period when it is safe to restart the connectors:
shell> ./tools/tpm promote-connector
The status of all the connectors can be monitored using cctrl:
[LOGICAL] /alpha > ls
COORDINATOR[host1:AUTOMATIC:ONLINE]
ROUTERS:
+----------------------------------------------------------------------------+
|connector@host1[8616](ONLINE, created=0, active=0) |
|connector@host2[12381](ONLINE, created=0, active=0) |
|connector@host3[19708](ONLINE, created=0, active=0) |
|connector@host4[5085](ONLINE, created=0, active=0) |
+----------------------------------------------------------------------------+
There are two possible scenarios for converting from a single standalone cluster to a composite cluster. The two following sections will guide you through examples of each of these.
The following steps guide you through updating the configuration to include the new hosts as a new service and convert to a Composite Cluster.
For the purpose of this worked example, we have a single cluster dataservice called east with three nodes, defined as db1, db2 and db3, with db1 as the Primary.
Our goal is to create a new cluster dataservice called west with three nodes, defined as db4, db5 and db6, with db4 as the relay.
We will configure a new composite dataservice called global.
The steps show two alternative approaches: create west as a Passive cluster (Composite Active/Passive), or create west as a second active cluster (Composite Active/Active).
On the new host(s), ensure the Appendix B, Prerequisites have been followed.
If configuring via the Staging Installation method, skip straight to Step 4:
The staging method CANNOT be used if converting to an Active/Active cluster
On the new host(s), ensure the tungsten.ini
contains
the correct service blocks for both the existing cluster and the new cluster.
On the new host(s), install the proper version of clustering software, ensuring that the version being installed matches the version currently installed on the existing hosts.
shell> cd /opt/continuent/software
shell> tar zxvf tungsten-clustering-7.1.4-10.tar.gz
shell> cd tungsten-clustering-7.1.4-10
shell> ./tools/tpm install
Ensure --start-and-report
is set to false
in the configuration for the new hosts.
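As an illustrative INI fragment only (placement in [defaults] is an assumption; adjust to match your own configuration layout), the setting would look like:
[defaults]
...
# prevent the new hosts from starting automatically after install
start-and-report=false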
Set the existing cluster to maintenance mode using cctrl:
shell>cctrl
[LOGICAL] / >set policy maintenance
Add the definition for the new cluster service west and composite service global to the existing configuration on the existing host(s):
For Composite Active/Passive
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10

shell> echo The staging USER is `tpm query staging| cut -d: -f1 | cut -d@ -f1`
The staging USER is tungsten

shell> echo The staging HOST is `tpm query staging| cut -d: -f1 | cut -d@ -f2`
The staging HOST is db1

shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10

shell> ssh {STAGING_USER}@{STAGING_HOST}
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm configure west \
    --connectors=db4,db5,db6 \
    --relay-source=east \
    --relay=db4 \
    --slaves=db5,db6 \
    --topology=clustered
shell> ./tools/tpm configure global \
    --composite-datasources=east,west
Run the tpm command to update the software with the Staging-based configuration:
shell> ./tools/tpm update --no-connectors --replace-release
For information about making updates when using a Staging-method deployment, please see Section 10.3.7, “Configuration Changes from a Staging Directory”.
shell> vi /etc/tungsten/tungsten.ini
[west]
...
connectors=db4,db5,db6
relay-source=east
relay=db4
slaves=db5,db6
topology=clustered

[global]
...
composite-datasources=east,west
Run the tpm command to update the software with the INI-based configuration:
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10

shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10

shell> cd {STAGING_DIRECTORY}
shell>./tools/tpm update --no-connectors --replace-release
For information about making updates when using an INI file, please see Section 10.4.4, “Configuration Changes with an INI file”.
For Composite Active/Active
shell> vi /etc/tungsten/tungsten.ini
[west]
topology=clustered
connectors=db4,db5,db6
master=db4
members=db4,db5,db6

[global]
topology=composite-multi-master
composite-datasources=east,west
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10

shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10

shell> cd {STAGING_DIRECTORY}
shell>./tools/tpm update --no-connectors --replace-release
Using the optional --no-connectors
option
updates the current deployment without restarting the existing
connectors.
Using the --replace-release
option
ensures the metadata files for the cluster are correctly rebuilt.
This parameter MUST be supplied.
On every node in the original EAST cluster, make sure all replicators are online:
shell>trepctl services
shell>trepctl -all-services online
On all the new hosts in the new cluster, start the manager processes ONLY
shell> manager start
From the original cluster, use cctrl to check that the new dataservice and composite dataservice have been created, and place the new dataservice into maintenance mode
shell>cctrl
cctrl>cd /
cctrl>ls
cctrl>use global
cctrl>ls
cctrl>datasource east online
cctrl>set policy maintenance
Example from a Composite Active/Passive Cluster
tungsten@db1:~ $ cctrl
Tungsten Clustering 7.1.4 build 10
east: session established, encryption=false, authentication=false

[LOGICAL] /east > cd /
[LOGICAL] / > ls
global
east
west

[LOGICAL] / > use global
[LOGICAL] /global > ls
COORDINATOR[db3:MIXED:ONLINE]
east:COORDINATOR[db3:MAINTENANCE:ONLINE]
west:COORDINATOR[db5:AUTOMATIC:ONLINE]

ROUTERS:
+---------------------------------------------------------------------------------+
|connector@db1[9493](ONLINE, created=0, active=0) |
|connector@db2[9341](ONLINE, created=0, active=0) |
|connector@db3[10675](ONLINE, created=0, active=0) |
+---------------------------------------------------------------------------------+

DATASOURCES:
+---------------------------------------------------------------------------------+
|east(composite master:OFFLINE) |
|STATUS [OK] [2019/12/09 11:04:17 AM UTC] |
+---------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------+
|west(composite slave:OFFLINE) |
|STATUS [OK] [2019/12/09 11:04:17 AM UTC] |
+---------------------------------------------------------------------------------+

REASON FOR MAINTENANCE MODE: MANUAL OPERATION

[LOGICAL] /global > datasource east online
composite data source 'east@global' is now ONLINE

[LOGICAL] /global > set policy maintenance
policy mode is now MAINTENANCE
Example from a Composite Active/Active Cluster
tungsten@db1:~ $ cctrl
Tungsten Clustering 7.1.4 build 10
east: session established, encryption=false, authentication=false

[LOGICAL] /east > cd /
[LOGICAL] / > ls
global
east
east_from_west
west
west_from_east

[LOGICAL] / > use global
[LOGICAL] /global > ls
COORDINATOR[db3:MIXED:ONLINE]
east:COORDINATOR[db3:MAINTENANCE:ONLINE]
west:COORDINATOR[db4:AUTOMATIC:ONLINE]

ROUTERS:
+---------------------------------------------------------------------------------+
|connector@db1[23431](ONLINE, created=0, active=0) |
|connector@db2[25535](ONLINE, created=0, active=0) |
|connector@db3[15353](ONLINE, created=0, active=0) |
+---------------------------------------------------------------------------------+

DATASOURCES:
+---------------------------------------------------------------------------------+
|east(composite master:OFFLINE, global progress=10, max latency=1.043) |
|STATUS [OK] [2024/08/13 11:05:01 AM UTC] |
+---------------------------------------------------------------------------------+
| east(master:ONLINE, progress=10, max latency=1.043) |
| east_from_west(UNKNOWN:UNKNOWN, progress=-1, max latency=-1.000) |
+---------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------+
|west(composite master:ONLINE, global progress=-1, max latency=-1.000) |
|STATUS [OK] [2024/08/13 11:07:56 AM UTC] |
+---------------------------------------------------------------------------------+
| west(UNKNOWN:UNKNOWN, progress=-1, max latency=-1.000) |
| west_from_east(UNKNOWN:UNKNOWN, progress=-1, max latency=-1.000) |
+---------------------------------------------------------------------------------+

REASON FOR MAINTENANCE MODE: MANUAL OPERATION

[LOGICAL] /global > datasource east online
composite data source 'east@global' is now ONLINE

[LOGICAL] /global > set policy maintenance
policy mode is now MAINTENANCE
Start the replicators in the new cluster ensuring they start as OFFLINE:
shell> replicator start offline
Go to the relay (or Primary) node of the new cluster (i.e. db4) and provision it from a Replica of the original cluster (i.e. db2):
Provision the new relay in a Composite Active/Passive Cluster
db4-shell> tprovision -s db2
Provision the new primary in a Composite Active/Active Cluster
db4-shell> tprovision -s db2 -c
Go to each Replica node of the new cluster and provision from the relay node of the new cluster (i.e. db4):
db5-shell> tprovision -s db4
Bring the replicators in the new cluster online, if not already:
shell> trepctl -all-services online
From a node in the original cluster (e.g. db1), using cctrl, set the composite cluster online, if not already, and return to automatic:
shell>cctrl
[LOGICAL] / >use global
[LOGICAL] / >datasource west online
[LOGICAL] / >set policy automatic
Start the connectors associated with the new cluster hosts in west:
shell> connector start
Depending on the mode in which the connectors are running, you may need to configure the user.map. If this is in use on the old cluster, we recommend that you take a copy of this file and place it on the new connectors associated with the new cluster, and then adjust any affinity settings that are required. Additionally, the user.map may need adjustments on the original cluster. For more details on the user.map file, review the relevant sections in the Connector documentation related to the mode your connectors are operating in. These can be found at Section 7.6.1, “user.map File Format”.
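As a purely hypothetical illustration (the application user name and password are placeholders, and the entry follows the user password service [affinity] layout described in Section 7.6.1), an entry giving an application user access to the global composite service with affinity to the west cluster might look like:
# username password dataservice [affinity]
app_user secret global west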
If --no-connectors
was issued during the update, then
during a period when it is safe, restart the connectors associated with
the original cluster:
shell> ./tools/tpm promote-connector
This method of conversion is a little more complicated and the only safe way to accomplish this would require downtime for the replication on all nodes.
To achieve this without downtime to your applications, it is recommended that all application activity be isolated to the Primary host only. Following the conversion, all activity will then be replicated to the Replica nodes
Our example starting cluster has 5 nodes (1 Primary and 4 Replicas) and uses service name alpha. Our target cluster will have 6 nodes (3 per cluster) in 2 member clusters alpha_east and alpha_west in composite service alpha.
This means that we will reuse the existing service name alpha as the name of the new composite service, and create two new service names, one for each cluster (alpha_east and alpha_west).
To convert the above configuration, follow the steps below:
On the new host, ensure the Appendix B, Prerequisites have been followed.
Ensure the cluster is in MAINTENANCE mode. This will prevent the managers from performing any unexpected recovery or failovers during the process.
cctrl> set policy maintenance
Next, you must stop all services on all existing nodes.
shell> stopall
If configuring via the INI Installation Method, update tungsten.ini on all original 5 nodes, then copy the file to the new node.
You will need to create two new services, one for each cluster, and change the original service stanza to represent the composite service. An example of how the complete configuration would look is below. Click the link to switch between INI and staging configurations.
shell> ./tools/tpm configure defaults \
    --reset \
    --user=tungsten \
    --install-directory=/opt/continuent \
    --profile-script=~/.bash_profile \
    --replication-user=tungsten \
    --replication-password=secret \
    --replication-port=13306 \
    --application-user=app_user \
    --application-password=secret \
    --application-port=3306 \
    --rest-api-admin-user=apiuser \
    --rest-api-admin-pass=secret

shell> ./tools/tpm configure alpha_east \
    --topology=clustered \
    --master=db1 \
    --members=db1,db2,db3 \
    --connectors=db1,db2,db3

shell> ./tools/tpm configure alpha_west \
    --topology=clustered \
    --relay=db4 \
    --members=db4,db5,db6 \
    --connectors=db4,db5,db6 \
    --relay-source=alpha_east

shell> ./tools/tpm configure alpha \
    --composite-datasources=alpha_east,alpha_west
shell> vi /etc/tungsten/tungsten.ini
[defaults]
user=tungsten
install-directory=/opt/continuent
profile-script=~/.bash_profile
replication-user=tungsten
replication-password=secret
replication-port=13306
application-user=app_user
application-password=secret
application-port=3306
rest-api-admin-user=apiuser
rest-api-admin-pass=secret

[alpha_east]
topology=clustered
master=db1
members=db1,db2,db3
connectors=db1,db2,db3

[alpha_west]
topology=clustered
relay=db4
members=db4,db5,db6
connectors=db4,db5,db6
relay-source=alpha_east

[alpha]
composite-datasources=alpha_east,alpha_west
Using your preferred backup/restore method, take a backup of the MySQL database on one of the original nodes and restore it to the new node.
If preferred, this step can be skipped, and the provision of the new node completed via the use of the supplied provisioning scripts, explained in Step 10 below.
Invoke the conversion using the tpm command from the software extraction directory.
If the installation was configured via the INI method, this command should be run on all 5 original nodes. If configured via the Staging method, this command should be run on the staging host only.
shell>tpm query staging
shell>cd {software_staging_dir_from_tpm_query}
shell>./tools/tpm update --replace-release --force
shell>rm /opt/continuent/tungsten/cluster-home/conf/cluster/*/datasource/*
The use of the --force option is required to force the override of the old properties.
Only if the installation was configured via the INI method, proceed to install the software using the tpm command from the software extraction directory on the new node:
shell>cd {software_staging_dir}
shell>./tools/tpm install
Ensure the version of software you install on the new node exactly matches the version on the existing 5 nodes.
Start all services on all existing nodes.
shell> startall
Bring the clusters back into AUTOMATIC mode:
shell>cctrl -multi
cctrl>use alpha
cctrl>set policy automatic
cctrl>exit
If you skipped the backup/restore step above, you now need to provision the database on the new node. To do this, use the tungsten_provision_slave script to provision the database from one of the existing nodes, for example db5
shell> tungsten_provision_slave --source db5
If you have an existing dataservice, data can be replicated from a standalone MySQL server into the service. The replication is configured by creating a service that reads from the standalone MySQL server and writes into the cluster through a connector attached to your dataservice. By writing through the connector, changes to the underlying dataservice topology can be handled.
Additionally, using a replicator that writes data into an existing data service can be used when migrating from an existing service into a new Tungsten Cluster service.
For more information on initially provisioning the data for this type of operation, see Section 6.12.2, “Migrating from MySQL Native Replication Using a New Service”.
In order to configure this deployment, there are two steps:
Create a new replicator on the source server that extracts the data.
Create a new replicator that reads the binary logs directly from the external MySQL service through the connector
There are also the following requirements:
The host on which you want to replicate to must have Tungsten Replicator 5.3.0 or later.
Hosts on both the replicator and cluster must be able to communicate with each other.
The replication user on the source host must be granted the RELOAD, REPLICATION SLAVE, and REPLICATION CLIENT privileges.
The replicator must be able to connect as the tungsten user to the databases within the cluster.
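As a sketch of the privilege requirement above (the user name and host pattern are placeholders; follow your own security policy), the grants could be applied on the source MySQL server with standard syntax such as:
mysql> GRANT RELOAD, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'tungsten'@'%';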
When writing into the Primary through the connector, the user must be given the correct privileges to write and update the MySQL server. For this reason, the easiest method is to use the tungsten user, and ensure that user has been added to the user.map:
tungsten secret alpha
Install the Tungsten Replicator package (see Section 2.3.2, “Using the RPM package files”), or download the compressed tarball and unpack it on host1:
shell> cd /opt/replicator/software
shell> tar zxf tungsten-replicator-7.1.4-10.tar.gz
Change to the Tungsten Replicator staging directory:
shell> cd tungsten-replicator-7.1.4-10
Configure the replicator on host1
First we configure the defaults and a cluster alias that points to the Primaries and Replicas within the current Tungsten Cluster service that you are replicating from:
Click the link below to switch examples between Staging and INI methods
shell> ./tools/tpm configure alpha \
--master=host1 \
--install-directory=/opt/continuent \
--replication-user=tungsten \
--replication-password=password \
--enable-batch-service=true
shell> vi /etc/tungsten/tungsten.ini
[alpha]
master=host1
install-directory=/opt/continuent
replication-user=tungsten
replication-password=password
enable-batch-service=true
Configuration group alpha
The description of each of the options is shown below; click the icon to hide this detail:
The hostname of the primary (extractor) within the current service.
--install-directory=/opt/continuent
install-directory=/opt/continuent
Path to the directory where the active deployment will be installed. The configured directory will contain the software, THL and relay log information unless configured otherwise.
For databases that require authentication, the username to use when connecting to the database using the corresponding connection method (native, JDBC, etc.).
--replication-password=password
The password to be used when connecting to the database using
the corresponding
--replication-user
.
This option enables batch mode for a service, which is required for replication services that write to a target database using batch mode in heterogeneous deployments (for example Hadoop, Amazon Redshift or Vertica). Setting this option enables the following settings on each host:
On a Primary:
mysql-use-bytes-for-string is set to false.
The colnames filter is enabled (in the binlog-to-q stage) to add column names to the THL information.
The pkey filter is enabled (in the binlog-to-q and q-to-dbms stages), with the addPkeyToInserts and addColumnsToDeletes filter options set to true. This ensures that rows have the right primary key information.
The enumtostring filter is enabled (in the q-to-thl stage) to translate ENUM values to their string equivalents.
The settostring filter is enabled (in the q-to-thl stage) to translate SET values to their string equivalents.
On a Replica:
mysql-use-bytes-for-string is set to true.
This creates a configuration that specifies that the topology should read directly from the source host, host3, writing directly to host1. An alternative THL port is provided to ensure that the THL listener is not operating on the same network port as the original.
Now install the service, which will create the replicator reading directly from host3 into host1:
shell> ./tools/tpm install
If the installation process fails, check the output of the
/tmp/tungsten-configure.log
file for
more information about the root cause.
Once the installation has been completed, you must update the position of the replicator so that it points to the correct position within the source database to prevent errors during replication. If the replication is being created as part of a migration process, determine the position of the binary log from the external replicator service used when the backup was taken. For example:
mysql> show master status;
*************************** 1. row ***************************
File: mysql-bin.000026
Position: 1311
Binlog_Do_DB:
Binlog_Ignore_DB:
1 row in set (0.00 sec)
Use dsctl set to update the replicator position to point to the Primary log position:
shell> /opt/replicator/tungsten/tungsten-replicator/bin/dsctl -service beta set \
-reset -seqno 0 -epoch 0 \
-source-id host3 -event-id mysql-bin.000026:1311
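If you want to confirm the stored position before starting the replicator, dsctl also provides a get command (a sketch, assuming the same beta service name):
shell> /opt/replicator/tungsten/tungsten-replicator/bin/dsctl -service beta get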
Now start the replicator:
shell> /opt/replicator/tungsten/tungsten-replicator/bin/replicator start
Replication status should be checked by explicitly using the servicename and/or RMI port:
shell> /opt/replicator/tungsten/tungsten-replicator/bin/trepctl -service beta status
Processing status command...
NAME VALUE
---- -----
appliedLastEventId : mysql-bin.000026:0000000000001311;1252
appliedLastSeqno : 5
appliedLatency : 0.748
channels : 1
clusterName : beta
currentEventId : mysql-bin.000026:0000000000001311
currentTimeMillis : 1390410611881
dataServerHost : host1
extensions :
host : host3
latestEpochNumber : 1
masterConnectUri : thl://host3:2112/
masterListenUri : thl://host1:2113/
maximumStoredSeqNo : 5
minimumStoredSeqNo : 0
offlineRequests : NONE
pendingError : NONE
pendingErrorCode : NONE
pendingErrorEventId : NONE
pendingErrorSeqno : -1
pendingExceptionMessage: NONE
pipelineSource : jdbc:mysql:thin://host3:13306/
relativeLatency : 8408.881
resourcePrecedence : 99
rmiPort : 10000
role : master
seqnoType : java.lang.Long
serviceName : beta
serviceType : local
simpleServiceName : beta
siteName : default
sourceId : host3
state : ONLINE
timeInStateSeconds : 8408.21
transitioningTo :
uptimeSeconds : 8409.88
useSSLConnection : false
version : Tungsten Replicator 7.1.4 build 10
Finished status command...
If you have an existing cluster and you want to replicate the data out to a separate standalone server using Tungsten Replicator then you can create a cluster alias, and use a Primary/Replica topology to replicate from the cluster. This allows for THL events from the cluster to be applied to a separate server for the purposes of backup or separate analysis.
During the installation process a cluster-alias and a cluster-slave are declared. The cluster-alias describes all of the servers in the cluster and how they may be reached. The cluster-slave defines one or more servers that will replicate from the cluster.
The Tungsten Replicator will be installed on the Cluster-Extractor server. That server will download THL data and apply it to the local server. If the Cluster-Extractor has more than one server, one of them will be declared the relay (or Primary). The other members of the Cluster-Extractor may also download THL data from that server.
If the relay for the Cluster-Extractor fails, the other nodes will automatically start downloading THL data from a server in the cluster. If a non-relay server fails, it will not have any impact on the other members.
Identify the cluster to replicate from. You will need the Primary, Replicas and THL port (if specified). Use tpm reverse from a cluster member to find the correct values.
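For example, running the following from any existing cluster member prints the configuration in use, from which the Primary, Replicas and THL port can be read:
shell> tpm reverse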
If you are replicating to a non-MySQL server, update the configuration of the cluster to include the following properties prior to beginning:
svc-extractor-filters=colnames,pkey
property=replicator.filter.pkey.addColumnsToDeletes=true
property=replicator.filter.pkey.addPkeyToInserts=true
Identify all servers that will replicate from the cluster. If there is more than one, a relay server should be identified to replicate from the cluster and provide THL data to other servers.
Prepare each server according to the prerequisites for the DBMS platform it is serving. If you are working with multiple DBMS platforms; treat each platform as a different Cluster-Extractor during deployment.
Make sure the THL port for the cluster is open between all servers.
Install the Tungsten Replicator package or download the Tungsten Replicator tarball, and unpack it:
shell> cd /opt/continuent/software
shell> tar zxf tungsten-replicator-7.1.4-10.tar.gz
Change to the unpackaged directory:
shell> cd tungsten-replicator-7.1.4-10
Configure the replicator
Click the link below to switch examples between Staging and INI methods
shell> ./tools/tpm configure defaults \
    --install-directory=/opt/continuent \
    --profile-script=~/.bash_profile \
    --replication-password=secret \
    --replication-port=13306 \
    --replication-user=tungsten \
    --user=tungsten \
    --rest-api-admin-user=apiuser \
    --rest-api-admin-pass=secret

shell> ./tools/tpm configure alpha \
    --master=host1 \
    --slaves=host2,host3 \
    --thl-port=2112 \
    --topology=cluster-alias

shell> ./tools/tpm configure beta \
    --relay=host6 \
    --relay-source=alpha \
    --topology=cluster-slave
shell> vi /etc/tungsten/tungsten.ini
[defaults]
install-directory=/opt/continuent
profile-script=~/.bash_profile
replication-password=secret
replication-port=13306
replication-user=tungsten
user=tungsten
rest-api-admin-user=apiuser
rest-api-admin-pass=secret

[alpha]
master=host1
slaves=host2,host3
thl-port=2112
topology=cluster-alias

[beta]
relay=host6
relay-source=alpha
topology=cluster-slave
Configuration group defaults
The description of each of the options is shown below; click the icon to hide this detail:
--install-directory=/opt/continuent
install-directory=/opt/continuent
Path to the directory where the active deployment will be installed. The configured directory will contain the software, THL and relay log information unless configured otherwise.
--profile-script=~/.bash_profile
profile-script=~/.bash_profile
Append commands to include env.sh in this profile script
The password to be used when connecting to the database using
the corresponding
--replication-user
.
The network port used to connect to the database server. The default port used depends on the database being configured.
For databases that require authentication, the username to use when connecting to the database using the corresponding connection method (native, JDBC, etc.).
System User
Configuration group alpha
The description of each of the options is shown below; click the icon to hide this detail:
The hostname of the primary (extractor) within the current service.
What are the Replicas for this dataservice?
Port to use for THL Operations
Replication topology for the dataservice.
Configuration group beta
The description of each of the options is shown below; click the icon to hide this detail:
The hostname of the primary (extractor) within the current service.
Dataservice name to use as a relay source
Replication topology for the dataservice.
If you are replicating to a non-MySQL server, include the following steps in your configuration.
shell>mkdir -p /opt/continuent/share/
shell>cp tungsten-replicator/support/filters-config/convertstringfrommysql.json » /opt/continuent/share/
Then, include the following parameters in the configuration
property=replicator.stage.remote-to-thl.filters=convertstringfrommysql
property=replicator.filter.convertstringfrommysql.definitionsFile= »
/opt/continuent/share/convertstringfrommysql.json
This dataservice cluster-alias name MUST be the same as the cluster dataservice name that you are replicating from.
Do not include start-and-report=true if you are taking over for MySQL native replication. See Section 6.12.1, “Migrating from MySQL Native Replication 'In-Place'” for next steps after completing installation.
Once the configuration has been completed, you can perform the installation to set up the services using this configuration:
shell> ./tools/tpm install
During the installation and startup, tpm will notify you of any problems that need to be fixed before the service can be correctly installed and started. If the service starts correctly, you should see the configuration and current status of the service.
If the installation process fails, check the output of the
/tmp/tungsten-configure.log
file for
more information about the root cause.
The cluster should be installed and ready to use.
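As a quick sanity check (assuming the beta service name used in the examples above), the state of the new Cluster-Extractor service can be inspected with trepctl:
shell> trepctl services
shell> trepctl -service beta status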
You can replicate data from an existing cluster to a datawarehouse such as Hadoop or Vertica. A replication applier node handles the datawarehouse loading by obtaining THL from the cluster. The configuration of the cluster needs to be changed to be compatible with the required target applier format.
The Cluster-Extractor deployment works by configuring the cluster replication service in heterogeneous mode, and then replicating out to the Appliers that writes into the datawarehouse by using a cluster alias. This ensures that changes to the cluster topology (i.e. Primary switches during a failover or maintenance) still allow replication to continue effectively to your chosen datawarehouse.
The datawarehouse may be installed and running on the same host as the replicator, "Onboard", or on a different host entirely, "Offboard".
Below is a summary of the steps needed to configure the Cluster-Extractor topology, with links to the actual procedures included:
Install or update a cluster, configured to operate in heterogeneous mode.
In our example, the cluster configuration file
/etc/tungsten/tungsten.ini
would contain two
stanzas:
[defaults]
- contains configuration values used
by all services.
[alpha]
- contains cluster configuration
parameters, and will use
topology=clustered
to indicate to
the tpm command that nodes listed in this stanza
are to be acted upon during installation and update operations.
For more details about installing the source cluster, please see Section 3.10.2, “Replicating from a Cluster to a Datawarehouse - Configuring the Cluster Nodes”.
Potentially seed the initial data. For more information about various ways to provision the initial data into the target warehouse, please see Section 3.11, “Migrating and Seeding Data”.
Install the Extractor replicator:
In our example, the Extractor configuration file
/etc/tungsten/tungsten.ini
would contain three
stanzas:
[defaults]
- contains configuration values used
by all services.
[alpha]
- contains the list of cluster nodes for
use by the applier service as a source list. This stanza will
use topology=cluster-alias
to
ensure that no installation or update action will ever be taken on
the listed nodes by the tpm command.
[omega]
- defines a replicator Applier
service that uses
topology=cluster-slave
. This
service will extract THL from the cluster nodes defined in the relay
source cluster-alias definition [alpha]
and write
the events into your chosen datawarehouse.
For more details about installing the replicator, please see Section 3.10.3, “Replicating from a Cluster to a Datawarehouse - Configuring the Cluster-Extractor”.
The following are the prerequisites for Cluster-Extractor operations:
The Tungsten Cluster and Tungsten Replicator must be version 5.2.0 or later.
Hosts on both the replicator and cluster must be able to communicate with each other.
The replicator must be able to connect as the tungsten user to the databases within the cluster.
The following are the steps to configure a cluster to act as the source for a Cluster-Extractor replicator writing into a datawarehouse:
Enable MySQL ROW-based Binary Logging
All MySQL databases running in clusters replicating to non-MySQL targets must operate in ROW-based replication mode to prevent data drift.
This is required because replication to the datawarehouse environment must send the raw-data, rather than the statements which cannot be applied directly to a target datawarehouse.
You must configure the my.cnf file to enable ROW-based binary logging:
binlog-format = ROW
ROW-based binary logging can also be enabled without restarting the MySQL server:
mysql> select @@global.binlog_format\G
*************************** 1. row ***************************
@@global.binlog_format: MIXED
1 row in set (0.00 sec)

mysql> SET GLOBAL binlog_format = 'ROW';
Query OK, 0 rows affected (0.00 sec)

mysql> select @@global.binlog_format\G
*************************** 1. row ***************************
@@global.binlog_format: ROW
1 row in set (0.00 sec)
Enable and Configure the Extractor Filters
Heterogeneous mode should be enabled within the cluster.
The extractor filters and two associated properties add the column names and primary key details to the THL. This is required so that the information can be replicated into the datawarehouse correctly.
For example, on every cluster node the lines below would be added to
the /etc/tungsten/tungsten.ini
file, then
tpm update would be executed:
[alpha]
...
repl-svc-extractor-filters=colnames,pkey
property=replicator.filter.pkey.addColumnsToDeletes=true
property=replicator.filter.pkey.addPkeyToInserts=true
For staging deployments, prepend two hyphens to each line and include on the command line.
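For a staging-based cluster deployment, the same settings would instead be supplied as command-line options to tpm, for example (a sketch; the alpha service name is taken from the surrounding examples and should match your cluster):
shell> ./tools/tpm configure alpha \
    --repl-svc-extractor-filters=colnames,pkey \
    --property=replicator.filter.pkey.addColumnsToDeletes=true \
    --property=replicator.filter.pkey.addPkeyToInserts=true
shell> ./tools/tpm update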
Configure the replicator that will act as an Extractor, reading information from the cluster and then applying that data into the chosen datawarehouse. Multiple example targets are shown.
This node may be located either on a separate host (for example when replicating to Amazon Redshift), or on the same node as the target datawarehouse service (i.e. HP Vertica or Hadoop).
On the following pages are the steps to configure a Cluster-Extractor target replicator writing into a datawarehouse for both staging and INI methods of installation.
The following Staging-method procedure will install the Tungsten Replicator software onto target node host6, extracting from a cluster consisting of three (3) nodes (host1, host2 and host3) and applying into the target datawarehouse via host6.
If you are replicating to a MySQL-specific target, please see Section 3.9, “Replicating Data Out of a Cluster” for more information.
On your staging server, go to the software directory.
shell> cd /opt/continuent/software
Download the latest Tungsten Replicator version.
Unpack the release package
shell> tar xvzf tungsten-replicator-7.1.4-10.tar.gz
Change to the unpackaged directory:
shell> cd tungsten-replicator-7.1.4-10
Execute the tpm command to configure defaults for the installation.
shell> ./tools/tpm configure defaults \
--install-directory=/opt/replicator \
'--profile-script=~/.bashrc' \
--replication-password=secret \
--replication-port=13306 \
--replication-user=tungsten \
--start-and-report=true \
--mysql-allow-intensive-checks=true \
--user=tungsten
The description of each of the options is shown below; click the icon to hide this detail:
This runs the tpm command. configure defaults indicates that we are setting options which will apply to all dataservices.
--install-directory=/opt/replicator
The installation directory of the Tungsten service. This is where the service will be installed on each server in your dataservice.
The profile script used when your shell starts. Using this line modifies your profile script to add a path to the Tungsten tools, making Tungsten Cluster™ easier to manage.
The operating system user name that you have created for the
Tungsten service,
tungsten
.
The user name that will be used to apply replication changes to the database on Replicas.
--replication-password=password
The password that will be used to apply replication changes to the database on Replicas.
Set the port number to use when connecting to the MySQL server.
Tells tpm to startup the service, and report the current configuration and status.
Configure a cluster alias that points to the Primaries and Replicas within the current Tungsten Cluster service that you are replicating from:
shell> ./tools/tpm configure alpha \
--master=host1 \
--slaves=host2,host3 \
--thl-port=2112 \
--topology=cluster-alias
The description of each of the options is shown below; click the icon to hide this detail:
This runs the tpm command.
configure indicates that we
are creating a new dataservice, and
alpha
is the name of the
dataservice being created.
This definition is for a dataservice alias, not an actual
dataservice because
--topology=cluster-alias
has
been specified. This alias is used in the cluster-slave section
to define the source hosts for replication.
Specifies the hostname of the default Primary in the cluster.
Specifies the name of any other servers in the cluster that may be replicated from.
The THL port for the cluster. The default value is 2112; if the cluster uses a different port, that value must be specified.
Define this as a cluster dataservice alias so tpm does not try to install cluster software to the hosts.
This dataservice cluster-alias name MUST be the same as the cluster dataservice name that you are replicating from.
On the Cluster-Extractor node, copy the convertstringfrommysql.json filter configuration sample file into the /opt/replicator/share directory, then edit it to suit:
cp /opt/replicator/tungsten/tungsten-replicator/support/filters-config/convertstringfrommysql.json /opt/replicator/share/
vi /opt/replicator/share/convertstringfrommysql.json
Once the convertstringfrommysql JSON configuration file has been edited, update the /etc/tungsten/tungsten.ini file to add and configure any additional options needed for the specific datawarehouse you are using.
Create the configuration that will replicate from cluster dataservice alpha into the database on the host specified by --relay=host6:
shell> ./tools/tpm configure omega \
--relay=host6 \
--relay-source=alpha \
--repl-svc-remote-filters=convertstringfrommysql \
--property=replicator.filter.convertstringfrommysql.definitionsFile=/opt/replicator/share/convertstringfrommysql.json \
--topology=cluster-slave
The description of each of the options is shown below; click the icon to hide this detail:
This runs the tpm command.
configure indicates that we
are creating a new replication service, and
omega
is the unique
service name for the replication stream from the cluster.
Specifies the hostname of the destination database into which data will be replicated.
Specifies the name of the source cluster dataservice alias (defined above) that will be used to read events to be replicated.
Read source replication data from any host in the
alpha
dataservice.
Now finish configuring the
omega
dataservice with the
options specific to the datawarehouse target in use.
AWS RedShift Target
shell> ./tools/tpm configure omega \
--batch-enabled=true \
--batch-load-template=redshift \
--enable-heterogeneous-slave=true \
--datasource-type=redshift \
--replication-host=REDSHIFT_ENDPOINT_FQDN_HERE \
--replication-user=REDSHIFT_USER_HERE \
--replication-password=REDSHIFT_PASSWORD_HERE \
--redshift-dbname=REDSHIFT_DB_NAME_HERE \
--svc-applier-filters=dropstatementdata \
--svc-applier-block-commit-interval=10s \
--svc-applier-block-commit-size=5
The description of each of the options is shown below; click the icon to hide this detail:
Configures default options that will be configured for all future services.
Configure the topology as a cluster-slave. This will configure the individual replicator as an Extractor of all the nodes in the cluster, as defined in the previous configuration of the cluster topology.
Configure the node as the relay for the cluster which will replicate data into the datawarehouse.
Configures the Extractor to correctly process the incoming data so that it can be written to the datawarehouse. This includes correcting the processing of text data types and configuring the appropriate filters.
The target host for writing data. In the case of Redshift, this is the fully qualified hostname of the Redshift host.
The user within the Redshift service that will be used to write data into the database.
--replication-password=password
The password for the user within the Redshift service that will be used to write data into the database.
Set the datasource type to be used when storing information about the replication state.
Enable the batch service, this configures the JavaScript batch engine and CSV writing semantics to generate the data to be applied into a datawarehouse.
--batch-load-template=redshift
The batch load template to be used. Since we are replicating
into Redshift, the
redshift
template is used.
The name of the database within the Redshift service where the data will be written.
Please see Install Amazon Redshift Applier for more information.
Vertica Target
shell> ./tools/tpm configure omega \
--batch-enabled=true \
--batch-load-template=vertica6 \
--batch-load-language=js \
--datasource-type=vertica \
--disable-relay-logs=true \
--enable-heterogeneous-service=true \
--replication-user=dbadmin \
--replication-password=VERTICA_DB_PASSWORD_HERE \
--replication-host=VERTICA_HOST_NAME_HERE \
--replication-port=5433 \
--svc-applier-block-commit-interval=5s \
--svc-applier-block-commit-size=500 \
--vertica-dbname=VERTICA_DB_NAME_HERE
Please see Install Vertica Applier for more information.
For additional targets, please see the full list at Deploying Appliers, or click on some of the targets below:
Once the configuration has been completed, you can perform the installation to set up the Tungsten Replicator services using the tpm command run from the staging directory:
shell> ./tools/tpm install
If the installation process fails, check the output of the
/tmp/tungsten-configure.log
file
for more information about the root cause.
The Cluster-Extractor replicator should now be installed and ready to use.
The following INI-based procedure will install the Tungsten Replicator software onto target node host6, extracting from a cluster consisting of three (3) nodes (host1, host2 and host3) and applying into the target datawarehouse via host6.
If you are replicating to a MySQL-specific target, please see Deploying the MySQL Applier for more information.
On the Cluster-Extractor node, copy the convertstringfrommysql.json filter configuration sample file into the /opt/replicator/share directory, then edit it to suit:
cp /opt/replicator/tungsten/tungsten-replicator/support/filters-config/convertstringfrommysql.json /opt/replicator/share/
vi /opt/replicator/share/convertstringfrommysql.json
Once the convertstringfrommysql JSON configuration file has been edited, update the /etc/tungsten/tungsten.ini file to add and configure any additional options needed for the specific datawarehouse you are using.
Create the configuration file /etc/tungsten/tungsten.ini on the destination DBMS host, i.e. host6:
[defaults]
user=tungsten
install-directory=/opt/replicator
replication-user=tungsten
replication-password=secret
replication-port=3306
profile-script=~/.bashrc
mysql-allow-intensive-checks=true
start-and-report=true
[alpha]
topology=cluster-alias
master=host1
members=host1,host2,host3
thl-port=2112
[omega]
topology=cluster-slave
relay=host6
relay-source=alpha
repl-svc-remote-filters=convertstringfrommysql
property=replicator.filter.convertstringfrommysql.definitionsFile=/opt/replicator/share/convertstringfrommysql.json
The description of each of the options is shown below; click the icon to hide this detail:
[defaults]
defaults
indicates that we are
setting options which will apply to all cluster dataservices.
The operating system user name that you have created for the
Tungsten service,
tungsten
.
install-directory=/opt/replicator
The installation directory of the Tungsten Replicator service. This is where the replicator software will be installed on the destination DBMS server.
The MySQL user name to use when connecting to the MySQL database.
The MySQL password for the user that will connect to the MySQL database.
The TCP/IP port on the destination DBMS server that is listening for connections.
Tells tpm to startup the service, and report the current configuration and status.
Tells tpm to add PATH information to the specified script to initialize the Tungsten Replicator environment.
[alpha]
alpha
is the name and
identity of the source cluster alias being created.
This definition is for a dataservice alias, not an actual
dataservice because
topology=cluster-alias
has been
specified. This alias is used in the cluster-slave section to
define the source hosts for replication.
Define this as a cluster dataservice alias so tpm does not try to install cluster software to the hosts.
A comma separated list of all the hosts that are part of this cluster dataservice.
The hostname of the server that is the current cluster Primary MySQL server.
The THL port for the cluster. The default value is 2112; if the cluster uses a different port, that value must be specified.
[omega]
omega is the unique service name for the replication stream from the cluster.
This replication service will extract data from cluster dataservice alpha and apply it into the database on the DBMS server specified by relay=host6.
Tells tpm this is a Cluster-Extractor replication service which will have a list of all source cluster nodes available.
The hostname of the destination DBMS server.
Specifies the name of the source cluster dataservice alias (defined above) that will be used to read events to be replicated.
The cluster-alias name (i.e. alpha) MUST be the same as the cluster dataservice name that you are replicating from.
Do not include start-and-report=true if you are taking over for MySQL native replication. See Section 6.12.1, “Migrating from MySQL Native Replication 'In-Place'” for next steps after completing installation.
Now finish configuring the omega dataservice with the options specific to the datawarehouse target in use.
Append the appropriate code snippet below to the bottom of the existing [omega] stanza:
AWS RedShift Target - Offboard Batch Applier
batch-enabled=true
batch-load-template=redshift
datasource-type=redshift
enable-heterogeneous-slave=true
replication-host=REDSHIFT_ENDPOINT_FQDN_HERE
replication-user=REDSHIFT_USER_HERE
replication-password=REDSHIFT_PASSWORD_HERE
redshift-dbname=REDSHIFT_DB_NAME_HERE
svc-applier-filters=dropstatementdata
svc-applier-block-commit-interval=1m
svc-applier-block-commit-size=5000
The description of each of the options is shown below; click the icon to hide this detail:
Configure the topology as a Cluster-Extractor. This will configure the individual replicator as an Extractor of all the nodes in the cluster, as defined in the previous configuration of the cluster topology.
Configure the node as the relay for the cluster which will replicate data into the datawarehouse.
--enable-heterogeneous-slave=true
Configures the Extractor to correctly process the incoming data so that it can be written to the datawarehouse. This includes correcting the processing of text data types and configuring the appropriate filters.
The target host for writing data. In the case of Redshift, this is the fully qualified hostname of the Redshift host.
The user within the Redshift service that will be used to write data into the database.
--replication-password=password
The password for the user within the Redshift service that will be used to write data into the database.
Set the datasource type to be used when storing information about the replication state.
Enable the batch service, this configures the JavaScript batch engine and CSV writing semantics to generate the data to be applied into a datawarehouse.
--batch-load-template=redshift
The batch load template to be used. Since we are replicating
into Redshift, the
redshift
template is used.
The name of the database within the Redshift service where the data will be written.
Please see Install Amazon Redshift Applier for more information.
Vertica Target - Onboard/Offboard Batch Applier
batch-enabled=true
batch-load-template=vertica6
batch-load-language=js
datasource-type=vertica
disable-relay-logs=true
enable-heterogeneous-service=true
replication-user=dbadmin
replication-password=VERTICA_DB_PASSWORD_HERE
replication-host=VERTICA_HOST_NAME_HERE
replication-port=5433
svc-applier-block-commit-interval=5s
svc-applier-block-commit-size=500
vertica-dbname=VERTICA_DB_NAME_HERE
Please see Install Vertica Applier for more information.
For additional targets, please see the full list at Deploying Appliers, or click on some of the targets below:
Download and install the latest Tungsten Replicator package (.rpm), or download the compressed tarball and unpack it on host6:
shell> cd /opt/continuent/software
shell> tar xvzf tungsten-replicator-7.1.4-10.tar.gz
Change to the Tungsten Replicator staging directory:
shell> cd tungsten-replicator-7.1.4-10
Run tpm to install the Tungsten Replicator software with the INI-based configuration:
shell> ./tools/tpm install
During the installation and startup, tpm will notify you of any problems that need to be fixed before the service can be correctly installed and started. If the service starts correctly, you should see the configuration and current status of the service.
If the installation process fails, check the output of the
/tmp/tungsten-configure.log
file
for more information about the root cause.
The Cluster-Extractor replicator should now be installed and ready to use.
If you are migrating an existing MySQL native replication deployment to use Tungsten Cluster, the configuration of the Tungsten Cluster replication must be updated to match the status of the Replica.
Deploy Tungsten Cluster using the model or system appropriate according to Chapter 2, Deployment. Ensure that the Tungsten Cluster is not started automatically by excluding the --start or --start-and-report options from the tpm commands.
On each Replica
Confirm that native replication is working on all Replica nodes:
shell> echo 'SHOW SLAVE STATUS\G' | tpm mysql | \
egrep 'Master_Host| Last_Error| Slave_SQL_Running'
Master_Host: tr-ssl1
Slave_SQL_Running: Yes
Last_Error:
On the Primary and each Replica
Reset the Tungsten Replicator position on all servers:
shell> replicator start offline
shell> trepctl -service alpha reset -all -y
On the Primary
Log in and start Tungsten Cluster services and put the Tungsten Replicator online:
shell> startall
shell> trepctl online
On the Primary
Put the cluster into maintenance mode using cctrl to prevent Tungsten Cluster from automatically reconfiguring services:
cctrl> set policy maintenance
On each Replica
Record the current Replica log position (as reported by the Relay_Master_Log_File and Exec_Master_Log_Pos output from SHOW SLAVE STATUS).
Ideally, each Replica should be stopped at the same position:
shell> echo 'SHOW SLAVE STATUS\G' | tpm mysql | \
egrep 'Master_Host| Last_Error| Relay_Master_Log_File| Exec_Master_Log_Pos'
Master_Host: tr-ssl1
Relay_Master_Log_File: mysql-bin.000025
Last_Error: Error executing row event: 'Table 'tungsten_alpha.heartbeat' doesn't exist'
Exec_Master_Log_Pos: 181268
If you have multiple Replicas configured to read from this Primary, record the Replica position individually for each host. Once you have the information for all the hosts, determine the earliest log file and log position across all the Replicas, as this information will be needed when starting Tungsten Cluster replication. If one of the servers does not show an error, it may be replicating from an intermediate server. If so, you can proceed normally and assume this server stopped at the same position as the host is replicating from.
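As a minimal sketch for collecting these values in one place (the hostnames replica1, replica2 and replica3 are placeholders, and the loop assumes the same tpm mysql helper shown above is available on every Replica), you could run:
shell> for h in replica1 replica2 replica3; do \
         echo "== $h =="; \
         ssh $h "echo 'SHOW SLAVE STATUS\G' | tpm mysql" | \
           egrep ' Relay_Master_Log_File| Exec_Master_Log_Pos'; \
       done
The lowest log file and position reported across all hosts is the value to supply to trepctl online -from-event in the following steps.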
On the Primary
Take the replicator offline and clear the THL:
shell> trepctl offline
shell> trepctl -service alpha reset -all -y
On the Primary
Start replication, using the lowest binary log file and log position from the Replica information determined in step 6.
shell> trepctl online -from-event 000025:181268
Tungsten Replicator will start reading the MySQL binary log from this position, creating the corresponding THL event data.
On each Replica
Disable native replication to prevent it from being accidentally restarted on the Replica.
On MySQL 5.0 or MySQL 5.1:
shell> echo "STOP SLAVE; CHANGE MASTER TO MASTER_HOST='';" | tpm mysql
On MySQL 5.5 or later:
shell> echo "STOP SLAVE; RESET SLAVE ALL;" | tpm mysql
If the final position of MySQL replication matches the lowest across all Replicas, start Tungsten Cluster services:
shell> trepctl online
shell> startall
The Replica will start reading from the binary log position configured on the Primary.
If the position on this Replica is different, use trepctl online -from-event to set the online position according to the recorded position when native MySQL was disabled. Then start all remaining services with startall.
shell> trepctl online -from-event 000025:188249
shell> startall
Use cctrl to confirm that replication is operating correctly across the dataservice on all hosts.
Put the cluster back into automatic mode:
cctrl> set policy automatic
Update your applications to use the installed connector services rather than a direct connection.
Remove the master.info file on each Replica to ensure that when a Replica restarts, it does not connect up to the Primary MySQL server again.
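As an illustrative sketch only (the /var/lib/mysql path is an assumption; confirm the actual data directory first, and skip this if the instance stores its replication state in tables rather than in a master.info file):
shell> echo 'SELECT @@datadir;' | tpm mysql
shell> sudo rm /var/lib/mysql/master.info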
Once these steps have been completed, Tungsten Cluster should be operating as the replication service for your MySQL servers. Use the information in Chapter 6, Operations Guide to monitor and administer the service.
When running an existing MySQL native replication service that needs to be migrated to a Tungsten Cluster service, one solution is to create the new Tungsten Cluster service, synchronize the content, and then install a service that migrates data from the existing native service to the new service while applications are reconfigured to use the new service. The two can then be executed in parallel until applications have been migrated.
The basic structure is shown in Figure 3.10, “Migration: Migrating Native Replication using a New Service”. The migration consists of two steps:
Initializing the new service with the current database state.
Creating a Tungsten Replicator deployment that continues to replicate data from the native MySQL service to the new service.
Once the application has been switched and is executing against the new service, the secondary replication can be disabled by shutting down the Tungsten Replicator in /opt/replicator.
To configure the service:
Stop replication on a Replica for the existing native replication installation:
mysql> STOP SLAVE;
Obtain the current Replica position within the Primary binary log:
mysql> SHOW SLAVE STATUS\G
...
Master_Host: host3
Relay_Master_Log_File: mysql-bin.000002
Exec_Master_Log_Pos: 559
...
Create a backup using any method that provides a consistent snapshot. The MySQL Primary may be used if you do not have a Replica to back up from. Be sure to get the binary log position as part of your backup. This is included in the output from Xtrabackup, or by using the --master-data=2 option with mysqldump.
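For example, when the backup is taken with mysqldump --master-data=2, the coordinates are written near the top of the dump as a commented-out CHANGE MASTER statement; a quick way to recover them (the file name and the output shown are illustrative) is:
shell> head -n 50 ~/dump.sql | grep 'CHANGE MASTER'
-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=559;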
Restart the Replica using native replication:
mysql> START SLAVE;
On the Primary and each Replica within the new service, restore the backup data and start the database service.
Set up the new Tungsten Cluster deployment using the MySQL servers on which the data has been restored. For clarity, this will be called newalpha.
Configure a second replication service, beta, to apply data using the existing MySQL native replication server as the Primary, and the Primary of newalpha. The information provided in Section 3.8, “Replicating Data Into an Existing Dataservice” will help. Do not start the new service.
Set the replication position for beta using tungsten_set_position to set the position to the point within the binary logs where the backup was taken:
shell> /opt/replicator/tungsten/tungsten-replicator/bin/tungsten_set_position \
    --seqno=0 --epoch=0 --service=beta \
    --source-id=host3 --event-id=mysql-bin.000002:559
Start replicator service beta:
shell> /opt/replicator/tungsten/tungsten-replicator/bin/replicator start
Once replication has been started, use trepctl to check the status and ensure that replication is operating correctly.
The original native MySQL replication Primary can continue to be used for reading and writing from within your application, and changes will be replicated into the new service on the new hardware. Once the applications have been updated to use the new service, the old servers can be decommissioned and replicator service beta stopped and removed.
Once the Tungsten Replicator is installed, it can be used to provision all Replicas with the Primary data. The Replicas will need enough information in order for the installation to succeed and for Tungsten Replicator to start. The provisioning process requires dumping all data on the Primary and reloading it back into the Primary server. This will create a full set of THL entries for the Replica replicators to apply. No other applications should access the Primary server while this process is running: every table will be emptied out and repopulated, so other applications would get an inconsistent view of the database. If the Primary is a MySQL Replica, then the Replica process may be stopped and started to prevent any changes without affecting other servers.
If you are using a MySQL Replica as the Primary, stop the replication thread:
mysql> STOP SLAVE;
Check Tungsten Replicator status on all servers to make sure it is ONLINE and that the appliedLastSeqno values are matching:
shell> trepctl status
Starting the process before all servers are consistent could cause inconsistencies. If you are trying to completely reprovision the server then you may consider running trepctl reset before proceeding. That will reset the replication position and ignore any previous events on the Primary.
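For example, using the same form shown earlier in this chapter for a service named alpha:
shell> trepctl -service alpha reset -all -y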
Use mysqldump to output all of the schemas that need to be provisioned:
shell> mysqldump --opt --skip-extended-insert -hhost3 -utungsten -P13306 -p \
    --databases db1,db2 > ~/dump.sql
Optionally, you can just dump a set of tables to be provisioned:
shell> mysqldump --opt --skip-extended-insert -hhost3 -utungsten -P13306 -p \
    db1 table1 table2 > ~/dump.sql
If you are using heterogeneous replication, all tables on the Replica must be empty before proceeding. The Tungsten Replicator does not replicate DDL statements such as DROP TABLE and CREATE TABLE. You may either truncate the tables on the Replica or use ddlscan to recreate them.
Load the dump file back into the Primary to recreate all data:
shell> cat ~/dump.sql | tpm mysql
The Tungsten Replicator will read the binary log as the dump file is loaded into MySQL. The Replicas will automatically apply these statements through normal replication.
If you are using a MySQL Replica as the Primary, restart the replication thread after the dump file has completed loading:
mysql> START SLAVE;
Monitor replication status on the Primary and Replicas:
shell> trepctl status
The following sections provide guidance and instructions for creating advanced deployments, including configuring automatic startup and shutdown during boot procedures, upgrades, downgrades, and removal of Tungsten Cluster.
Parallel apply is an important technique for achieving high speed replication and curing Replica lag. It works by spreading updates to Replicas over multiple threads that split transactions on each schema into separate processing streams. This in turn spreads I/O activity across many threads, which results in faster overall updates on the Replica. In ideal cases throughput on Replicas may improve by up to 5 times over single-threaded MySQL native replication.
It is worth noting that the only thing Tungsten parallelizes is applying transactions to Replicas. All other operations in each replication service are single-threaded.
Parallel replication works best on workloads that meet the following criteria:
ROW based binary logging must be enabled in the MySQL database.
Data are stored in independent schemas. If you have 100 customers per server with a separate schema for each customer, your application is a good candidate.
Transactions do not span schemas. Tungsten serializes such transactions, which is to say it stops parallel apply and runs them by themselves. If more than 2-3% of transactions are serialized in this way, most of the benefits of parallelization are lost.
Workload is well-balanced across schemas.
The Replica host(s) are capable and have free memory in the OS page cache.
The host on which the Replica runs has a sufficient number of cores to operate a large number of Java threads.
Not all workloads meet these requirements. If your transactions are within a single schema only, you may need to consider different approaches, such as Replica prefetch. Contact Continuent for other suggestions.
Parallel replication does not work well on underpowered hosts, such as Amazon m1.small instances. In fact, any host that is already I/O bound under single-threaded replication will typically not show much improvement with parallel apply.
Currently, it is not recommended to use the SMARTSCALE connector configuration in conjunction with Parallel Apply. This is due to progress only being measured against the slowest channel.
Parallel apply is enabled using the svc-parallelization-type and channels options of tpm. The parallelization type defaults to none, which is to say that parallel apply is disabled. You should set it to disk. The channels option sets the number of channels (i.e., threads) you propose to use for applying data. Here is a code example of a MySQL Applier installation with parallel apply enabled. The Replica will apply transactions using 10 channels.
shell> ./tools/tpm configure defaults \
    --reset \
    --install-directory=/opt/continuent \
    --user=tungsten \
    --mysql-allow-intensive-checks=true \
    --profile-script=~/.bash_profile \
    --application-port=3306 \
    --application-user=app_user \
    --application-password=secret \
    --replication-port=13306 \
    --replication-user=tungsten \
    --replication-password=secret \
    --svc-parallelization-type=disk \
    --connector-smartscale=false \
    --channels=10 \
    --rest-api-admin-user=apiuser \
    --rest-api-admin-pass=secret
shell> ./tools/tpm configure alpha \
    --master=host1 \
    --members=host1,host2,host3 \
    --connectors=host1,host2,host3 \
    --topology=clustered
shell> vi /etc/tungsten/tungsten.ini
[defaults]
install-directory=/opt/continuent
user=tungsten
mysql-allow-intensive-checks=true
profile-script=~/.bash_profile
application-port=3306
application-user=app_user
application-password=secret
replication-port=13306
replication-user=tungsten
replication-password=secret
svc-parallelization-type=disk
# parallel apply and smartscale are not compatible
connector-smartscale=false
channels=10
rest-api-admin-user=apiuser
rest-api-admin-pass=secret
[alpha]
master=host1
members=host1,host2,host3
connectors=host1,host2,host3
topology=clustered
Configuration group defaults
The description of each of the options is shown below:
For staging configurations, deletes all pre-existing configuration information before updating with the new configuration values.
--install-directory=/opt/continuent
install-directory=/opt/continuent
Path to the directory where the active deployment will be installed. The configured directory will contain the software, THL and relay log information unless configured otherwise.
System User
--mysql-allow-intensive-checks=true
mysql-allow-intensive-checks=true
For MySQL installation, enables detailed checks on the supported data types within the MySQL database to confirm compatibility. This includes checking each table definition individually for any unsupported data types.
--profile-script=~/.bash_profile
profile-script=~/.bash_profile
Append commands to include env.sh in this profile script
Port for the connector to listen on
Database username for the connector
Database password for the connector
The network port used to connect to the database server. The default port used depends on the database being configured.
For databases that required authentication, the username to use when connecting to the database using the corresponding connection method (native, JDBC, etc.).
The password to be used when connecting to the database using the corresponding --replication-user.
--svc-parallelization-type=disk
Method for implementing parallel apply
--connector-smartscale=false # parallel apply and smartscale are not compatible
connector-smartscale=false # parallel apply and smartscale are not compatible
Enable SmartScale R/W splitting in the connector
Number of replication channels to use for parallel apply.
Configuration group alpha
The description of each of the options is shown below:
The hostname of the primary (extractor) within the current service.
Hostnames for the dataservice members
--connectors=host1,host2,host3
Hostnames for the dataservice connectors
Replication topology for the dataservice.
If the installation process fails, check the output of the /tmp/tungsten-configure.log file for more information about the root cause.
There are several additional options that default to reasonable values. You may wish to change them in special cases.
buffer-size — Sets the replicator block commit size, which is the number of transactions to commit at once on Replicas. Values up to 100 are normally fine.
native-slave-takeover — Used to allow Tungsten to take over from native MySQL replication and parallelize it.
You can check the number of active channels on a Replica by looking at the "channels" property once the replicator restarts.
Replica shell> trepctl -service alpha status| grep channels
channels : 10
The channel count for a Primary will ALWAYS be 1 because extraction is single-threaded:
Primary shell> trepctl -service alpha status| grep channels
channels : 1
Enabling parallel apply will dramatically increase the number of connections to the database server.
Typically the calculation on a Replica would be: Connections = Channel_Count x Service_Count x 2, so for a 4-way Composite Active/Active topology with 30 channels there would be 30 x 4 x 2 = 240 connections required for the replicator alone, not counting application traffic.
You may display the currently used number of connections in MySQL:
mysql> SHOW STATUS LIKE 'max_used_connections';
+----------------------+-------+
| Variable_name | Value |
+----------------------+-------+
| Max_used_connections | 190 |
+----------------------+-------+
1 row in set (0.00 sec)
Below are suggestions for how to change the maximum connections setting in MySQL both for the running instance as well as at startup:
mysql> SET GLOBAL max_connections = 512;
mysql> SHOW VARIABLES LIKE 'max_connections';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 512   |
+-----------------+-------+
1 row in set (0.00 sec)
shell> vi /etc/my.cnf
#max_connections = 151
max_connections = 512
Channels and Parallel Apply
Parallel apply works by using multiple threads for the final stage of the replication pipeline. These threads are known as channels. Restart points for each channel are stored as individual rows in table trep_commit_seqno if you are applying to a relational DBMS server, including MySQL, Oracle, and data warehouse products like Vertica.
When you set the channels argument, the tpm program configures the replication service to enable the requested number of channels. A value of 1 results in single-threaded operation.
Do not change the number of channels without setting the replicator offline cleanly. See the procedure later in this page for more information.
How Many Channels Are Enough?
Pick the smallest number of channels that loads the Replica fully. For evenly distributed workloads this means that you should increase channels so that more threads are simultaneously applying updates and soaking up I/O capacity. As long as each shard receives roughly the same number of updates, this is a good approach.
For unevenly distributed workloads, you may want to decrease channels to spread the workload more evenly across them. This ensures that each channel has productive work and minimizes the overhead of updating the channel position in the DBMS.
Once you have maximized I/O on the DBMS server leave the number of channels alone. Note that adding more channels than you have shards does not help performance as it will lead to idle channels that must update their positions in the DBMS even though they are not doing useful work. This actually slows down performance a little bit.
Effect of Channels on Backups
If you back up a Replica that operates with more than one channel, say 30, you can only restore that backup on another Replica that operates with the same number of channels. Otherwise, reloading the backup is the same as changing the number of channels without a clean offline.
When operating Tungsten Replicator in a Tungsten cluster, you should always set the number of channels to be the same for all replicators. Otherwise you may run into problems if you try to restore backups across MySQL instances whose replicators are configured with different numbers of channels.
If the replicator has only a single channel enabled, you can restore the backup anywhere. The same applies if you run the backup after the replicator has been taken offline cleanly.
When you issue a trepctl offline command, Tungsten Replicator will bring all channels to the same point in the log and then go offline. This is known as going offline cleanly. When a Replica has been taken offline cleanly the following are true:
The trep_commit_seqno table contains a single row
The trep_shard_channel table is empty
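A quick way to confirm this state, as a sketch assuming a service named alpha and therefore a catalog schema of tungsten_alpha, is to query both tables directly; a single row in the first and an empty result in the second indicate a clean offline:
shell> echo 'SELECT count(*) FROM tungsten_alpha.trep_commit_seqno;' | tpm mysql
shell> echo 'SELECT count(*) FROM tungsten_alpha.trep_shard_channel;' | tpm mysql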
When parallel replication is not enabled, you can take the replicator offline by stopping the replicator process. There is no need to issue a trepctl offline command first.
Putting a replicator offline may take a while if the slowest and fastest channels are far apart, i.e., if one channel gets far ahead of another. The separation between channels is controlled by the maxOfflineInterval parameter, which defaults to 5 seconds. This sets the allowable distance between commit timestamps processed on different channels. You can adjust this value at installation or later. The following example shows how to change it after installation. This can be done at any time and does not require the replicator to go offline cleanly.
Both the Staging and INI methods are shown below.
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging USER is `tpm query staging| cut -d: -f1 | cut -d@ -f1`
The staging USER is tungsten
shell> echo The staging HOST is `tpm query staging| cut -d: -f1 | cut -d@ -f2`
The staging HOST is db1
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> ssh {STAGING_USER}@{STAGING_HOST}
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm configure alpha \
--property=replicator.store.parallel-queue.maxOfflineInterval=30
Run the tpm command to update the software with the Staging-based configuration:
shell> ./tools/tpm update
For information about making updates when using a Staging-method deployment, please see Section 10.3.7, “Configuration Changes from a Staging Directory”.
shell> vi /etc/tungsten/tungsten.ini
[alpha]
...
property=replicator.store.parallel-queue.maxOfflineInterval=30
Run the tpm command to update the software with the INI-based configuration:
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm update
For information about making updates when using an INI file, please see Section 10.4.4, “Configuration Changes with an INI file”.
The offline interval is only the approximate time that Tungsten Replicator will take to go offline. Up to a point, larger values (say 60 or 120 seconds) allow the replicator to parallelize in spite of a few operations that are relatively slow. However, the down side is that going offline cleanly can become quite slow.
If you need to take a replicator offline quickly, you can either stop the replicator process or issue the following command:
shell> trepctl offline -immediate
Both of these result in an unclean shutdown. However, parallel replication is completely crash-safe provided you use transactional table types like InnoDB, so you will be able to restart without causing Replica consistency problems.
You must take the replicator offline cleanly to change the number of channels or when reverting to MySQL native replication. Failing to do so can result in errors when you restart replication.
Be sure to place the cluster into MAINTENANCE mode first so the Manager does not attempt to automatically bring the replicator online.
cctrl> set policy maintenance
To enable parallel replication after installation, take the replicator offline cleanly using the following command:
shell> trepctl offline
Modify the configuration to add two parameters:
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging USER is `tpm query staging| cut -d: -f1 | cut -d@ -f1`
The staging USER is tungsten
shell> echo The staging HOST is `tpm query staging| cut -d: -f1 | cut -d@ -f2`
The staging HOST is db1
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> ssh {STAGING_USER}@{STAGING_HOST}
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm configure defaults \
--svc-parallelization-type=disk \
--channels=10
Run the tpm command to update the software with the Staging-based configuration:
shell> ./tools/tpm update
For information about making updates when using a Staging-method deployment, please see Section 10.3.7, “Configuration Changes from a Staging Directory”.
[defaults]
...
svc-parallelization-type=disk
channels=10
Run the tpm command to update the software with the INI-based configuration:
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm update
For information about making updates when using an INI file, please see Section 10.4.4, “Configuration Changes with an INI file”.
You may use an actual data service name in place of the keyword defaults.
Signal the changes by a complete restart of the Replicator process:
shell> replicator restart
Be sure to place the cluster into AUTOMATIC mode as soon as all replicators are updated and back online.
cctrl> set policy automatic
You can check the number of active channels on a Replica by looking at the "channels" property once the replicator restarts.
Replica shell> trepctl -service alpha status| grep channels
channels : 10
The channel count for a Primary will ALWAYS be 1 because extraction is single-threaded:
Primary shell> trepctl -service alpha status| grep channels
channels : 1
Enabling parallel apply will dramatically increase the number of connections to the database server.
Typically the calculation on a Replica would be: Connections = Channel_Count x Service_Count x 2, so for a 4-way Composite Active/Active topology with 30 channels there would be 30 x 4 x 2 = 240 connections required for the replicator alone, not counting application traffic.
You may display the currently used number of connections in MySQL:
mysql> SHOW STATUS LIKE 'max_used_connections';
+----------------------+-------+
| Variable_name | Value |
+----------------------+-------+
| Max_used_connections | 190 |
+----------------------+-------+
1 row in set (0.00 sec)
Below are suggestions for how to change the maximum connections setting in MySQL both for the running instance as well as at startup:
mysql> SET GLOBAL max_connections = 512;
mysql> SHOW VARIABLES LIKE 'max_connections';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 512   |
+-----------------+-------+
1 row in set (0.00 sec)
shell> vi /etc/my.cnf
#max_connections = 151
max_connections = 512
To change the number of channels you must take the replicator offline cleanly using the following command:
shell> trepctl offline
This command brings all channels up to the same transaction in the log, then goes offline. If you look in the trep_commit_seqno table, you will notice only a single row, which shows that updates to the Replica have been completely serialized to a single point. At this point you may safely reconfigure the number of channels on the replicator, for example using the following command:
Both the Staging and INI methods are shown below.
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging USER is `tpm query staging| cut -d: -f1 | cut -d@ -f1`
The staging USER is tungsten
shell> echo The staging HOST is `tpm query staging| cut -d: -f1 | cut -d@ -f2`
The staging HOST is db1
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> ssh {STAGING_USER}@{STAGING_HOST}
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm configure alpha \
--channels=5
Run the tpm command to update the software with the Staging-based configuration:
shell> ./tools/tpm update
For information about making updates when using a Staging-method deployment, please see Section 10.3.7, “Configuration Changes from a Staging Directory”.
[alpha]
...
channels=5
Run the tpm command to update the software with the INI-based configuration:
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm update
For information about making updates when using an INI file, please see Section 10.4.4, “Configuration Changes with an INI file”.
You can check the number of active channels on a Replica by looking at the "channels" property once the replicator restarts.
If you attempt to reconfigure channels without going offline cleanly,
Tungsten Replicator will signal an error when you attempt to go online
with the new channel configuration. The cure is to revert to the
previous number of channels, go online, and then go offline cleanly.
Note that attempting to clean up the trep_commit_seqno and trep_shard_channel tables manually can result in your Replicas becoming inconsistent and requiring full resynchronization. You should only do such cleanup under direction from Continuent support.
Failing to follow the channel reconfiguration procedure carefully may result in your Replicas becoming inconsistent or failing. The cure is usually full resynchronization, so it is best to avoid this if possible.
The following steps describe how to gracefully disable parallel apply replication.
To disable parallel apply, you must first take the replicator offline cleanly using the following command:
shell> trepctl offline
This command brings all channels up to the same transaction in the log, then goes offline. If you look in the trep_commit_seqno table, you will notice only a single row, which shows that updates to the Replica have been completely serialized to a single point. At this point you may safely disable parallel apply on the replicator, for example using the following command:
Both the Staging and INI methods are shown below.
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging USER is `tpm query staging| cut -d: -f1 | cut -d@ -f1`
The staging USER is tungsten
shell> echo The staging HOST is `tpm query staging| cut -d: -f1 | cut -d@ -f2`
The staging HOST is db1
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> ssh {STAGING_USER}@{STAGING_HOST}
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm configure alpha \
--svc-parallelization-type=none \
--channels=1
Run the tpm command to update the software with the Staging-based configuration:
shell> ./tools/tpm update
For information about making updates when using a Staging-method deployment, please see Section 10.3.7, “Configuration Changes from a Staging Directory”.
[alpha]
...
svc-parallelization-type=none
channels=1
Run the tpm command to update the software with the INI-based configuration:
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm update
For information about making updates when using an INI file, please see Section 10.4.4, “Configuration Changes with an INI file”.
You can check the number of active channels on a Replica by looking at the "channels" property once the replicator restarts.
shell> trepctl -service alpha status| grep channels
channels : 1
If you attempt to reconfigure channels without going offline cleanly,
Tungsten Replicator will signal an error when you attempt to go online
with the new channel configuration. The cure is to revert to the
previous number of channels, go online, and then go offline cleanly.
Note that attempting to clean up the trep_commit_seqno and trep_shard_channel tables manually can result in your Replicas becoming inconsistent and requiring full resynchronization. You should only do such cleanup under direction from Continuent support.
Failing to follow the channel reconfiguration procedure carefully may result in your Replicas becoming inconsistent or failing. The cure is usually full resynchronization, so it is best to avoid this if possible.
As with channels you should only change the parallel queue type after the replicator has gone offline cleanly. The following example shows how to update the parallel queue type after installation:
Both the Staging and INI methods are shown below.
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging USER is `tpm query staging| cut -d: -f1 | cut -d@ -f1`
The staging USER is tungsten
shell> echo The staging HOST is `tpm query staging| cut -d: -f1 | cut -d@ -f2`
The staging HOST is db1
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> ssh {STAGING_USER}@{STAGING_HOST}
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm configure alpha \
--svc-parallelization-type=disk \
--channels=5
Run the tpm command to update the software with the Staging-based configuration:
shell> ./tools/tpm update
For information about making updates when using a Staging-method deployment, please see Section 10.3.7, “Configuration Changes from a Staging Directory”.
[alpha]
...
svc-parallelization-type=disk
channels=5
Run the tpm command to update the software with the INI-based configuration:
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm update
For information about making updates when using an INI file, please see Section 10.4.4, “Configuration Changes with an INI file”.
Basic monitoring of a parallel deployment can be performed using the techniques in Chapter 6, Operations Guide. Specific operations for parallel replication are provided in the following sections.
The replicator has several helpful commands for tracking replication performance:
Command                       Description
trepctl status                Shows basic variables including overall latency of Replica and number of apply channels
trepctl status -name shards   Shows the number of transactions for each shard
trepctl status -name stores   Shows the configuration and internal counters for stores between tasks
trepctl status -name tasks    Shows the number of transactions (events) and latency for each independent task in the replicator pipeline
The trepctl status appliedLastSeqno parameter shows the sequence number of the last transaction committed. Here is an example from a Replica with 5 channels enabled.
shell> trepctl status
Processing status command...
NAME VALUE
---- -----
appliedLastEventId : mysql-bin.000211:0000000020094456;0
appliedLastSeqno : 78021
appliedLatency : 0.216
channels : 5
...
Finished status command...
When parallel apply is enabled, the meaning of appliedLastSeqno changes. It is the minimum recovery position across apply channels, which means it is the position where channels restart in the event of a failure. This number is quite conservative and may make replication appear to be further behind than it actually is.
Busy channels mark their position in table trep_commit_seqno as they commit. These are up-to-date with the traffic on that channel, but there may be latency between channels that have a lot of big transactions and those that are more lightly loaded.
Inactive channels do not get any transactions, hence do not mark their position. Tungsten sends a control event across all channels so that they mark their commit position in trep_commit_seqno. It is possible to see a delay of many seconds or even minutes in unloaded systems from the true state of the Replica because of idle channels not marking their position yet.
For systems with few transactions it is useful to lower the synchronization interval to a smaller number of transactions, for example 500. The following command shows how to adjust the synchronization interval after installation:
Both the Staging and INI methods are shown below.
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging USER is `tpm query staging| cut -d: -f1 | cut -d@ -f1`
The staging USER is tungsten
shell> echo The staging HOST is `tpm query staging| cut -d: -f1 | cut -d@ -f2`
The staging HOST is db1
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> ssh {STAGING_USER}@{STAGING_HOST}
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm configure alpha \
--property=replicator.store.parallel-queue.syncInterval=500
Run the tpm command to update the software with the Staging-based configuration:
shell> ./tools/tpm update
For information about making updates when using a Staging-method deployment, please see Section 10.3.7, “Configuration Changes from a Staging Directory”.
[alpha]
...
property=replicator.store.parallel-queue.syncInterval=500
Run the tpm command to update the software with the INI-based configuration:
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm update
For information about making updates when using an INI file, please see Section 10.4.4, “Configuration Changes with an INI file”.
Note that there is a trade-off between the synchronization interval
value and writes on the DBMS server. With the foregoing setting, all
channels will write to the
trep_commit_seqno
table every 500
transactions. If there were 50 channels configured, this could lead to
an increase in writes of up to 10%—each channel could end up
adding an extra write to mark its position every 10 transactions. In
busy systems it is therefore better to use a higher synchronization
interval for this reason.
You can check the current synchronization interval by running the trepctl status -name stores command, as shown in the following example:
shell> trepctl status -name stores
Processing status command (stores)...
...
NAME VALUE
---- -----
...
name : parallel-queue
...
storeClass : com.continuent.tungsten.replicator.thl.THLParallelQueue
syncInterval : 10000
Finished status command (stores)...
You can also force all channels to mark their current position by sending a heartbeat through using the trepctl heartbeat command.
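For example, as a sketch assuming the alpha service and its tungsten_alpha catalog schema, issue the heartbeat against the Primary's replicator and then confirm that every channel row has advanced to the same sequence number:
shell> trepctl heartbeat
shell> echo 'SELECT * FROM tungsten_alpha.trep_commit_seqno;' | tpm mysql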
Relative latency is a trepctl status parameter. It indicates the latency since the last time the appliedSeqno advanced; for example:
shell> trepctl status
Processing status command...
NAME VALUE
---- -----
appliedLastEventId : mysql-bin.000211:0000000020094766;0
appliedLastSeqno : 78022
appliedLatency : 0.571
...
relativeLatency : 8.944
Finished status command...
In this example the last transaction had a latency of .571 seconds from the time it committed on the Primary and committed 8.944 seconds ago. If relative latency increases significantly in a busy system, it may be a sign that replication is stalled. This is a good parameter to check in monitoring scripts.
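As a minimal monitoring sketch (the 60-second threshold is an arbitrary example and should be tuned for your workload), the value can be extracted and compared directly from the status output:
shell> trepctl status | awk -F: '/relativeLatency/ {gsub(/ /,"",$2); if ($2+0 > 60) print "WARNING: relativeLatency=" $2; else print "OK: relativeLatency=" $2}'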
Serialization count refers to the number of transactions that the replicator has handled that cannot be applied in parallel because they involve dependencies across shards. For example, a transaction that spans multiple shards must serialize because it might cause an out-of-order update with respect to transactions that update a single shard only.
You can detect the number of transactions that have been serialized by looking at the serializationCount parameter using the trepctl status -name stores command. The following example shows a replicator that has processed 1512 transactions with 26 serialized.
shell> trepctl status -name stores
Processing status command (stores)...
...
NAME VALUE
---- -----
criticalPartition : -1
discardCount : 0
estimatedOfflineInterval: 0.0
eventCount : 1512
headSeqno : 78022
maxOfflineInterval : 5
maxSize : 10
name : parallel-queue
queues : 5
serializationCount : 26
serialized : false
...
Finished status command (stores)...
In this case 1.7% of transactions are serialized. Generally speaking you will lose benefits of parallel apply if more than 1-2% of transactions are serialized.
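The percentage can be computed directly from the same output; this is a sketch that simply divides serializationCount by eventCount for the parallel queue store:
shell> trepctl status -name stores | \
       awk -F: '/serializationCount/ {s=$2+0} /eventCount/ {e=$2+0} END {if (e>0) printf "%.1f%% serialized (%d of %d)\n", 100*s/e, s, e}'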
The maximum offline interval (maxOfflineInterval) parameter controls the "distance" between the fastest and slowest channels when parallel apply is enabled. The replicator measures distance using the seconds between commit times of the last transaction processed on each channel. This time is roughly equivalent to the amount of time a replicator will require to go offline cleanly.
You can change the maxOfflineInterval as shown in the following example; the value is defined in seconds.
Both the Staging and INI methods are shown below.
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging USER is `tpm query staging| cut -d: -f1 | cut -d@ -f1`
The staging USER is tungsten
shell> echo The staging HOST is `tpm query staging| cut -d: -f1 | cut -d@ -f2`
The staging HOST is db1
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> ssh {STAGING_USER}@{STAGING_HOST}
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm configure alpha \
--property=replicator.store.parallel-queue.maxOfflineInterval=30
Run the tpm command to update the software with the Staging-based configuration:
shell> ./tools/tpm update
For information about making updates when using a Staging-method deployment, please see Section 10.3.7, “Configuration Changes from a Staging Directory”.
[alpha]
...
property=replicator.store.parallel-queue.maxOfflineInterval=30
Run the tpm command to update the software with the INI-based configuration:
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm update
For information about making updates when using an INI file, please see Section 10.4.4, “Configuration Changes with an INI file”.
You can view the configured value as well as the estimated current value using the trepctl status -name stores command, as shown in the following example:
shell> trepctl status -name stores
Processing status command (stores)...
NAME VALUE
---- -----
...
estimatedOfflineInterval: 1.3
...
maxOfflineInterval : 30
...
Finished status command (stores)...
Parallel apply works best when transactions are distributed evenly across shards and those shards are distributed evenly across available channels. You can monitor the distribution of transactions over shards using the trepctl status -name shards command. This command lists transaction counts for all shards, as shown in the following example.
shell> trepctl status -name shards
Processing status command (shards)...
...
NAME VALUE
---- -----
appliedLastEventId: mysql-bin.000211:0000000020095076;0
appliedLastSeqno : 78023
appliedLatency : 0.255
eventCount : 3523
shardId : cust1
stage : q-to-dbms
...
Finished status command (shards)...
If one or more shards have a very large eventCount value compared to the others, this is a sign that your transaction workload is poorly distributed across shards.
The listing of shards also offers a useful trick for finding serialized transactions. Shards that Tungsten Replicator cannot safely parallelize are assigned the dummy shard ID #UNKNOWN. Look for this shard to find the count of serialized transactions. The appliedLastSeqno for this shard gives the sequence number of the most recent serialized transaction. As the following example shows, you can then list the contents of the transaction to see why it serialized. In this case, the transaction affected tables in different schemas.
shell> trepctl status -name shards
Processing status command (shards)...
NAME                VALUE
----                -----
appliedLastEventId: mysql-bin.000211:0000000020095529;0
appliedLastSeqno  : 78026
appliedLatency    : 0.558
eventCount        : 26
shardId           : #UNKNOWN
stage             : q-to-dbms
...
Finished status command (shards)...
shell> thl list -seqno 78026
SEQ# = 78026 / FRAG# = 0 (last frag)
- TIME = 2013-01-17 22:29:42.0
- EPOCH# = 1
- EVENTID = mysql-bin.000211:0000000020095529;0
- SOURCEID = logos1
- METADATA = [mysql_server_id=1;service=percona;shard=#UNKNOWN]
- TYPE = com.continuent.tungsten.replicator.event.ReplDBMSEvent
- OPTIONS = [##charset = ISO8859_1, autocommit = 1, sql_auto_is_null = 0, »
  foreign_key_checks = 1, unique_checks = 1, sql_mode = '', character_set_client = 8, »
  collation_connection = 8, collation_server = 33]
- SCHEMA =
- SQL(0) = insert into mats_0.foo values(1) /* ___SERVICE___ = [percona] */
- OPTIONS = [##charset = ISO8859_1, autocommit = 1, sql_auto_is_null = 0, »
  foreign_key_checks = 1, unique_checks = 1, sql_mode = '', character_set_client = 8, »
  collation_connection = 8, collation_server = 33]
- SQL(1) = insert into mats_1.foo values(1)
The replicator normally distributes shards evenly across channels. As each new shard appears, it is assigned to the next channel number, which then rotates back to 0 once the maximum number has been assigned. If the shards have uneven transaction distributions, this may lead to an uneven number of transactions on the channels. To check, use the trepctl status -name tasks command and look for tasks belonging to the q-to-dbms stage.
shell> trepctl status -name tasks
Processing status command (tasks)...
...
NAME VALUE
---- -----
appliedLastEventId: mysql-bin.000211:0000000020095076;0
appliedLastSeqno : 78023
appliedLatency : 0.248
applyTime : 0.003
averageBlockSize : 2.520
cancelled : false
currentLastEventId: mysql-bin.000211:0000000020095076;0
currentLastFragno : 0
currentLastSeqno : 78023
eventCount : 5302
extractTime : 274.907
filterTime : 0.0
otherTime : 0.0
stage : q-to-dbms
state : extract
taskId : 0
...
Finished status command (tasks)...
If you see one or more channels that have a very high eventCount, consider either assigning shards explicitly to channels or redistributing the workload in your application to get better performance.
Tungsten Replicator by default assigns channels using a round-robin algorithm that assigns each new shard to the next available channel. The current shard assignments are tracked in table trep_shard_channel in the Tungsten catalog schema for the replication service.
For example, if you have 2 channels enabled and Tungsten processes three different shards, you might end up with a shard assignment like the following:
foo => channel 0
bar => channel 1
foobar => channel 0
This algorithm generally gives the best results for most installations and is crash-safe, since the contents of the trep_shard_channel table persist if either the DBMS or the replicator fails.
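To inspect the current assignments, you can query the table directly; this is a sketch assuming a service named alpha and therefore a catalog schema of tungsten_alpha:
shell> echo 'SELECT * FROM tungsten_alpha.trep_shard_channel;' | tpm mysql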
It is possible to override the default assignment by updating the shard.list file found in the tungsten-replicator/conf directory. This file normally looks like the following:
# SHARD MAP FILE.
# This file contains shard handling rules used in the ShardListPartitioner
# class for parallel replication. If unchanged shards will be hashed across
# available partitions.

# You can assign shards explicitly using a shard name match, where the form
# is <db>=<partition>.
#common1=0
#common2=0
#db1=1
#db2=2
#db3=3

# Default partition for shards that do not match explicit name.
# Permissible values are either a partition number or -1, in which
# case values are hashed across available partitions. (-1 is the
# default.)
#(*)=-1

# Comma-separated list of shards that require critical section to run.
# A "critical section" means that these events are single-threaded to
# ensure that all dependencies are met.
#(critical)=common1,common2

# Method for channel hash assignments. Allowed values are round-robin and
# string-hash.
(hash-method)=round-robin
You can update the shard.list file to do three types of custom overrides.
Change the hashing method for channel assignments. Round-robin uses the trep_shard_channel table. The string-hash method just hashes the shard name.
Assign shards to explicit channels. Add lines of the form shard=channel to the file as shown by the commented-out entries.
Define critical shards. These are shards that must be processed in serial fashion. For example if you have a sharded application that has a single global shard with reference information, you can declare the global shard to be critical. This helps avoid applications seeing out of order information.
Changes to shard.list must be made with care. The same cautions apply here as for changing the number of channels or the parallelization type. For subscription customers we strongly recommend conferring with Continuent Support before making changes.
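For illustration only, an edited shard.list combining all three override types might look like the following; the schema names customer_east, customer_west and global_ref are hypothetical, and the same cautions noted above apply:
# Pin two application schemas to explicit channels.
customer_east=0
customer_west=1
# Apply the shared reference schema in a critical section (serially).
(critical)=global_ref
# Hash any remaining shards by name instead of assigning them round-robin.
(hash-method)=string-hash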
Channels receive transactions through a special type of queue, known as a parallel queue. Tungsten offers two implementations of parallel queues, which vary in their performance as well as the requirements they may place on hosts that operate parallel apply. You choose the type of queue to enable using the --svc-parallelization-type option.
Do not change the parallel queue type without setting the replicator offline cleanly. See the procedure later in this page for more information.
Disk Parallel Queue (disk option)
A disk parallel queue uses a set of independent threads to read from the Transaction History Log and feed short in-memory queues used by channels. Disk queues have the advantage that they minimize memory required by Java. They also allow channels to operate some distance apart, which improves throughput. For instance, one channel may apply a transaction that committed 2 minutes before the transaction another channel is applying. This separation keeps a single slow transaction from blocking all channels.
Disk queues minimize memory consumption of the Java VM but to function efficiently they do require pages from the Operating System page cache. This is because the channels each independently read from the Transaction History Log. As long as the channels are close together the storage pages tend to be present in the Operating System page cache for all threads but the first, resulting in very fast reads. If channels become widely separated, for example due to a high maxOfflineInterval value, or the host has insufficient free memory, disk queues may operate slowly or impact other processes that require memory.
Memory Parallel Queue (memory option)
A memory parallel queue uses a set of in-memory queues to hold transactions. One stage reads from the Transaction History Log and distributes transactions across the queues. The channels each read from one of the queues. In-memory queues have the advantage that they do not need extra threads to operate, hence reduce the amount of CPU processing required by the replicator.
When you use in-memory queues you must set the maxSize property on the queue to a relatively large value. This value sets the total number of transaction fragments that may be in the parallel queue at any given time. If the queue hits this value, it does not accept further transaction fragments until existing fragments are processed. For best performance it is often necessary to use a relatively large number, for example 10,000 or greater.
The following example shows how to set the maxSize property after installation. This value can be changed at any time and does not require the replicator to go offline cleanly:
Both the Staging and INI methods are shown below.
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging USER is `tpm query staging| cut -d: -f1 | cut -d@ -f1`
The staging USER is tungsten
shell> echo The staging HOST is `tpm query staging| cut -d: -f1 | cut -d@ -f2`
The staging HOST is db1
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> ssh {STAGING_USER}@{STAGING_HOST}
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm configure alpha \
--property=replicator.store.parallel-queue.maxSize=10000
Run the tpm command to update the software with the Staging-based configuration:
shell> ./tools/tpm update
For information about making updates when using a Staging-method deployment, please see Section 10.3.7, “Configuration Changes from a Staging Directory”.
[alpha]
...
property=replicator.store.parallel-queue.maxSize=10000
Run the tpm command to update the software with the INI-based configuration:
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm update
For information about making updates when using an INI file, please see Section 10.4.4, “Configuration Changes with an INI file”.
You may need to increase the Java VM heap size when you increase the parallel queue maximum size. Use the --java-mem-size option on the tpm command for this purpose or edit the Replicator wrapper.conf file directly.
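For example, as a sketch against the alpha service used in the examples above (the 3072 MB value is purely illustrative and should be sized for your host):
shell> ./tools/tpm configure alpha \
         --java-mem-size=3072
shell> ./tools/tpm update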
Memory queues are not recommended for production use at this time. Use disk queues.
This feature was introduced in v7.1.0
Tungsten Distributed Datasource Groups (DDG) are, at their core, a single Standalone cluster, with an odd number of nodes, as usual.
In addition, every node in the cluster uses the same [serviceName], also as usual. The key differences here are that:
Each node in the cluster is assigned a Distributed Datasource Group ID (DDG-ID)
Nodes with the same DDG-ID will act as if they are part of a separate cluster, limiting failovers to nodes inside the group until there are no more failover candidates, at which time a node in a different group (virtual ID) will be selected as the new primary during a failover.
This means that you would assign nodes in the same region or datacenter the same DDG-ID.
There is still only a single write Primary amongst all the nodes in all the regions, just like Composite Active/Active (CAP).
Unlike CAP, if all nodes in the datacenter containing the Primary node were gone, a node in a different location would be promoted to Primary.
The networks between the datacenters or regions must be low latency, similar to LAN speed, for this feature to work properly.
Also, the node in the same group with the most THL downloaded will be selected as the new Primary. If no node is available in the same group, the node with the most THL available is selected from a different group.
To illustrate the new topology, imagine a 5-node standard cluster spanning 3 datacenters with 2 nodes in DC-A, 2 nodes in DC-B and 1 node in DC-C.
Nodes in DC-A have DDG-ID of 100, nodes in DC-B have DDG-ID of 200, and nodes in DC-C have DDG-ID of 300.
Below are the failure scenarios and resulting actions:
Primary fails
Failover to any healthy Replica in the same Region/Datacenter (virtual ID group)
Entire Region/Datacenter containing the Primary node fails
Failover to any healthy Replica in a different Region/Datacenter (virtual ID group)
Network partition between any two Regions/Datacenters
No action, quorum is maintained by the majority of Managers.
Application servers not in the Primary Datacenter will fail to connect
Network partition between all Regions/Datacenters
All nodes FAILSAFE/SHUNNED
Any two Regions/Datacenters offline
All nodes FAILSAFE/SHUNNED
Manual intervention to recover the cluster will be required any time the cluster is placed into the FAILSAFE/SHUNNED state.
When configured as per the above example, the ls output from within cctrl will look like the following:
DATASOURCES:
+---------------------------------------------------------------------------------+
|db1-demo.continuent.com(master:ONLINE, progress=0, THL latency=0.495) |
|STATUS [OK] [2023/06/23 05:46:52 PM UTC][SSL] |
|DATASOURCE GROUP(id=100) |
+---------------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=master, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+---------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------+
|db2-demo.continuent.com(slave:ONLINE, progress=0, latency=0.978) |
|STATUS [OK] [2023/06/23 05:46:51 PM UTC][SSL] |
|DATASOURCE GROUP(id=100) |
+---------------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=slave, master=db1-demo.continuent.com, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=4, active=0) |
+---------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------+
|db3-demo.continuent.com(slave:ONLINE, progress=0, latency=0.705) |
|STATUS [OK] [2023/06/23 05:46:51 PM UTC][SSL] |
|DATASOURCE GROUP(id=200) |
+---------------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=slave, master=db1-demo.continuent.com, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=4, active=0) |
+---------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------+
|db4-demo.continuent.com(slave:ONLINE, progress=0, latency=2.145) |
|STATUS [OK] [2023/06/23 05:46:54 PM UTC][SSL] |
|DATASOURCE GROUP(id=200) |
+---------------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=slave, master=db1-demo.continuent.com, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+---------------------------------------------------------------------------------+
WITNESSES:
+---------------------------------------------------------------------------------+
|db5-demo.continuent.com(witness:ONLINE) |
|DATASOURCE GROUP(id=300) |
+---------------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
+---------------------------------------------------------------------------------+
Configuration is very easy: pick an integer ID for each set of nodes you wish to group together, usually based upon location such as region or datacenter.
Follow the steps for deploying and configuring a standard cluster detailed in
Section 3.1, “Deploying Standalone HA Clusters”, with one simple addition to the configuration:
add a new line to the [defaults]
section
of the /etc/tungsten/tungsten.ini
file on every node,
including Connector-only nodes, for example:
[defaults]
datasource-group-id=100
The new tpm configuration option datasource-group-id
defines which
Distributed Datasource Group the node belongs to. The new entry must be in the
[defaults]
section of the configuration.
Omitting datasource-group-id
from your configuration, or setting the
value to 0, disables this feature. Any positive integer greater than 0 enables Distributed Datasource Groups.
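As a quick check after installation or after a tpm update, you can confirm that each node picked up the expected group ID by filtering the configuration reported by tpm reverse. This is only a sketch; it assumes the installed tpm command is in the PATH of the tungsten user on each node, and the exact output format may vary by release.
shell> tpm reverse | grep datasource-group-id
Each node should report the ID assigned to its own Region/Datacenter (100, 200 or 300 in the example above).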
To stop all of the services associated with a dataservice node, use the stopall script:
shell> stopall
Stopping Tungsten Connector...
Stopped Tungsten Connector.
Stopping Tungsten Replicator Service...
Stopped Tungsten Replicator Service.
Stopping Tungsten Manager Service...
Stopped Tungsten Manager Service.
To start all services, use the startall script:
shell> startall
Starting Tungsten Manager Service...
Starting Tungsten Replicator Service...
Starting Tungsten Connector...
Restarting a running replicator temporarily stops and restarts
replication. Either set
MAINTENANCE
mode within
cctrl (see Section 6.15, “Performing Database or OS Maintenance”)
or shun the datasource before restarting the replicator
(see Section 6.3.6.1, “Shunning a Datasource”).
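For example, a minimal sketch of the shun-and-restart sequence for a single Replica, assuming a datasource named db2 and that the commands are run on db2 as the tungsten user (depending on the datasource state, a recover may be required instead of a welcome):
shell> cctrl
cctrl> datasource db2 shun
cctrl> exit
shell> replicator stop
shell> replicator start
shell> cctrl
cctrl> datasource db2 welcome
cctrl> exit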
To shutdown a running Tungsten Replicator you must switch off the replicator:
shell> replicator stop
Stopping Tungsten Replicator Service...
Stopped Tungsten Replicator Service.
To start the replicator service if it is not already running:
shell> replicator start
Starting Tungsten Replicator Service...
Restarting the connector service will interrupt the communication of any running application or client connecting through the connector to MySQL.
To shut down a running Tungsten Connector you must switch off the connector:
shell> connector stop
Stopping Tungsten Connector Service...
Stopped Tungsten Connector Service.
To start the connector service if it is not already running:
shell> connector start
Starting Tungsten Connector Service...
Waiting for Tungsten Connector Service.....
running: PID:12338
If the cluster was configured with
auto-enable=false
then you will need to
put each node online individually.
The manager service is designed to monitor the status and operation of each of the datasources within the dataservice. In the event that the manager has become confused about the current configuration, for example due to a network or node failure, the managers can be restarted. This forces the managers to update their current status and topology information.
Before restarting managers, the dataservice should be placed in maintenance policy mode. In maintenance mode, the connectors will continue to service requests and the manager restart will not be treated as a failure.
To restart the managers across an entire dataservice, each manager will need to be restarted. The dataservice must be placed in maintenance policy mode first, then (a scripted sketch of the restart loop follows these steps):
To set the maintenance policy mode:
[LOGICAL:EXPERT] /dsone > set policy maintenance
On each datasource in the dataservice:
Stop the service:
shell> manager stop
Then start the manager service:
shell> manager start
Once all the managers have been restarted, set the policy mode back to automatic:
[LOGICAL:EXPERT] /alpha > set policy automatic
policy mode is now AUTOMATIC
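Where passwordless ssh is available between the hosts, the stop/start loop itself can be run from a single node. This is only a sketch, assuming three nodes named db1, db2 and db3, and that the dataservice has already been placed in maintenance policy mode as described above:
shell> for host in db1 db2 db3; do
         ssh ${host} manager stop
         ssh ${host} manager start
       done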
Restarting a running replicator temporarily stops and restarts replication. When using Multi-Site/Active-Active, restarting the additional replicator will stop replication between sites.
These instructions assume you have installed the additional replicator
with the --executable-prefix=mm
option.
If not, you should go to
/opt/replicator/tungsten/tungsten-replicator/bin
and run the replicator command directly.
To shutdown a running Tungsten Replicator you must switch off the replicator:
shell> mm_replicator stop
Stopping Tungsten Replicator Service...
Stopped Tungsten Replicator Service.
To start the replicator service if it is not already running:
shell> mm_replicator start
Starting Tungsten Replicator Service...
By default, Tungsten Cluster does not start automatically on boot. To enable Tungsten Cluster to start at boot time, use the deployall script provided in the installation directory to create the necessary boot scripts:
shell> sudo /opt/continuent/tungsten/cluster-home/bin/deployall
Adding system startup for /etc/init.d/tmanager ...
/etc/rc0.d/K80tmanager -> ../init.d/tmanager
/etc/rc1.d/K80tmanager -> ../init.d/tmanager
/etc/rc6.d/K80tmanager -> ../init.d/tmanager
/etc/rc2.d/S80tmanager -> ../init.d/tmanager
/etc/rc3.d/S80tmanager -> ../init.d/tmanager
/etc/rc4.d/S80tmanager -> ../init.d/tmanager
/etc/rc5.d/S80tmanager -> ../init.d/tmanager
Adding system startup for /etc/init.d/treplicator ...
/etc/rc0.d/K81treplicator -> ../init.d/treplicator
/etc/rc1.d/K81treplicator -> ../init.d/treplicator
/etc/rc6.d/K81treplicator -> ../init.d/treplicator
/etc/rc2.d/S81treplicator -> ../init.d/treplicator
/etc/rc3.d/S81treplicator -> ../init.d/treplicator
/etc/rc4.d/S81treplicator -> ../init.d/treplicator
/etc/rc5.d/S81treplicator -> ../init.d/treplicator
Adding system startup for /etc/init.d/tconnector ...
/etc/rc0.d/K82tconnector -> ../init.d/tconnector
/etc/rc1.d/K82tconnector -> ../init.d/tconnector
/etc/rc6.d/K82tconnector -> ../init.d/tconnector
/etc/rc2.d/S82tconnector -> ../init.d/tconnector
/etc/rc3.d/S82tconnector -> ../init.d/tconnector
/etc/rc4.d/S82tconnector -> ../init.d/tconnector
/etc/rc5.d/S82tconnector -> ../init.d/tconnector
To disable automatic startup at boot time, use the undeployall command:
shell> sudo /opt/continuent/tungsten/cluster-home/bin/undeployall
Because there is an additional Tungsten Replicator running, each must be individually configured to start up on boot:
For the Tungsten Cluster service, use Section 4.4, “Configuring Startup on Boot”.
For the Tungsten Replicator service, a custom startup script must be created, otherwise the replicator will be unable to start as it has been configured in a different directory.
Create a link from the Tungsten Replicator service startup script in
the operating system startup directory
(/etc/init.d
):
shell> sudo ln -s /opt/replicator/tungsten/tungsten-replicator/bin/replicator /etc/init.d/mmreplicator
Stop the Tungsten Replicator process. Failure to do this will cause issues because the service will no longer recognize the existing PID file and report it is not running.
shell> /etc/init.d/mmreplicator stop
Modify the APP_NAME
variable
within the startup script
(/etc/init.d/mmreplicator
)
to mmreplicator (a non-interactive sed alternative is sketched after these steps):
APP_NAME="mmreplicator"
Start the Tungsten Replicator process.
shell> /etc/init.d/mmreplicator start
Update the operating system startup configuration to use the updated script.
On Debian/Ubuntu:
shell> sudo update-rc.d mmreplicator defaults
On RedHat/CentOS:
shell> sudo chkconfig --add mmreplicator
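As referenced above, the APP_NAME edit can also be applied non-interactively. This is only a sketch, assuming the stock startup script assigns APP_NAME near the top of the file:
shell> sudo sed -i 's/^APP_NAME=.*/APP_NAME="mmreplicator"/' /etc/init.d/mmreplicator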
To upgrade an existing installation of Tungsten Cluster, the new distribution must be downloaded and unpacked, and the included tpm command used to update the installation. The upgrade process implies a small period of downtime for the cluster as the updated versions of the tools are restarted. However, the process should not present as an outage to your applications, providing the steps for upgrading the connectors are followed carefully. Any downtime is deliberately kept to a minimum, and the cluster should be in the same operational state once the upgrade has finished as it was when the upgrade was started.
During the update process, the cluster will be in MAINTENANCE
mode.
This is intentional to prevent unwanted failovers during the process, however it is
important to understand that should the primary fail for genuine reasons NOT
associated with the upgrade, then failover will also not happen at that time.
It is important to ensure clusters are returned to the AUTOMATIC
state
as soon as all Maintenance operations are complete and the cluster is stable.
It is NOT advised to perform rolling upgrades of the Tungsten software, to avoid miscommunication between components running older/newer versions of the software that may prevent switches/failovers from occurring; therefore it is recommended to upgrade all nodes in place. The upgrade process places the cluster into MAINTENANCE mode, which in itself avoids outages whilst components are restarted, and allows for a successful upgrade.
From version 7.1.0 onwards, the JGroups libraries were upgraded. This means that when upgrading to any release from 7.1.0 onwards FROM any release OLDER than 7.1.0, all nodes must be upgraded before full cluster communication will be restored. For that reason, upgrades to 7.1.0 onwards from an OLDER release MUST be performed together, ensuring the cluster is only running with a mix of manager versions for as little time as possible. When upgrading nodes, do NOT SHUN the node, otherwise you will not be able to recover the node into the cluster until all nodes are upgraded, which could result in an outage to your applications. Additionally, do NOT perform a switch until all nodes are upgraded. This means you should upgrade the master node in situ. Providing the cluster is in MAINTENANCE, this will not cause an outage and the cluster can still be upgraded with no visible outage to your applications.
For INI file upgrades, see Section 4.5.2, “Upgrading when using INI-based configuration, or without ssh Access”
Before performing an upgrade, please ensure that you have checked the Appendix B, Prerequisites, as software and system requirements may have changed between versions and releases.
To perform an upgrade of an entire cluster from a staging directory installation, where you have ssh access to the other hosts in the cluster:
On your staging server, download the release package.
Unpack the release package:
shell> tar zxf tungsten-clustering-7.1.4-10.tar.gz
Change to the extracted directory:
shell> cd tungsten-clustering-7.1.4-10
The next step depends on your existing deployment:
If you are upgrading a Multi-Site/Active-Active deployment:
If you installed the original service by making use of the
$CONTINUENT_PROFILES
and
$REPLICATOR_PROFILES
environment variables, no
further action needs to be taken to update the configuration
information. Confirm that these variables are set before
performing the validation and update.
If you did not use these environment variables when deploying the solution, you must load the existing configuration from the current hosts in the cluster before continuing by using tpm fetch:
shell> ./tools/tpm fetch --hosts=east1,east2,east3,west1,west2,west3 \
--user=tungsten --directory=/opt/continuent
You must specify ALL the hosts
within both clusters within the current deployment when fetching
the configuration; use of the
autodetect
keyword will
not collect the correct information.
If you are upgrading any other deployment:
If you are using the $CONTINUENT_PROFILES
variable to specify a location for your configuration, make sure
that the variable has been set correctly.
If you are not using $CONTINUENT_PROFILES
, a copy
of the existing configuration must be fetched from the installed
Tungsten Cluster installation:
shell> ./tools/tpm fetch --hosts=host1,host2,host3,autodetect \
--user=tungsten --directory=/opt/continuent
You must use the version of tpm from within the staging directory (./tools/tpm) of the new release, not the tpm installed with the current release.
The current configuration information will be retrieved to be used for the upgrade:
shell> ./tools/tpm fetch --hosts=host1,host2,host3 --user=tungsten --directory=/opt/continuent
.......
NOTE >> Configuration loaded from host1,host2,host3
Check that the update configuration matches what you expect by using tpm reverse:
shell> ./tools/tpm reverse
# Options for the dsone data service
tools/tpm configure dsone \
--application-password=password \
--application-port=3306 \
--application-user=app_user \
--connectors=host1,host2,host3 \
--datasource-log-directory=/var/log/mysql \
--install-directory=/opt/continuent \
--master=host1 \
--members=host1,host2,host3 \
'--profile-script=~/.bashrc' \
--replication-password=password \
--replication-port=13306 \
--replication-user=tungsten \
--start-and-report=true \
--user=tungsten \
--witnesses=192.168.0.1
Run the upgrade process:
shell> ./tools/tpm update
During the update process, tpm may report errors or warnings that were not previously reported as problems. This is due to new features or functionality in different MySQL releases and Tungsten Cluster updates. These issues should be addressed and the tpm update command re-executed.
The following additional options are available when updating:
--no-connectors
(optional)
By default, an update process will restart all services, including the connector. Adding this option prevents the connectors from being restarted. If this option is used, the connectors must be manually updated to the new version during a quieter period. This can be achieved by running on each host the command:
shell> tpm promote-connector
This will result in a short period of downtime (couple of seconds) only on the host concerned, while the other connectors in your configuration keep running. During the upgrade, the Connector is restarted using the updated software and/or configuration.
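Putting these pieces together, a rolling connector update might look like the following sketch, assuming the main update has already been run from the staging directory and that each connector host is drained of application traffic before it is promoted:
shell> ./tools/tpm update --no-connectors
# later, on each connector host in turn, during a quiet period:
shell> tpm promote-connector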
A successful update will report the cluster status as determined from each host in the cluster:
...........................................................................................................
Getting cluster status on host1
Tungsten Clustering (for MySQL) 7.1.4 build 10
connect to 'dsone@host1'
dsone: session established
[LOGICAL] /dsone > ls
COORDINATOR[host3:AUTOMATIC:ONLINE]
ROUTERS:
+----------------------------------------------------------------------------+
|connector@host1[31613](ONLINE, created=0, active=0) |
|connector@host2[27649](ONLINE, created=0, active=0) |
|connector@host3[21475](ONLINE, created=0, active=0) |
+----------------------------------------------------------------------------+
...
#####################################################################
# Next Steps
#####################################################################
We have added Tungsten environment variables to ~/.bashrc.
Run `source ~/.bashrc` to rebuild your environment.
Once your services start successfully you may begin to use the cluster.
To look at services and perform administration, run the following command
from any database server.
$CONTINUENT_ROOT/tungsten/tungsten-manager/bin/cctrl
Configuration is now complete. For further information, please consult
Tungsten documentation, which is available at docs.continuent.com.
NOTE >> Command successfully completed
The update process should now be complete. The current version can be confirmed by starting cctrl.
To perform an upgrade of an individual node, tpm can be used on the individual host. The same method can be used to upgrade an entire cluster without requiring tpm to have ssh access to the other hosts in the dataservice.
Before performing an upgrade, please ensure that you have checked the Appendix B, Prerequisites, as software and system requirements may have changed between versions and releases.
Application traffic to the nodes will be disconnected when the connector
restarts. Use the --no-connectors
tpm option when you upgrade to prevent the connectors
from restarting until later when you want them to.
To upgrade (a consolidated sketch of these steps follows the list):
Place the cluster into maintenance mode
Upgrade the Replicas in the dataservice. Be sure to shun and welcome each Replica.
Upgrade the Primary node
Replication traffic to the Replicas will be delayed while the replicator restarts. The delays will increase if there are a large number of stored events in the THL. Old THL may be removed to decrease the delay. Do NOT delete THL that has not been received on all Replica nodes or events will be lost.
Upgrade the connectors in the dataservice one-by-one
Application traffic to the nodes will be disconnected when the connector restarts.
Place the cluster into automatic mode
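A consolidated sketch of the steps above for a single Replica, assuming a three-node cluster with db1 as the Primary and db2/db3 as Replicas, and that the new release has already been unpacked on every host:
shell> cctrl
cctrl> set policy maintenance
cctrl> datasource db2 shun
cctrl> exit
# on db2, from the extracted directory:
shell> ./tools/tpm update --replace-release
shell> cctrl
cctrl> datasource db2 welcome
cctrl> exit
# repeat for db3, then upgrade db1 (the Primary) in place,
# upgrade the connectors one-by-one, and finally:
shell> cctrl
cctrl> set policy automatic
cctrl> exit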
For more information on performing maintenance across a cluster, see Section 6.15.3, “Performing Maintenance on an Entire Dataservice”.
To upgrade a single host using the tpm command:
Download the release package.
Unpack the release package:
shell> tar zxf tungsten-clustering-7.1.4-10.tar.gz
Change to the extracted directory:
shell> cd tungsten-clustering-7.1.4-10
Execute tpm update, specifying the installation directory. This will update only this host:
shell> ./tools/tpm update --replace-release
To update all of the nodes within a cluster, the steps above will need to be performed individually on each host.
These steps are designed to guide you in the safe conversion of an existing Multi-Site/Active-Active (MSAA) topology to a Composite Active/Passive (CAP) topology, based on an ini installation.
For details of the difference between these two topologies, please review the following pages:
It is very important to follow all the below steps and ensure full backups are taken when instructed. These steps can be destructive and without proper care and attention, data loss, data corruption or a split-brain scenario can happen.
Parallel apply MUST be disabled before starting your upgrade. You may re-enable it once the upgrade has been fully completed. See Section 4.1.5.3, “How to Disable Parallel Replication Safely” and Section 4.1.2, “Enabling Parallel Apply During Install” for more information.
The examples in this section are based on three clusters named 'nyc', 'london' and 'tokyo'
Each cluster has two dedicated connectors on separate hosts.
The converted cluster will consist of a Composite Service named 'global' and the 'nyc' cluster will be the Active cluster, with 'london' and 'tokyo' as Passive clusters.
If you do not have exactly three clusters, please adjust this procedure to match your environment.
Examples of before and after tungsten.ini files can be downloaded here:
If you are currently installed using a staging-based installation, you must
convert to an INI-based installation for this process to be completed with
minimal risk and minimal interruption. For notes on how
to perform the staging to INI file conversion using the
translatetoini.pl
script, please visit Section 10.4.6, “Using the translatetoini.pl
Script”.
Parallel apply MUST be disabled before starting your upgrade. You may re-enable it once the upgrade has been fully completed. See Section 4.1.5.3, “How to Disable Parallel Replication Safely” and Section 4.1.2, “Enabling Parallel Apply During Install” for more information.
Obtain the latest Tungsten Cluster software build and place it
within /opt/continuent/software
If you are not upgrading, just converting, then this step is not required since you will already have the extracted software bundle available.
Extract the package
The examples below refer to the
tungsten_prep_upgrade
script, this can be located
in the extracted software package within the
tools
directory.
Take a full and complete backup of one node - this can be a Replica, and should preferably be performed by either of the following (a sketch of the first option follows this list):
Percona xtrabackup whilst database is open
Manual backup of all datafiles after stopping the database instance
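As an illustration of the first option, a backup with Percona XtraBackup might look like the following sketch; the credentials and target directory shown here are assumptions and should be adapted to your environment:
shell> xtrabackup --backup --user=tungsten --password=secret \
    --target-dir=/backups/pre-conversion-$(date +%Y%m%d)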
A big difference between Multi-Site/Active-Active (MSAA) and Composite Active/Passive (CAP) is that with MSAA, clients can write into all clusters. With CAP, clients only write into a single cluster.
To be able to complete this conversion process with minimal interruption and risk, it is essential that clients are redirected and only able to write into a single cluster. This cluster will become the ACTIVE cluster after the conversion. For the purpose of this procedure, we will use the 'nyc' cluster for this role.
After redirecting your client applications to connect through the connectors associated with the 'nyc' cluster, stop the connectors associated with the remaining clusters as an extra safeguard against writes happening.
On every connector node associated with london and tokyo :
shell> connector stop
Enable Maintenance mode on all clusters using the cctrl command:
shell>cctrl
cctrl>set policy maintenance
Typically the cross-site replicators will be installed within
/opt/replicator
, if you have installed this in a
different location you will need to pass this to the script in the
examples using the --path option
The following commands tell the replicators to go offline at a specific point, in this case when they receive an explicit heartbeat. This is to ensure that all the replicators stop at the same sequence number and binary log position. The replicators will NOT be offline until the explicit heartbeat has been issued a bit later in this step.
On every nyc node:
shell> ./tungsten_prep_upgrade -o
~or~
shell> ./tungsten_prep_upgrade --service london --offline
shell> ./tungsten_prep_upgrade --service tokyo --offline
On every london node:
shell> ./tungsten_prep_upgrade -o
~or~
shell> ./tungsten_prep_upgrade --service nyc --offline
shell> ./tungsten_prep_upgrade --service tokyo --offline
On every tokyo node:
shell> ./tungsten_prep_upgrade -o
~or~
shell> ./tungsten_prep_upgrade --service london --offline
shell> ./tungsten_prep_upgrade --service nyc --offline
Next, on the Primary hosts within each
cluster we issue the heartbeat, execute the following using the
cluster-specific trepctl, typically in
/opt/continuent
:
shell> trepctl heartbeat -name offline_for_upg
Ensure that every cross-site replicator on every node is now in the
OFFLINE:NORMAL
state:
shell> mmtrepctl status
~or~
shell> mmtrepctl --service {servicename} status
Capture the position of the cross-site replicators on all nodes in all clusters.
The service name provided should be the name of the remote service(s) for this cluster, so for example in the london cluster you get the positions for nyc and tokyo, and in nyc you get the position for london and tokyo, etc.
On every london node:
shell> ./tungsten_prep_upgrade -g
~or~
shell> ./tungsten_prep_upgrade --service nyc --get
(NOTE: saves to ~/position-nyc-YYYYMMDDHHMMSS.txt)
shell> ./tungsten_prep_upgrade --service tokyo --get
(NOTE: saves to ~/position-tokyo-YYYYMMDDHHMMSS.txt)
On every nyc node:
shell> ./tungsten_prep_upgrade -g
~or~
shell> ./tungsten_prep_upgrade --service london --get
(NOTE: saves to ~/position-london-YYYYMMDDHHMMSS.txt)
shell> ./tungsten_prep_upgrade --service tokyo --get
(NOTE: saves to ~/position-tokyo-YYYYMMDDHHMMSS.txt)
On every tokyo node:
shell> ./tungsten_prep_upgrade -g
~or~
shell> ./tungsten_prep_upgrade --service london --get
(NOTE: saves to ~/position-london-YYYYMMDDHHMMSS.txt)
shell> ./tungsten_prep_upgrade --service nyc --get
(NOTE: saves to ~/position-nyc-YYYYMMDDHHMMSS.txt)
Finally, to complete this step, stop the cross-site replicators on all nodes:
shell> ./tungsten_prep_upgrade --stop
On every node in each intended Passive cluster (london and tokyo), export the tracking schema associated with the intended Active cluster (nyc).
Note the generated dump file is called tungsten_global.dmp
.
global refers to the name of the intended Composite Cluster service; if you choose
a different service name, change this accordingly.
On every london node:
shell> mysqldump --opt --single-transaction tungsten_nyc > ~/tungsten_global.dmp
On every tokyo node:
shell> mysqldump --opt --single-transaction tungsten_nyc > ~/tungsten_global.dmp
To uninstall the cross-site replicators, execute the following on every node:
shell>cd {replicator software path}
shell>tools/tpm uninstall --i-am-sure
In this step, we pre-create the database for the composite service tracking schema. We are using global as the service name in this example; if you choose a different Composite service name, adjust this accordingly.
On every node in all clusters:
shell> mysql -e 'set session sql_log_bin=0; create database tungsten_global'
This step reloads the tracking schema associated with the intended Active cluster (nyc) into the tracking schema we created in the previous step. This should ONLY be carried out within the intended Passive clusters at this stage.
We DO NOT want the reloading of this schema to appear in the binary logs on the Primary, therefore the reload needs to be performed on each node individually:
On every london node:
shell> mysql -e 'set session sql_log_bin=0; use tungsten_global; source ~/tungsten_global.dmp;'
On every tokyo node:
shell> mysql -e 'set session sql_log_bin=0; use tungsten_global; source ~/tungsten_global.dmp;'
On every node in every cluster:
shell> replicator stop
The effect of this step will now mean that only the Primary node in the Active cluster will be up to date with ongoing data changes. You must ensure that your applications handle this accordingly until the replicators are restarted at Step 14
This step, if not followed correctly, could be destructive to the entire conversion. It is CRITICAL that this step is NOT performed on the intended Active cluster (nyc)
By default, THL files will be located within
/opt/continuent/thl
, if you have configured this in a
different location you will need to adjust the path below accordingly
On every london node:
shell>cd /opt/continuent/thl
shell>rm */thl*
On every tokyo node:
shell>cd /opt/continuent/thl
shell>rm */thl*
On every node within the intended Active cluster (nyc), export the tracking schema associated with the local service
Note the generated dump file is called tungsten_global.dmp
.
global refers to the name of the intended Composite Cluster service; if you choose
a different service name, change this accordingly.
On every nyc node:
shell> mysqldump --opt --single-transaction tungsten_nyc > ~/tungsten_global.dmp
This step reloads the tracking schema associated with the intended Active cluster (nyc) into the tracking schema we created in the earlier step.
We DO NOT want the reloading of this schema to appear in the binary logs on the Primary, therefore the reload needs to be performed on each node individually:
On every nyc node:
shell> mysql -e 'set session sql_log_bin=0; use tungsten_global; source ~/tungsten_global.dmp;'
Update /etc/tungsten/tungsten.ini
to a valid Composite Active/Passive
config. An example of a valid config is as follows, a sample can also be downloaded from
Section 4.5.3.1, “Conversion Prerequisites” above:
Within a Composite Active/Passive topology, the ini file must be identical on EVERY node, including Connector Nodes
[defaults]
user=tungsten
home-directory=/opt/continuent
application-user=app_user
application-password=secret
application-port=3306
profile-script=~/.bash_profile
replication-user=tungsten
replication-password=secret
mysql-allow-intensive-checks=true
skip-validation-check=THLSchemaChangeCheck

[nyc]
topology=clustered
master=db1
slaves=db2,db3
connectors=nyc-conn1,nyc-conn2

[london]
topology=clustered
master=db4
slaves=db5,db6
connectors=ldn-conn1,ldn-conn2
relay-source=nyc

[tokyo]
topology=clustered
master=db7
slaves=db8,db9
connectors=tky-conn1,tky-conn2
relay-source=nyc

[global]
composite-datasources=nyc,london,tokyo
Validate and install the new release on all nodes in the Active (nyc) cluster only:
shell>cd /opt/continuent/software/tungsten-clustering-7.1.4-10
shell>tools/tpm validate-update
If validation shows no errors, run the install:
shell> tools/tpm update --replace-release
After the installation is complete on all nodes in the Active cluster, restart the replicator services:
shell> replicator start
After restarting, check the status of the replicator using the trepctl and check that all replicators are ONLINE:
shell> trepctl status
Validate and install the new release on all nodes in the remaining Passive clusters (london and tokyo):
The update should be performed on the Primary nodes within each cluster
first. Validation will report an error stating that the roles conflict (Primary vs Relay).
This is expected, and to override this warning the -f
option should
be used on the Primary nodes only.
shell>cd /opt/continuent/software/tungsten-clustering-7.1.4-10
shell>tools/tpm validate-update
If validation shows no errors, run the install:
On Primary Nodes:
shell> tools/tpm update --replace-release -f
On Replica Nodes:
shell> tools/tpm update --replace-release
After the installation is complete on all nodes in the Passive clusters, restart the replicator services:
shell> replicator start
After restarting, check the status of the replicator using the trepctl and check that all replicators are ONLINE:
shell> trepctl status
Following the upgrades, there are a number of "clean-up" steps that we need to perform within cctrl to ensure the datasource roles have been converted from the previous "master" roles to "relay" roles.
The following steps can be performed in a single cctrl session initiated from any node within any cluster
shell> cctrl

Connect to Active cluster
cctrl> use nyc

Check Status and verify all nodes online
cctrl> ls

Connect to COMPOSITE service
cctrl> use global

Place Active service online
cctrl> datasource nyc online

Connect to london Passive service
cctrl> use london

Convert old Primary to relay
cctrl> set force true
cctrl> datasource oldPrimaryhost offline
cctrl> datasource oldPrimaryhost relay

Repeat on tokyo Passive service
cctrl> use tokyo
cctrl> set force true
cctrl> datasource oldPrimaryhost offline
cctrl> datasource oldPrimaryhost relay

Connect to COMPOSITE service
cctrl> use global

Place Passive services online
cctrl> datasource london online
cctrl> datasource tokyo online

Place all clusters into AUTOMATIC
cctrl> set policy automatic
Validate and install the new release on all connectors nodes:
shell>cd /opt/continuent/software/tungsten-clustering-7.1.4-10
shell>tools/tpm validate-update
If validation shows no errors, run the install:
shell> tools/tpm update --replace-release
After upgrading previously stopped connectors, you will need to restart the process:
shell> connector restart
Upgrading a running connector will initiate a restart of the connector services, which will result in any active connections being terminated. Care should therefore be taken with this process, and client redirection should be handled accordingly prior to any connector upgrade/restart.
These steps are specifically for the safe and successful upgrade (or conversion) of an existing Multi-Site/Active-Active (MSAA) topology, to a Composite Active/Active (CAA) topology.
It is very important to follow all the below steps and ensure full backups are taken when instructed. These steps can be destructive and without proper care and attention, data loss, data corruption or a split-brain scenario can happen.
Parallel apply MUST be disabled before starting your upgrade/conversion. You may re-enable it once the process has been fully completed. See Section 4.1.5.3, “How to Disable Parallel Replication Safely” and Section 4.1.2, “Enabling Parallel Apply During Install” for more information.
The examples in this section are based on three clusters named 'nyc', 'london' and 'tokyo'
If you do not have exactly three clusters, please adjust this procedure to match your environment.
Click here for a video of the upgrade procedure, showing the full process from start to finish...
If you are currently installed using a staging-based installation, you must
convert to an INI-based installation, since INI-based installation is the only
option supported for Composite Active/Active deployments. For notes on how
to perform the staging to INI file conversion using the
translatetoini.pl
script, please visit Section 10.4.6, “Using the translatetoini.pl
Script”.
Parallel apply MUST be disabled before starting your upgrade. You may re-enable it once the upgrade has been fully completed. See Section 4.1.5.3, “How to Disable Parallel Replication Safely” and Section 4.1.2, “Enabling Parallel Apply During Install” for more information.
Obtain the latest v6 (or greater) Tungsten Cluster software build and place it
within /opt/continuent/software
If you are not upgrading, just converting, then this step is not required since you will already have the extracted software bundle available. However you must be running v6 or greater of Tungsten Cluster to deploy a CAA topology.
Extract the package
The examples below refer to the
tungsten_prep_upgrade
script, this can be located
in the extracted software package within the
tools
directory.
Take a full and complete backup of one node - this can be a Replica, and should preferably be performed by either:
Percona xtrabackup whilst database is open
Manual backup of all datafiles after stopping the database instance
Typically the cross-site replicators will be installed within
/opt/replicator
, if you have installed this in a
different location you will need to pass this to the script in the
examples using the --path option
The following commands tell the replicators to go offline at a specific point, in this case when they receive an explicit heartbeat. This is to ensure that all the replicators stop at the same sequence number and binary log position. The replicators will NOT be offline until the explicit heartbeat has been issued a bit later in this step.
On every nyc node:
shell> ./tungsten_prep_upgrade -o
~or~
shell> ./tungsten_prep_upgrade --service london --offline
shell> ./tungsten_prep_upgrade --service tokyo --offline
On every london node:
shell> ./tungsten_prep_upgrade -o
~or~
shell> ./tungsten_prep_upgrade --service nyc --offline
shell> ./tungsten_prep_upgrade --service tokyo --offline
On every tokyo node:
shell> ./tungsten_prep_upgrade -o
~or~
shell> ./tungsten_prep_upgrade --service london --offline
shell> ./tungsten_prep_upgrade --service nyc --offline
Next, on the Primary hosts within each
cluster we issue the heartbeat, execute the following using the
cluster-specific trepctl, typically in
/opt/continuent
:
shell> trepctl heartbeat -name offline_for_upg
Ensure that every cross-site replicator on every node is now in the
OFFLINE:NORMAL
state:
shell> mmtrepctl status
~or~
shell> mmtrepctl --service {servicename} status
Capture the position of the cross-site replicators on all nodes in all clusters.
The service name provided should be the name of the remote service(s) for this cluster, so for example in the london cluster you get the positions for nyc and tokyo, and in nyc you get the position for london and tokyo, etc.
On every london node:
shell> ./tungsten_prep_upgrade -g
~or~
shell> ./tungsten_prep_upgrade --service nyc --get
(NOTE: saves to ~/position-nyc-YYYYMMDDHHMMSS.txt)
shell> ./tungsten_prep_upgrade --service tokyo --get
(NOTE: saves to ~/position-tokyo-YYYYMMDDHHMMSS.txt)
On every nyc node:
shell> ./tungsten_prep_upgrade -g
~or~
shell> ./tungsten_prep_upgrade --service london --get
(NOTE: saves to ~/position-london-YYYYMMDDHHMMSS.txt)
shell> ./tungsten_prep_upgrade --service tokyo --get
(NOTE: saves to ~/position-tokyo-YYYYMMDDHHMMSS.txt)
On every tokyo node:
shell> ./tungsten_prep_upgrade -g
~or~
shell> ./tungsten_prep_upgrade --service london --get
(NOTE: saves to ~/position-london-YYYYMMDDHHMMSS.txt)
shell> ./tungsten_prep_upgrade --service nyc --get
(NOTE: saves to ~/position-nyc-YYYYMMDDHHMMSS.txt)
Finally, to complete this step, stop the replicators on all nodes:
shell> ./tungsten_prep_upgrade --stop
On every node in each cluster, export the tracking schema for the cross-site replicator
Similar to the above step 2 when you captured the cross-site position, the same applies here, in london you export/backup nyc and tokyo, and in nyc you export/backup london and tokyo, and finally in tokyo you export/backup nyc and london.
On every london node:
shell> ./tungsten_prep_upgrade -d --alldb
~or~
shell> ./tungsten_prep_upgrade --service nyc --dump
shell> ./tungsten_prep_upgrade --service tokyo --dump
On every nyc node:
shell> ./tungsten_prep_upgrade -d --alldb
~or~
shell> ./tungsten_prep_upgrade --service london --dump
shell> ./tungsten_prep_upgrade --service tokyo --dump
On every tokyo node:
shell> ./tungsten_prep_upgrade -d --alldb
~or~
shell> ./tungsten_prep_upgrade --service london --dump
shell> ./tungsten_prep_upgrade --service nyc --dump
To uninstall the cross-site replicators, execute the following on every node:
shell>cd {replicator software path}
shell>tools/tpm uninstall --i-am-sure
We DO NOT want the reloading of this schema to appear in the binary logs on the Primary, therefore the reload needs to be performed on each node individually:
On every london node:
shell> ./tungsten_prep_upgrade -s nyc -u tungsten -w secret -r
shell> ./tungsten_prep_upgrade -s tokyo -u tungsten -w secret -r
~or~
shell> ./tungsten_prep_upgrade --service nyc --user tungsten --password secret --restore
shell> ./tungsten_prep_upgrade --service tokyo --user tungsten --password secret --restore
On every tokyo node:
shell> ./tungsten_prep_upgrade -s london -u tungsten -w secret -r
shell> ./tungsten_prep_upgrade -s nyc -u tungsten -w secret -r
~or~
shell> ./tungsten_prep_upgrade --service london --user tungsten --password secret --restore
shell> ./tungsten_prep_upgrade --service nyc --user tungsten --password secret --restore
On every nyc node:
shell> ./tungsten_prep_upgrade -s london -u tungsten -w secret -r
shell> ./tungsten_prep_upgrade -s tokyo -u tungsten -w secret -r
~or~
shell> ./tungsten_prep_upgrade --service london --user tungsten --password secret --restore
shell> ./tungsten_prep_upgrade --service tokyo --user tungsten --password secret --restore
Update /etc/tungsten/tungsten.ini
to a valid v6 CAA
configuration. An example of a valid configuration is as follows:
[defaults]
user=tungsten
home-directory=/opt/continuent
application-user=app_user
application-password=secret
application-port=3306
profile-script=~/.bash_profile
replication-user=tungsten
replication-password=secret
mysql-allow-intensive-checks=true
skip-validation-check=THLSchemaChangeCheck
start-and-report=true

[nyc]
topology=clustered
master=db1
members=db1,db2,db3
connectors=db1,db2,db3

[london]
topology=clustered
master=db4
members=db4,db5,db6
connectors=db4,db5,db6

[tokyo]
topology=clustered
master=db7
members=db7,db8,db9
connectors=db7,db8,db9

[global]
topology=composite-multi-master
composite-datasources=nyc,london,tokyo
It is critical that you ensure the master=
entry in the configuration
matches the current, live Primary host in your cluster for the purpose of this process.
Enable Maintenance mode on all clusters using the cctrl command:
shell>cctrl
cctrl>set policy maintenance
Run the update as follows:
shell> tools/tpm update --replace-release
If you had start-and-report=false you may need to restart manager services
Until all nodes have been updated, the output from cctrl may show services in an OFFLINE, STOPPED, or UNKNOWN state. This is to be expected until all the new v6 managers are online
After the installation is complete on all nodes, start the manager services:
shell> manager start
Return all clusters to Automatic mode using the cctrl command:
shell>cctrl
cctrl>set policy automatic
Identify the cross-site service name(s):
shell> trepctl services
In our example, the local cluster service will be one of
london
, nyc
or
tokyo
depending on the node you are on. The cross-site
replication services would be:
(within the london cluster)
london_from_nyc
london_from_tokyo

(within the nyc cluster)
nyc_from_london
nyc_from_tokyo

(within the tokyo cluster)
tokyo_from_london
tokyo_from_nyc
Upon installation, the new cross-site replicators will come online; it
is possible that they may be in an OFFLINE:ERROR
state due to a change in Epoch numbers. Check this on the
Primary in each cluster by looking at
the output from the trepctl command.
Check each service as needed based on the status seen above:
shell> trepctl -service london_from_nyc status
shell> trepctl -service london_from_tokyo status
~or~
shell> trepctl -service nyc_from_london status
shell> trepctl -service nyc_from_tokyo status
~or~
shell> trepctl -service tokyo_from_london status
shell> trepctl -service tokyo_from_nyc status
If the replicator is in an error state due to an epoch difference, you will see an error similar to the following:
pendingErrorSeqno      : -1
pendingExceptionMessage: Client handshake failure: Client response validation failed:
                         Log epoch numbers do not match: master source ID=db1 client source ID=db4
                         seqno=4 server epoch number=0 client epoch number=4
pipelineSource         : UNKNOWN
The above error is due to the epoch numbers changing as a result of the replicators being restarted, and the new replicators being installed.
To resolve, simply force the replicator online as follows:
shell> trepctl -service london_from_nyc online -force
shell> trepctl -service london_from_tokyo online -force
~or~
shell> trepctl -service nyc_from_london online -force
shell> trepctl -service nyc_from_tokyo online -force
~or~
shell> trepctl -service tokyo_from_london online -force
shell> trepctl -service tokyo_from_nyc online -force
If the replicator shows an error state similar to the following:
pendingErrorSeqno      : -1
pendingExceptionMessage: Client handshake failure: Client response validation failed:
                         Master log does not contain requested transaction: master source ID=db1
                         client source ID=db2 requested seqno=1237 client epoch number=0
                         master min seqno=5 master max seqno=7
pipelineSource         : UNKNOWN
The above error is possible if during install the Replica replicators came online before the Primary.
Providing the steps above have been followed, just bringing the replicator online should be enough to get the replicator to retry and carry on successfully:
shell> trepctl -service london_from_nyc online
shell> trepctl -service london_from_tokyo online
~or~
shell> trepctl -service nyc_from_london online
shell> trepctl -service nyc_from_tokyo online
~or~
shell> trepctl -service tokyo_from_london online
shell> trepctl -service tokyo_from_nyc online
Known Issue (CT-569)
During an upgrade, the tpm process will incorrectly create additional, empty, tracking schemas based on the service names of the auto-generated cross-site services.
For example, if your cluster has service names east and west, you should only have tracking schemas for tungsten_east and tungsten_west
In some cases, you will also see tungsten_east_from_west and/or tungsten_west_from_east
These tungsten_x_from_y tracking schemas will be empty
and unused. They can be safely removed by issuing DROP DATABASE
tungsten_x_from_y
on a Primary node,
or they can be safely ignored.
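For example, with service names east and west, the leftover schemas could be removed on the Primary with something like the following; verify that the schemas really are empty before dropping them:
shell> mysql -e 'DROP DATABASE tungsten_east_from_west'
shell> mysql -e 'DROP DATABASE tungsten_west_from_east'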
The following instructions should only be used if Continuent Support have explicitly provided you with a custom JAR file designed to address a problem with your deployment.
If a custom JAR has been provided by Continuent Support, the following instructions can be used to install the JAR into your installation.
Determine your staging directory or untarred installation directory:
shell> tpm query staging
Go to the appropriate host (if necessary) and the staging directory.
shell> cd tungsten-clustering-7.1.4-10
Change to the correct directory. For example, to update
Tungsten Replicator change to
tungsten-replicator/lib
; for Tungsten Manager use
tungsten-manager/lib
; for Tungsten Connector use
tungsten-connector/lib
:
shell> cd tungsten-replicator/lib
Copy the existing JAR to a backup file:
shell> cp tungsten-replicator.jar tungsten-replicator.jar.orig
Copy the replacement JAR into the directory:
shell> cp /tmp/tungsten-replicator.jar .
Change back to the root directory of the staging directory:
shell> cd ../..
Update the release:
shell> ./tools/tpm update --replace-release
This procedure should only be followed with the advice and guidance of a Continuent Support Engineer.
There are two ways we can patch the running environment, and the method chosen will depend on the severity of the patch and whether or not your use case would allow for a maintenance window:
Upgrade using a full software update following the standard upgrade procedures
Use the patch command to patch just the files necessary
From time to time, Continuent may provide you with a patch to apply as a quicker way to fix small issues. Patched software will always be provided in a subsequent release, so the manual patch method described here should only be used as a temporary measure to patch a live installation when a full software update may not immediately be possible.
You will have been supplied with a file containing the patch, for the purpose
of this example we will assume the file you have been given is called
undeployallnostop.patch
Place cluster into maintenance
mode
On each node of your installation:
Copy the supplied patch file to the host
From the installed directory (Typically this would be /opt/continuent
) issue the following:
shell>cd /opt/continuent/tungsten
shell>patch -p1 -i undeployallnostop.patch
Return cluster to automatic
mode
If a tpm update --replace-release is issued from the original software staging directory, the manual patch applied above will be over-written and removed.
The manual patch method is a temporary approach to patching a running environment, but is not a total replacement for a proper upgrade.
Following a manual patch, you MUST plan to upgrade the staged software to avoid reverting to an unpatched system.
If in doubt, always check with a Continuent Support Engineer.
v7 is a major release with many changes, specifically to security. At this time, upgrading directly to v7 is only supported from v5 onwards. If security is NOT enabled in your installation, then upgrading from an older release may work; however, any issues encountered will not be addressed, and upgrading to v6 first will be the advised route.
Whilst every care has been taken to ensure upgrades are as smooth and easy as possible, ALWAYS ensure full backups are taken before proceeding, and if possible, test the upgrade on a non-Production environment first.
Prior to v7, Tungsten came with security turned OFF through the tpm flag
disable-security-controls
set to
true
by default. This flag, when set to
false
, would translate to the following settings being
applied:
file-protection-level=0027
rmi-ssl=true
thl-ssl=true
rmi-authentication=true
jgroups-ssl=true
This would enable SSL communication between Tungsten components. However, connection to the database remained unencrypted, which would translate to the following settings being applied:
datasource-enable-ssl=false
connector-ssl=false
Setting these to true is possible, however there are many more manual steps that would have been required.
v7 enables full security by default, so the
disable-security-controls
flag will
default to false
when not specified.
In addition to the default value changing,
disable-security-controls
now enables
encrypted communication to the database. Setting this value to
false
now translates to the following settings being
applied:
file-protection-level=0027
rmi-ssl=true
thl-ssl=true
rmi-authentication=true
jgroups-ssl=true
datasource-enable-ssl=true
connector-ssl=true
In summary, this change in behavior means that upgrades need to be handled with care and appropriate decisions being made, both by the tpm process, and by the "human" to decide on what end result is desired. The various options and examples are outlined in the following sections of this document.
This is the easiest and smoothest approach. tpm will
process your configuration and do its best to maintain the same level of
security. In order to achieve that, tpm will
dynamically update your configuration (either the
tungsten.ini
file for INI installs, or the
deploy.cfg
for staging installs) with additional
properties to adjust the level of security to match.
The properties that tpm will add to your configuration will be some or all of the following depending on the initial starting point of your configuration:
disable-security-controls
connector-rest-api-ssl
manager-rest-api-ssl
replicator-rest-api-ssl
datasource-enable-ssl
enable-connector-ssl
You can now proceed with the upgrade; refer to Section 4.5.7.7, “Steps to upgrade using tpm” for the required steps
The following security setting levels can be enabled, and will require user action prior to upgrading. These are:
Internal Encryption and Authentication
Tungsten to Database Encryption
Application (Connector) to Database Encryption
API SSL
Applying all of the above steps will bring full security, equivalent to the default v7 configuration.
The steps to enable will depend on what (if any) security is enabled in your existing installation. The following sections outline the steps required to be performed to enable security for each of the various layers. To understand whether you have configured any of the various layers of security, the following summary will help to understand your configuration:
No Security
If no security has been configured, the installation that you are
starting from will have
disable-security-controls=true
(or it
will not be supplied at all) and no additional security properties will be
supplied.
Partial Security
The installation that you are starting from will have partial security in place. This could be a combination of any of the following:
Internal encryption is configured
(disable-security-controls=false
),
and/or
Connector encryption is enabled
(enable-connector-ssl=true),
and/or
Cluster to the database encryption is enabled
(datasource-enable-ssl=true
or
repl-datasource-enable-ssl=true
)
To upgrade and enable security, you should follow one or more of the following steps based on your requirements. At a minimum, the first step should always be included, the remaining steps are optional.
Prior to running the upgrade, you need to manually create the keystore, to do this follow these steps on one host, and then copy the files to all other hosts in your topology:
db1> mkdir /etc/tungsten/secure
db1> keytool -genseckey -alias jgroups -validity 3650 -keyalg Blowfish -keysize 56 \
    -keystore /etc/tungsten/secure/jgroups.jceks -storepass tungsten -keypass tungsten -storetype JCEKS
If you have an INI based install, and this is the only level of security you plan on configuring you should now copy these new keystores to all other hosts in your topology. If you plan to enable SSL at the other remaining layers, or you use a Staging based install, then skip this copy step.
db1> for host in db2 db3 db4 db5 db6; do
ssh ${host} mkdir /etc/tungsten/secure
scp /etc/tungsten/secure/*.jceks ${host}:/etc/tungsten/secure
done
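To confirm the keystore was created and copied correctly, it can be listed with keytool on each host; this is just a verification sketch using the store password chosen above:
shell> keytool -list -storetype JCEKS -keystore /etc/tungsten/secure/jgroups.jceks -storepass tungsten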
Enabling internal encryption and authentication will also enable API SSL by default.
If you need to enable encryption to the underlying database, now proceed to the next step Section 4.5.7.4, “Enable Tungsten to Database Encryption” before running the upgrade, otherwise you can then start the upgrade by following the steps in Section 4.5.7.7, “Steps to upgrade using tpm”.
The following additional configuration properties will need adding to your existing configuration. The suggested process based on an INI or Staging based install are outlined in the final upgrade steps referenced above.
disable-security-controls=false
connector-rest-api-ssl=true
manager-rest-api-ssl=true
replicator-rest-api-ssl=true
java-jgroups-keystore-path=/etc/tungsten/secure/jgroups.jceks
The following prerequisite steps must be performed before continuing with this step
In this step, you pre-create the various keystores required and register
the MySQL certificates for Tungsten. Execute all of the following steps on
a single host, for example, db1. In the example below it is assumed that
the mysql certificates reside in /etc/mysql/certs
. If
you use the example syntax below, you will also need to ensure the
following directory exists: /etc/tungsten/secure
These commands will import the MySQL certificates into the required Tungsten truststores.
db1> keytool -importkeystore -srckeystore /etc/mysql/certs/client-cert.p12 -srcstoretype PKCS12 \
    -destkeystore /etc/tungsten/secure/keystore.jks -deststorepass tungsten -srcstorepass tungsten
db1> keytool -import -alias mysql -file /etc/mysql/certs/ca.pem -keystore /etc/tungsten/secure/truststore.ts \
    -storepass tungsten -noprompt
If you have an INI based install, and you do not intend to configure SSL for your applications (via Connectors), or if your connectors reside on remote, dedicated hosts, you should now copy all of the generated keystores and truststores to all of the other hosts. If you use a Staging based install, then skip this copy step.
db1> for host in db2 db3 db4 db5 db6; do
ssh ${host} mkdir /etc/tungsten/secure
scp /etc/tungsten/secure/*.jceks ${host}:/etc/tungsten/secure
scp /etc/tungsten/secure/*.jks ${host}:/etc/tungsten/secure
scp /etc/tungsten/secure/*.ts ${host}:/etc/tungsten/secure
done
If you need to enable encryption to the underlying database from the connectors, now proceed to Section 4.5.7.5, “Enable Connector to Database Encryption” before running the upgrade, alternatively you can now follow the steps outlined in Section 4.5.7.7, “Steps to upgrade using tpm”
The following additional configuration properties will need adding to your existing configuration. The suggested process based on an INI or Staging based install are outlined in the final upgrade steps referenced above.
datasource-enable-ssl=true
java-truststore-path=/etc/tungsten/secure/truststore.ts
java-truststore-password=tungsten
java-keystore-path=/etc/tungsten/secure/keystore.jks
java-keystore-password=tungsten
datasource-mysql-ssl-cert=/etc/mysql/certs/client-cert.pem
datasource-mysql-ssl-key=/etc/mysql/certs/client-key.pem
datasource-mysql-ssl-ca=/etc/mysql/certs/ca.pem
The steps outlined in this section will need to be performed on all nodes where the Connector has been installed
If you are also enabling Internal Encryption, you would have followed the
steps in Section 4.5.7.3, “Setup internal encryption and authentication” and you would have a
number of files already in /etc/tungsten/secure. This next step will
pre-create the keystore and truststore, and register the MySQL
certificates for the Connectors. Execute all of the following steps on a
single host, in this example, db1. In the example below it is assumed the
mysql certificates reside in /etc/mysql/certs
. If you
use the example syntax below, you will need to ensure the following
directory also exists: /etc/tungsten/secure
db1> keytool -importkeystore -srckeystore /etc/mysql/certs/client-cert.p12 \
    -srcstoretype PKCS12 -destkeystore /etc/tungsten/secure/tungsten_connector_keystore.jks \
    -deststorepass tungsten -srcstorepass tungsten
db1> keytool -import -alias mysql -file /etc/mysql/certs/ca.pem \
    -keystore /etc/tungsten/secure/tungsten_connector_truststore.ts -storepass tungsten -noprompt
Now that all of the necessary steps have been taken to create the various keystores, and if you use an INI based install, you now need to copy all of these files to all other hosts in your topology. If you are using a Staging based installation, then skip this copy step.
db1> for host in db2 db3 db4 db5 db6; do
ssh ${host} mkdir /etc/tungsten/secure
scp /etc/tungsten/secure/*.jceks ${host}:/etc/tungsten/secure
scp /etc/tungsten/secure/*.jks ${host}:/etc/tungsten/secure
scp /etc/tungsten/secure/*.ts ${host}:/etc/tungsten/secure
done
Once the steps above have been performed, you can then continue with the upgrade, following the steps outlined in Section 4.5.7.7, “Steps to upgrade using tpm”
The following additional configuration properties will need adding to your existing configuration. The suggested process based on an INI or Staging based install are outlined in the final upgrade steps referenced above.
enable-connector-ssl=true
java-connector-keystore-path=/etc/tungsten/secure/tungsten_connector_keystore.jks
java-connector-keystore-password=tungsten
java-connector-truststore-path=/etc/tungsten/secure/tungsten_connector_truststore.ts
java-connector-truststore-password=tungsten
A prerequisite to enabling full security is to enable SSL within your database, if this isn't already configured. To do this, we can use the mysql_ssl_rsa_setup tool supplied with most distributions of MySQL. If you do not have this tool, or require more detail, you can refer to Section 5.13.1, “Enabling Database SSL”. The steps below summarise the process using the mysql_ssl_rsa_setup tool.
The first step is to setup the directories for the certs, perform this on ALL hosts in your topology:
shell>sudo mkdir -p /etc/mysql/certs
shell>sudo chown -R tungsten: /etc/mysql/certs/
NB: The ownership is temporarily set to tungsten so that the subsequent scp will work between hosts.
This next step should be performed on just one single host, for the purpose of this example we will use db1 as the host:
db1> mysql_ssl_rsa_setup -d /etc/mysql/certs/
db1> openssl pkcs12 -export -inkey /etc/mysql/certs/client-key.pem \
    -name mysql -in /etc/mysql/certs/client-cert.pem -out /etc/mysql/certs/client-cert.p12 \
    -passout pass:tungsten
When using OpenSSL 3.0 with Java 1.8, you MUST add the -legacy option to the openssl command.
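For example, on such a system the export shown above would become the following; this is a sketch using the same paths and tungsten passphrase as the earlier command:
db1> openssl pkcs12 -export -legacy -inkey /etc/mysql/certs/client-key.pem \
    -name mysql -in /etc/mysql/certs/client-cert.pem -out /etc/mysql/certs/client-cert.p12 \
    -passout pass:tungsten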
Next, copy the certificates to all of the other hosts in your topology:
db1> for host in db2 db3 db4 db5 db6; do
scp /etc/mysql/certs/* ${host}:/etc/mysql/certs
done
Next, on every host we need to reset the directory ownership:
shell> sudo chown -R mysql: /etc/mysql/certs/
shell> sudo chmod g+r /etc/mysql/certs/client-*
Now, on every host, we need to reconfigure MySQL. Add the following properties into your my.cnf:
[mysqld]
ssl-ca=/etc/mysql/certs/ca.pem
ssl-cert=/etc/mysql/certs/server-cert.pem
ssl-key=/etc/mysql/certs/server-key.pem

[client]
ssl-cert=/etc/mysql/certs/client-cert.pem
ssl-key=/etc/mysql/certs/client-key.pem
ssl-ca=/etc/mysql/certs/ca.pem
Next, place your cluster(s) into MAINTENANCE mode:
shell> cctrl
cctrl> set policy maintenance
Restart MySQL for the new settings to take effect
shell> sudo service mysqld restart
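Optionally, confirm that MySQL has picked up the certificates before proceeding. Any account with sufficient privileges can be used; the root account below is purely illustrative:
shell> mysql -u root -p -e "SHOW GLOBAL VARIABLES LIKE '%ssl%';"
The ssl_ca, ssl_cert and ssl_key variables should point at the files configured in my.cnf above.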
Finally, return your cluster(s) to AUTOMATIC mode:
shell> cctrl
cctrl> set policy automatic
When you are ready to perform the upgrade, the following steps should be followed:
Ensure you place your cluster(s) into MAINTENANCE mode.
If no additional steps were taken and you wish to maintain the same level of security, skip Step 3 and proceed directly to Step 4.
Update your tungsten.ini and include some, or all, of the options below, depending on which steps you took earlier. All entries should be placed within the [defaults] stanza (a combined example is shown after this list).
disable-security-controls=false
connector-rest-api-ssl=true
manager-rest-api-ssl=true
replicator-rest-api-ssl=true
java-jgroups-keystore-path=/etc/tungsten/secure/jgroups.jceks
If "Tungsten to Database Encryption" IS configured, also add:
datasource-enable-ssl=true
java-truststore-path=/etc/tungsten/secure/truststore.ts
java-truststore-password=tungsten
java-keystore-path=/etc/tungsten/secure/keystore.jks
java-keystore-password=tungsten
datasource-mysql-ssl-cert=/etc/mysql/certs/client-cert.pem
datasource-mysql-ssl-key=/etc/mysql/certs/client-key.pem
datasource-mysql-ssl-ca=/etc/mysql/certs/ca.pem
If "Tungsten to Database Encryption" IS NOT configured, also add:
datasource-enable-ssl=false
If "Application (Connector) to Database Encryption" IS configured, also add:
enable-connector-ssl=true
java-connector-keystore-path=/etc/tungsten/secure/tungsten_connector_keystore.jks
java-connector-keystore-password=tungsten
java-connector-truststore-path=/etc/tungsten/secure/tungsten_connector_truststore.ts
java-connector-truststore-password=tungsten
If "Application (Connector) to Database Encryption" IS NOT configured, also add:
enable-connector-ssl=false
If start-and-report=true, remove this value or set it to false.
Obtain the TAR or RPM package for your installation. If using a TAR file, unpack this into your software staging tree, typically /opt/continuent/software. If you use the INI install method, this needs to be performed on every host. For a staging install, this applies to the staging host only.
Change into the directory for the software
shell> cd /opt/continuent/software/tungsten-clustering-7.1.4-10
Issue the following command on all hosts.
shell> tools/tpm update --replace-release
When upgrading the connectors, you could include the optional --no-connectors option if you wish to control the restart of the connectors manually.
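For example, to perform the update but defer the connector restarts:
shell> tools/tpm update --replace-release --no-connectors
The connectors can then be restarted manually, one host at a time, when convenient:
shell> connector restart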
For Multi-Site/Active-Active topologies, you will also need to repeat the steps for the cross-site replicators
Finally, before returning the cluster(s) to AUTOMATIC mode, you will need to sync the new certificates, created by the upgrade, to all hosts. This step will be required even if you have disabled security, as these files will be used by the API and also, if you choose to enable it, THL Encryption.
From one host, copy the certificate and keystore files to ALL other hosts in your topology. The following scp command is an example assuming you are issuing from db1, and the install directory is /opt/continuent:
db1> for host in db2 db3 db4 db5 db6; do
scp /opt/continuent/share/[jpt]* ${host}:/opt/continuent/share
scp /opt/continuent/share/.[jpt]* ${host}:/opt/continuent/share
done
The examples assume you have the ability to scp between hosts as the tungsten OS user. If your security restrictions do not permit this, you will need to use alternative procedures appropriate to your environment to ensure these files are in sync across all hosts before continuing.
If the files are not in sync between hosts, the software will fail to start!
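One way to confirm that the files match before restarting is to compare checksums against the host you copied from. A minimal sketch, using the same host list as the loop above:
db1> md5sum /opt/continuent/share/[jpt]* /opt/continuent/share/.[jpt]*
db1> for host in db2 db3 db4 db5 db6; do
ssh ${host} "md5sum /opt/continuent/share/[jpt]* /opt/continuent/share/.[jpt]*"
done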
You will also need to repeat this if you have a Multi-Site/Active-Active topology for the cross-site replicators:
db1> for host in db2 db3 db4 db5 db6; do
scp /opt/replicator/share/[jpt]* ${host}:/opt/replicator/share
scp /opt/replicator/share/.[jpt]* ${host}:/opt/replicator/share
done
Restart all tungsten components, one host at a time:
shell> manager restart
shell> replicator restart
shell> connector restart
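If you enabled Application (Connector) to Database Encryption, you may optionally verify that the connector accepts SSL client connections before returning to AUTOMATIC mode. The host, port and application user below are taken from the earlier examples and are illustrative only; --ssl-mode=REQUIRED causes the client to fail if TLS cannot be negotiated:
shell> mysql -h db1 -P 3306 -u app_user -p --ssl-mode=REQUIRED -e "status"
The SSL line of the output shows the negotiated cipher.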
Return the cluster(s) to AUTOMATIC mode.
Ensure you place your cluster(s) into MAINTENANCE mode.
Obtain the TAR or RPM package for your installation. If using a TAR file, unpack this into your software staging tree, typically /opt/continuent/software. If you use the INI install method, this needs to be performed on every host. For a staging install, this applies to the staging host only.
Change into the directory for the software and fetch the configuration, e.g.:
shell> cd /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> tpm reverse > deploy.sh
If no additional steps were taken and you wish to maintain the same level of security, skip Step 5 and proceed directly to Step 6.
Edit the deploy.sh file just created, and include some, or all, of the options below, depending on which steps you took earlier (they should be placed within the tools/tpm configure defaults block).
--disable-security-controls=false
--connector-rest-api-ssl=true
--manager-rest-api-ssl=true
--replicator-rest-api-ssl=true
--java-jgroups-keystore-path=/etc/tungsten/secure/jgroups.jceks
If "Tungsten to Database Encryption" IS configured, also add:
--datasource-enable-ssl=true
--java-truststore-path=/etc/tungsten/secure/truststore.ts
--java-truststore-password=tungsten
--java-keystore-path=/etc/tungsten/secure/keystore.jks
--java-keystore-password=tungsten
--datasource-mysql-ssl-cert=/etc/mysql/certs/client-cert.pem
--datasource-mysql-ssl-key=/etc/mysql/certs/client-key.pem
--datasource-mysql-ssl-ca=/etc/mysql/certs/ca.pem
If "Tungsten to Database Encryption" IS NOT configured, also add:
--datasource-enable-ssl=false
If "Application (Connector) to Database Encryption" IS configured, also add:
--enable-connector-ssl=true
--java-connector-keystore-path=/etc/tungsten/secure/tungsten_connector_keystore.jks
--java-connector-keystore-password=tungsten
--java-connector-truststore-path=/etc/tungsten/secure/tungsten_connector_truststore.ts
--java-connector-truststore-password=tungsten
If "Application (Connector) to Database Encryption" IS NOT configured, also add:
--enable-connector-ssl=false
If start-and-report=true, remove this value or set it to false.
An example of a BEFORE and AFTER edit including all options:
shell> cat deploy.sh
# BEFORE
tools/tpm configure defaults \
--reset \
--application-password=secret \
--application-port=3306 \
--application-user=app_user \
--disable-security-controls=true \
--install-directory=/opt/continuent \
--mysql-allow-intensive-checks=true \
--profile-script=/home/tungsten/.bash_profile \
--replication-password=secret \
--replication-user=tungsten \
--start-and-report=true \
--user=tungsten
# Options for the nyc data service
tools/tpm configure nyc \
--connectors=db1,db2,db3 \
--master=db1 \
--slaves=db2,db3 \
--topology=clustered
shell> cat deploy.sh
# AFTER
tools/tpm configure defaults \
--reset \
--application-password=secret \
--application-port=3306 \
--application-user=app_user \
--install-directory=/opt/continuent \
--mysql-allow-intensive-checks=true \
--profile-script=/home/tungsten/.bash_profile \
--replication-password=secret \
--replication-user=tungsten \
--user=tungsten \
--start-and-report=false \
--disable-security-controls=false \
--connector-rest-api-ssl=true \
--manager-rest-api-ssl=true \
--replicator-rest-api-ssl=true \
--datasource-enable-ssl=true \
--java-jgroups-keystore-path=/etc/tungsten/secure/jgroups.jceks \
--java-truststore-path=/etc/tungsten/secure/truststore.ts \
--java-truststore-password=tungsten \
--java-keystore-path=/etc/tungsten/secure/keystore.jks \
--java-keystore-password=tungsten \
--enable-connector-ssl=true \
--java-connector-keystore-path=/etc/tungsten/secure/tungsten_connector_keystore.jks \
--java-connector-keystore-password=tungsten \
--java-connector-truststore-path=/etc/tungsten/secure/tungsten_connector_truststore.ts \
--java-connector-truststore-password=tungsten \
--datasource-mysql-ssl-cert=/etc/mysql/certs/client-cert.pem \
--datasource-mysql-ssl-key=/etc/mysql/certs/client-key.pem \
--datasource-mysql-ssl-ca=/etc/mysql/certs/ca.pem
# Options for the nyc data service
tools/tpm configure nyc \
--connectors=db1,db2,db3 \
--master=db1 \
--slaves=db2,db3 \
--topology=clustered
Next, source the file to load the configuration and then execute the update:
shell> source deploy.sh
shell> tools/tpm update --replace-release
You may include the optional --no-connectors option if you wish to control the restart of the connectors manually.
For Multi-Site/Active-Active topologies, you will also need to repeat the steps for the cross-site replicators
Finally, before returning the cluster(s) to AUTOMATIC mode, you will need to sync the new certificates, created by the upgrade, to all hosts. This step will be required even if you have disabled security, as these files will be used by the API and also, if you choose to enable it, THL Encryption.
From one host, copy the certificate and keystore files to ALL other hosts in your topology. The following scp command is an example assuming you are issuing from db1, and the install directory is /opt/continuent:
db1> for host in db2 db3 db4 db5 db6; do
scp /opt/continuent/share/[jpt]* ${host}:/opt/continuent/share
scp /opt/continuent/share/.[jpt]* ${host}:/opt/continuent/share
done
The examples assume you have the ability to scp between hosts as the tungsten OS user. If your security restrictions do not permit this, you will need to use alternative procedures appropriate to your environment to ensure these files are in sync across all hosts before continuing.
If the files are not in sync between hosts, the software will fail to start!
You will also need to repeat this if you have a Multi-Site/Active-Active topology for the cross-site replicators:
db1> for host in db2 db3 db4 db5 db6; do
scp /opt/replicator/share/[jpt]* ${host}:/opt/replicator/share
scp /opt/replicator/share/.[jpt]* ${host}:/opt/replicator/share
done
Restart all tungsten components, one host at a time:
shell> manager restart
shell> replicator restart
shell> connector restart
Return the cluster(s) to AUTOMATIC mode.
Once the upgrade has been completed, if you plan on using the API you will need to complete a few extra steps before you can use it. By default, after installation the API will only allow the ping method and the createAdminUser method.
To open up the API and access all of its features, you will need to configure the API User. To do this, execute the following on all hosts (Setting the value of pass to your preferred password):
shell> curl -k -H 'Content-type: application/json' --request POST 'https://127.0.0.1:8096/api/v2/createAdminUser?i-am-sure=true' \
> --data-raw '{
> "payloadType": "credentials",
> "user":"tungsten",
> "pass":"security"
> }'
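Once the admin user has been created, you can confirm that authenticated calls succeed. The example below is a sketch only: it reuses the credentials from the previous command and assumes the ping method is exposed at the path shown, so adjust both to match your environment and refer to Chapter 11 for the definitive endpoint list:
shell> curl -k -u tungsten:security --request GET 'https://127.0.0.1:8096/api/v2/ping'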
For more information on using the new API, please refer to Chapter 11, Tungsten REST API (APIv2)
Removing components from a dataservice is quite straightforward, usually involving both modifying the running service and changing the configuration. Changing the configuration is necessary to ensure that the host is not re-configured and installed when the installation is next updated.
To remove a datasource from an existing deployment there are two primary stages: removing it from the active service, and then removing it from the active configuration.
For example, to remove host6 from a service:
Check the current service state:
[LOGICAL] /alpha > ls
COORDINATOR[host1:AUTOMATIC:ONLINE]
ROUTERS:
+----------------------------------------------------------------------------+
|connector@host1[11401](ONLINE, created=17, active=0) |
|connector@host2[7998](ONLINE, created=0, active=0) |
|connector@host3[31540](ONLINE, created=0, active=0) |
|connector@host4[26829](ONLINE, created=27, active=1) |
+----------------------------------------------------------------------------+
DATASOURCES:
+----------------------------------------------------------------------------+
|host1(slave:ONLINE, progress=373, latency=0.000) |
|STATUS [OK] [2014/02/12 12:48:14 PM GMT] |
+----------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=slave, master=host6, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=30, active=0) |
+----------------------------------------------------------------------------+
+----------------------------------------------------------------------------+
|host2(slave:ONLINE, progress=373, latency=1.000) |
|STATUS [OK] [2014/01/24 05:02:34 PM GMT] |
+----------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=slave, master=host6, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+----------------------------------------------------------------------------+
+----------------------------------------------------------------------------+
|host3(slave:ONLINE, progress=373, latency=1.000) |
|STATUS [OK] [2014/02/11 03:17:08 PM GMT] |
+----------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=slave, master=host6, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=0, active=0) |
+----------------------------------------------------------------------------+
+----------------------------------------------------------------------------+
|host6(master:ONLINE, progress=373, THL latency=0.936) |
|STATUS [OK] [2014/02/12 12:39:52 PM GMT] |
+----------------------------------------------------------------------------+
| MANAGER(state=ONLINE) |
| REPLICATOR(role=master, state=ONLINE) |
| DATASERVER(state=ONLINE) |
| CONNECTIONS(created=14, active=1) |
+----------------------------------------------------------------------------+
Switch to MAINTENANCE policy mode:
[LOGICAL] /alpha > set policy maintenance
policy mode is now MAINTENANCE
Switch to administration mode:
[LOGICAL] /alpha > admin
Remove the node from the active service using the rm command. You will be warned that this is an expert-level command and asked to confirm the operation:
[ADMIN] /alpha > rm host6
WARNING: This is an expert-level command:
Incorrect use may cause data corruption
or make the cluster unavailable.
Do you want to continue? (y/n)> y
Switch back to logical mode:
[ADMIN] /alpha > logical
Switch to AUTOMATIC policy mode:
[LOGICAL] /alpha > set policy automatic
policy mode is now AUTOMATIC
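At this point you can optionally confirm that host6 no longer appears in the service by repeating the ls check shown earlier; only host1, host2 and host3 should now be listed:
[LOGICAL] /alpha > ls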
Now that the node has been removed from the active dataservice, the services must be stopped and then removed from the configuration.
Stop the running services:
shell> stopall
Now you must remove the node from the configuration; the exact method depends on which installation method was used with tpm:
If you are using the staging directory method with tpm:
shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-7.1.4-10
shell> echo The staging USER is `tpm query staging| cut -d: -f1 | cut -d@ -f1`
The staging USER is tungsten
shell> echo The staging HOST is `tpm query staging| cut -d: -f1 | cut -d@ -f2`
The staging HOST is db1
shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-7.1.4-10
shell> ssh {STAGING_USER}@{STAGING_HOST}
shell> cd {STAGING_DIRECTORY}
shell> ./tools/tpm configure alpha \
--connectors=host1,host2,host3,host4 \
--members=host1,host2,host3
Run the tpm command to update the software with the Staging-based configuration:
shell> ./tools/tpm update
For information about making updates when using a Staging-method deployment, please see Section 10.3.7, “Configuration Changes from a Staging Directory”.
If you are using the INI file method with tpm:
Remove the INI configuration file:
shell> rm /etc/tungsten/tungsten.ini
Stop the replicator/manager from being started again.
If all of the services on this node (replicator, manager and connector) are being removed, remove the Tungsten Cluster installation entirely:
Remove the startup scripts from your server:
shell> sudo /opt/continuent/tungsten/cluster-home/bin/undeployall
Remove the installation directory:
shell> rm -rf /opt/continuent
If the replicator/manager has been installed on a host but the connector is not being removed, remove the start scripts to prevent the services from being automatically started:
shell> rm /etc/init.d/tmanager
shell> rm /etc/init.d/treplicator
To remove an entire composite datasource (cluster) from an existing deployment there are two primary stages: removing it from the active service, and then removing it from the active configuration.
For example, to remove cluster west from a composite dataservice:
Check the current service state:
shell> cctrl -multi
[LOGICAL] / > ls
+----------------------------------------------------------------------------+
|DATA SERVICES:                                                              |
+----------------------------------------------------------------------------+
east
global
west

[LOGICAL] / > use global
[LOGICAL] /global > ls
COORDINATOR[db1:AUTOMATIC:ONLINE]

DATASOURCES:
+----------------------------------------------------------------------------+
|east(composite master:ONLINE)                                               |
|STATUS [OK] [2017/05/16 01:25:31 PM UTC]                                    |
+----------------------------------------------------------------------------+
+----------------------------------------------------------------------------+
|west(composite slave:ONLINE)                                                |
|STATUS [OK] [2017/05/16 01:25:30 PM UTC]                                    |
+----------------------------------------------------------------------------+
Switch to MAINTENANCE policy mode:
[LOGICAL] /global > set policy maintenance
policy mode is now MAINTENANCE
Remove the composite member cluster from the composite service using the drop command.
[LOGICAL] /global > drop composite datasource west
COMPOSITE DATA SOURCE 'west@global' WAS DROPPED

[LOGICAL] /global > ls
COORDINATOR[db1:AUTOMATIC:ONLINE]

DATASOURCES:
+----------------------------------------------------------------------------+
|east(composite master:ONLINE)                                               |
|STATUS [OK] [2017/05/16 01:25:31 PM UTC]                                    |
+----------------------------------------------------------------------------+

[LOGICAL] /global > cd /
[LOGICAL] / > ls
+----------------------------------------------------------------------------+
|DATA SERVICES:                                                              |
+----------------------------------------------------------------------------+
east
global
If the removed composite datasource still appears in the top-level listing, then you will need to clean up by hand. For example:
[LOGICAL] /global > cd /
[LOGICAL] / > ls
+----------------------------------------------------------------------------+
|DATA SERVICES:                                                              |
+----------------------------------------------------------------------------+
east
global
west
Stop all managers on all nodes at the same time:
[LOGICAL] /global > use west
[LOGICAL] /west > manager * stop
Edit the dataservices.properties file and remove the entry for the dropped cluster:
shell> vim $CONTINUENT_HOME/cluster-home/conf/dataservices.properties
Before:
east=db1,db2,db3
west=db4,db5,db6
After:
east=db1,db2,db3
Start all managers one-by-one, starting with the current Primary:
shell> manager start
Once all managers are running, check the list again:
shell> cctrl -multi
[LOGICAL] / > ls
+----------------------------------------------------------------------------+
|DATA SERVICES:                                                              |
+----------------------------------------------------------------------------+
east
global
Switch to AUTOMATIC policy mode:
[LOGICAL] / > set policy automatic
policy mode is now AUTOMATIC
Now that the cluster has been removed from the composite dataservice, the services on the old nodes must be stopped and then removed from the configuration.
Stop the running services on all nodes in the removed cluster:
shell> stopall
Now you must remove the node from the configuration; the exact method depends on which installation method was used with tpm:
If you are using the staging directory method with tpm:
Change to the staging directory. The current staging directory can be located using tpm query staging:
shell> tpm query staging
tungsten@host1:/home/tungsten/tungsten-clustering-7.1.4-10
shell> cd /home/tungsten/tungsten-clustering-7.1.4-10
Update the configuration, omitting the cluster datasource name from the list of members of the dataservice:
shell> tpm update global --composite-datasources=east
If you are using the INI file method with tpm:
Remove the INI configuration file:
shell> rm /etc/tungsten/tungsten.ini
Stop the replicator/manager from being started again.
If all of the services on this node (replicator, manager and connector) are being removed, remove the Tungsten Cluster installation entirely:
Remove the startup scripts from your server:
shell> sudo /opt/continuent/tungsten/cluster-home/bin/undeployall
Remove the installation directory:
shell> rm -rf /opt/continuent
If the replicator/manager has been installed on a host but the connector is not being removed, remove the start scripts to prevent the services from being automatically started:
shell> rm /etc/init.d/tmanager
shell> rm /etc/init.d/treplicator
Removing a connector involves only stopping the connector and removing the configuration. When the connector is stopped, the manager will automatically remove it from the dataservice. Note that applications that have been configured to talk to the connector must be updated to point to another connector.
For example, to remove host4 from the current dataservice:
Login to the host running the connector.
Stop the connector service:
shell> connector stop
Remove the connector from the configuration; the exact method depends on which installation method was used with tpm:
If you are using the staging directory method with tpm:
Change to the staging directory. The current staging directory can be located using tpm query staging:
shell> tpm query staging
tungsten@host1:/home/tungsten/tungsten-clustering-7.1.4-10
shell> cd /home/tungsten/tungsten-clustering-7.1.4-10
Update the configuration, omitting the host from the list of members of the dataservice:
shell> tpm update alpha \
--connectors=host1,host2,host3 \
--members=host1,host2,host3
If you are using the INI file method with tpm:
Remove the INI configuration file:
shell> rm /etc/tungsten/tungsten.ini
Stop the connector from being started again. If the connector is restarted, it will connect to the previously configured Primary and begin operating again.
If this is a standalone Connector installation, remove the Tungsten Cluster installation entirely:
Remove the startup scripts from your server:
shell> sudo /opt/continuent/tungsten/cluster-home/bin/undeployall
Remove the installation directory:
shell> rm -rf /opt/continuent
If the connector has been installed on a host with replicator and/or managers, remove the start script to prevent the connector from being automatically started:
shell> rm /etc/init.d/tconnector
Tungsten Cluster supports SSL, TLS and certificates for both communication and authentication for all components within the system, and to the underlying databases. This security is enabled by default and includes:
Authentication between command-line tools (cctrl), and between background services.
SSL/TLS between command-line tools and background services.
SSL/TLS between Tungsten Replicator and datasources.