The following Staging-method procedure will install the Tungsten Replicator software onto target node host6, extracting from a cluster consisting of three (3) nodes (host1, host2 and host3) and applying into the target datawarehouse via host6.
If you are replicating to a MySQL-specific target, please see Section 3.9, “Replicating Data Out of a Cluster” for more information.
On your staging server, go to the software directory.
shell> cd /opt/continuent/software
Download the latest Tungsten Replicator version.
Unpack the release package:
shell> tar xvzf tungsten-replicator-7.0.3-141.tar.gz
Change to the unpacked directory:
shell> cd tungsten-replicator-7.0.3-141
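As a quick sanity check before configuring, the sketch below derives the unpacked directory name from the package file name and confirms that tpm is where the following steps expect it. The variable names `pkg` and `dir` are illustrative only, and the check assumes it is run from the software directory that contains the unpacked package.

```shell
# Package name as used in the unpack step above.
pkg="tungsten-replicator-7.0.3-141.tar.gz"

# Strip the .tar.gz suffix to get the directory created by tar.
dir="${pkg%.tar.gz}"
echo "Unpacked directory: $dir"

# Confirm the tpm tool exists and is executable before continuing.
if [ -x "$dir/tools/tpm" ]; then
  echo "tpm found in $dir/tools"
else
  echo "tpm not found in $dir/tools; check that the package was unpacked here"
fi
```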
Execute the tpm command to configure defaults for the installation.
shell> ./tools/tpm configure defaults \
--install-directory=/opt/replicator \
'--profile-script=~/.bashrc' \
--replication-password=secret \
--replication-port=13306 \
--replication-user=tungsten \
--start-and-report=true \
--mysql-allow-intensive-checks=true \
--user=tungsten
The description of each of the options is shown below:
This runs the tpm command. configure defaults indicates that we are setting options which will apply to all dataservices.
--install-directory=/opt/replicator
The installation directory of the Tungsten service. This is where the service will be installed on each server in your dataservice.
The profile script used when your shell starts. Using this line modifies your profile script to add a path to the Tungsten tools so that the tools for managing Tungsten Cluster™ are easier to use.
The operating system user name that you have created for the Tungsten service, tungsten.
The user name that will be used to apply replication changes to the database on Replicas.
--replication-password=password
The password that will be used to apply replication changes to the database on Replicas.
Set the port number to use when connecting to the MySQL server.
Tells tpm to start up the service, and report the current configuration and status.
Configure a cluster alias that points to the Primaries and Replicas within the current Tungsten Cluster service that you are replicating from:
shell> ./tools/tpm configure alpha \
--master=host1 \
--slaves=host2,host3 \
--thl-port=2112 \
--topology=cluster-alias
The description of each of the options is shown below:
This runs the tpm command. configure indicates that we are creating a new dataservice, and alpha is the name of the dataservice being created.
This definition is for a dataservice alias, not an actual dataservice, because --topology=cluster-alias has been specified. This alias is used in the cluster-slave section to define the source hosts for replication.
Specifies the hostname of the default Primary in the cluster.
Specifies the name of any other servers in the cluster that may be replicated from.
The THL port for the cluster. The default value is 2112; any other value must be specified explicitly.
Define this as a cluster dataservice alias so tpm does not try to install cluster software to the hosts.
This dataservice cluster-alias name MUST be the same as the cluster dataservice name that you are replicating from.
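Because the Cluster-Extractor reads THL from the cluster nodes over the port given in the alias, it can help to confirm that the port is reachable from the target host before installing. The sketch below is one way to do that. `port_open` and `check_thl_ports` are hypothetical helpers, not part of Tungsten, and they assume that `bash` and the coreutils `timeout` command are available.

```shell
# Hypothetical helper: returns 0 if a TCP connection to host:port
# succeeds within 3 seconds. Uses bash's /dev/tcp pseudo-device.
port_open() {
  # $1 = host, $2 = port
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Hypothetical helper: prints one status line per host.
check_thl_ports() {
  # $1 = port, remaining arguments = host names
  port="$1"
  shift
  for host in "$@"; do
    if port_open "$host" "$port"; then
      echo "$host:$port reachable"
    else
      echo "$host:$port NOT reachable"
    fi
  done
}

# Example, using the hosts and THL port from the alias above:
# check_thl_ports 2112 host1 host2 host3
```

Run the check from host6, since that is the node that will connect to the cluster.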
On the Cluster-Extractor node, copy the convertstringfrommysql.json filter configuration sample file into the /opt/replicator/share directory, then edit it to suit:
shell> cp /opt/replicator/tungsten/tungsten-replicator/support/filters-config/convertstringfrommysql.json /opt/replicator/share/
shell> vi /opt/replicator/share/convertstringfrommysql.json
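After hand-editing, a syntax check can catch a stray comma or missing brace before the replicator tries to load the file. The sketch below assumes the filter configuration is strict JSON and that `python3` is installed on the node. `validate_json` is a hypothetical helper name.

```shell
# Hypothetical helper: returns 0 if the file parses as JSON.
validate_json() {
  # $1 = path to the JSON file
  python3 -m json.tool "$1" >/dev/null 2>&1
}

# Path used in the copy step above.
file=/opt/replicator/share/convertstringfrommysql.json

# Only report when the file is present on this host.
if [ -f "$file" ]; then
  if validate_json "$file"; then
    echo "$file: valid JSON"
  else
    echo "$file: JSON syntax error, re-check your edits"
  fi
fi
```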
Once the convertstringfrommysql JSON configuration file has been edited, update the /etc/tungsten/tungsten.ini file to add and configure any additional options needed for the specific datawarehouse you are using.
Create the configuration that will replicate from cluster dataservice alpha into the database on the host specified by --relay=host6:
shell> ./tools/tpm configure omega \
--relay=host6 \
--relay-source=alpha \
--repl-svc-remote-filters=convertstringfrommysql \
--property=replicator.filter.convertstringfrommysql.definitionsFile=/opt/replicator/share/convertstringfrommysql.json \
--topology=cluster-slave
The description of each of the options is shown below:
This runs the tpm command. configure indicates that we are creating a new replication service, and omega is the unique service name for the replication stream from the cluster.
Specifies the hostname of the destination database into which data will be replicated.
Specifies the name of the source cluster dataservice alias (defined above) that will be used to read events to be replicated.
Read source replication data from any host in the alpha dataservice.
Now finish configuring the omega dataservice with the options specific to the datawarehouse target in use.
AWS RedShift Target
shell> ./tools/tpm configure omega \
--batch-enabled=true \
--batch-load-template=redshift \
--enable-heterogeneous-slave=true \
--datasource-type=redshift \
--replication-host=REDSHIFT_ENDPOINT_FQDN_HERE \
--replication-user=REDSHIFT_USER_HERE \
--replication-password=REDSHIFT_PASSWORD_HERE \
--redshift-dbname=REDSHIFT_DB_NAME_HERE \
--svc-applier-filters=dropstatementdata \
--svc-applier-block-commit-interval=10s \
--svc-applier-block-commit-size=5
The description of each of the options is shown below:
Configures default options that will be configured for all future services.
Configure the topology as a cluster-slave. This will configure the individual replicator as an Extractor of all the nodes in the cluster, as defined in the previous configuration of the cluster topology.
Configure the node as the relay for the cluster which will replicate data into the datawarehouse.
Configures the Extractor to correctly process the incoming data so that it can be written to the datawarehouse. This includes correcting the processing of text data types and configuring the appropriate filters.
The target host for writing data. In the case of Redshift, this is the fully qualified hostname of the Redshift host.
The user within the Redshift service that will be used to write data into the database.
--replication-password=password
The password for the user within the Redshift service that will be used to write data into the database.
Set the datasource type to be used when storing information about the replication state.
Enable the batch service. This configures the JavaScript batch engine and CSV writing semantics to generate the data to be applied into a datawarehouse.
--batch-load-template=redshift
The batch load template to be used. Since we are replicating into Redshift, the redshift template is used.
The name of the database within the Redshift service where the data will be written.
Please see Install Amazon Redshift Applier for more information.
Vertica Target
shell> ./tools/tpm configure omega \
--batch-enabled=true \
--batch-load-template=vertica6 \
--batch-load-language=js \
--datasource-type=vertica \
--disable-relay-logs=true \
--enable-heterogeneous-service=true \
--replication-user=dbadmin \
--replication-password=VERTICA_DB_PASSWORD_HERE \
--replication-host=VERTICA_HOST_NAME_HERE \
--replication-port=5433 \
--svc-applier-block-commit-interval=5s \
--svc-applier-block-commit-size=500 \
--vertica-dbname=VERTICA_DB_NAME_HERE
Please see Install Vertica Applier for more information.
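The target examples above use placeholder values ending in `_HERE`, which are easy to leave behind by accident. The sketch below scans text for any such tokens. `find_placeholders` is a hypothetical helper, and feeding it the output of `tpm reverse` assumes that command prints the stored configuration as tpm options in your release.

```shell
# Hypothetical helper: reads text on stdin, prints every token that
# ends in _HERE, and returns 0 only if at least one was found.
find_placeholders() {
  grep -Eo '[A-Z_]+_HERE'
}

# From the staging directory, scan the stored configuration.
# The guard keeps this a no-op when run outside the staging directory.
if [ -x ./tools/tpm ]; then
  if ./tools/tpm reverse | find_placeholders; then
    echo "Replace the placeholders listed above before installing"
  else
    echo "No placeholders found"
  fi
fi
```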
For additional targets, please see the full list at Deploying Appliers.
Once the configuration has been completed, you can perform the installation to set up the Tungsten Replicator services using the tpm command run from the staging directory:
shell> ./tools/tpm install
If the installation process fails, check the output of the /tmp/tungsten-configure.log file for more information about the root cause.
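The log can be long, so filtering it is often quicker than reading it top to bottom. The sketch below prints the most recent matching lines with their line numbers. `show_install_errors` is a hypothetical helper, and the assumption that failures are tagged with the string `ERROR` may not hold for every message, so widen the pattern if nothing turns up.

```shell
# Hypothetical helper: prints the last 20 lines containing ERROR,
# prefixed with their line numbers in the log.
show_install_errors() {
  # $1 = log file; defaults to the path named above
  log="${1:-/tmp/tungsten-configure.log}"
  if [ ! -r "$log" ]; then
    echo "cannot read $log" >&2
    return 1
  fi
  grep -n 'ERROR' "$log" | tail -n 20
}

# Example:
# show_install_errors
```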
The Cluster-Extractor replicator should now be installed and ready to use.
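To confirm that the new service is running, you can check its state with trepctl on host6. The sketch below wraps that check. `is_online` is a hypothetical helper that looks for a `state : ONLINE` line in the status output, and it assumes trepctl is on the PATH (for example through the profile script configured earlier).

```shell
# Hypothetical helper: reads trepctl status output on stdin and
# returns 0 only if the state line reports ONLINE.
is_online() {
  grep -Eq '^state[[:space:]]*:[[:space:]]*ONLINE[[:space:]]*$'
}

# Check the omega service configured above, if trepctl is available.
if command -v trepctl >/dev/null 2>&1; then
  if trepctl -service omega status | is_online; then
    echo "omega is ONLINE"
  else
    echo "omega is not ONLINE; run 'trepctl -service omega status' for details"
  fi
fi
```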