4.2.3. Install Amazon Redshift Applier


Replication into Redshift requires two separate replicator installations, one that extracts information from the source database, and a second that generates the CSV files, loads those files into S3 and then executes the statements on the Redshift database to import the CSV data and apply the transformations to build the final tables.

The two replication services can operate on the same machine (see Section 5.2, “Deploying Multiple Replicators on a Single Host”), or they can be installed on two different machines.

Once you have completed the configuration of the Amazon Redshift database, you can configure and install the applier using the steps below.

Before installing the applier, one additional parameter must be added to the extractor configuration. Add the following to /etc/tungsten/tungsten.ini, then run tpm update:
```
[alpha]
...Existing Replicator Config...
enable-heterogeneous-service=true

shell> tpm update
```
Note
The above step is only applicable to standalone extractors. If you are configuring replication from an existing Tungsten Cluster (Cluster-Extractor), follow the steps in Section 3.4.1, “Prepare: Replicating Data Out of a Cluster” to ensure the cluster is configured correctly.
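Before moving on, it can help to confirm that the setting is present in the intended service section. The following is a minimal sketch, not part of the Tungsten tooling: `has_heterogeneous_service` is a hypothetical helper name, and it assumes the INI file uses the `[service]` section layout shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Tungsten command): succeeds if the given INI file
# sets enable-heterogeneous-service=true inside the named service section.
has_heterogeneous_service() {
    local ini_file="$1" service="$2"
    awk -v section="[$service]" '
        # A line starting with "[" opens a new section; remember if it is ours.
        /^\[/ { in_section = ($0 == section) }
        in_section && /^[ \t]*enable-heterogeneous-service[ \t]*=[ \t]*true[ \t]*$/ { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$ini_file"
}
```

For example, `has_heterogeneous_service /etc/tungsten/tungsten.ini alpha && echo "extractor ready"` prints the message only when the parameter is set for the alpha service.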
The applier can now be configured. Unpack the Tungsten Replicator distribution in the staging directory:
```
shell> tar zxf tungsten-replicator-5.4.1-41.tar.gz
```
Change into the staging directory:
```
shell> cd tungsten-replicator-5.4.1-41
```
Configure the installation using tpm:
For a staging-based configuration:
```
shell> ./tools/tpm configure defaults \
    --reset \
    --user=tungsten \
    --install-directory=/opt/continuent \
    --profile-script=~/.bash_profile \
    --rest-api-admin-user=apiuser \
    --rest-api-admin-pass=secret

shell> ./tools/tpm configure alpha \
    --topology=master-slave \
    --master=sourcehost \
    --members=localhost \
    --datasource-type=redshift \
    --replication-host=redshift.us-east-1.redshift.amazonaws.com \
    --replication-user=awsRedshiftUser \
    --replication-password=awsRedshiftPass \
    --redshift-dbname=dev \
    --batch-enabled=true \
    --batch-load-template=redshift \
    --svc-applier-filters=dropstatementdata \
    --svc-applier-block-commit-interval=30s \
    --svc-applier-block-commit-size=250000
```
For an INI-based configuration, edit /etc/tungsten/tungsten.ini instead:
```
shell> vi /etc/tungsten/tungsten.ini
```
```
[defaults]
user=tungsten
install-directory=/opt/continuent
profile-script=~/.bash_profile
rest-api-admin-user=apiuser
rest-api-admin-pass=secret

[alpha]
topology=master-slave
master=sourcehost
members=localhost
datasource-type=redshift
replication-host=redshift.us-east-1.redshift.amazonaws.com
replication-user=awsRedshiftUser
replication-password=awsRedshiftPass
redshift-dbname=dev
batch-enabled=true
batch-load-template=redshift
svc-applier-filters=dropstatementdata
svc-applier-block-commit-interval=30s
svc-applier-block-commit-size=250000
```
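Comparing the two listings above, each staging flag appears in the INI file as the same `key=value` pair without the leading `--`, while `--reset` does not appear in the INI listing. A small sketch of that mapping follows; `staging_to_ini` is a hypothetical helper name and is not part of tpm.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of tpm): print INI-style lines for a list of
# staging-style flags, one flag per argument.
staging_to_ini() {
    local flag
    for flag in "$@"; do
        case "$flag" in
            --reset) ;;                              # absent from the INI listing, so skipped
            --*)     printf '%s\n' "${flag#--}" ;;   # drop the leading "--"
        esac
    done
}
```

Calling `staging_to_ini --reset --user=tungsten --install-directory=/opt/continuent` prints the two `key=value` lines that belong under `[defaults]`; the section header itself still has to be written by hand.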
Configuration group defaults
The description of each of the options is shown below:
- --reset
  reset
  For staging configurations, deletes all pre-existing configuration information before updating with the new configuration values.
- --user=tungsten
  user=tungsten
  System User
- --install-directory=/opt/continuent
  install-directory=/opt/continuent
  Path to the directory where the active deployment will be installed. The configured directory will contain the software, THL and relay log information unless configured otherwise.
- --profile-script=~/.bash_profile
  profile-script=~/.bash_profile
  Append commands to include env.sh in this profile script
- --rest-api-admin-user=apiuser
  rest-api-admin-user=apiuser
  Optional: Must be specified along with rest-api-admin-pass if you wish to access the full API features and use the Dashboard GUI for cluster installations.
- --rest-api-admin-pass=secret
  rest-api-admin-pass=secret
  Optional: Must be specified along with rest-api-admin-user if you wish to access the full API features.
Configuration group alpha
The description of each of the options is shown below:
- --topology=master-slave
  topology=master-slave
  Replication topology for the dataservice.
- --master=sourcehost
  master=sourcehost
  The hostname of the primary (extractor) within the current service.
- --members=localhost
  members=localhost
  Hostnames for the dataservice members
- --datasource-type=redshift
  datasource-type=redshift
  Database type
- --replication-host=redshift.us-east-1.redshift.amazonaws.com
  replication-host=redshift.us-east-1.redshift.amazonaws.com
  Hostname of the datasource where the database is located. If the specified hostname matches the current host or member name, the database is assumed to be local. If the hostnames do not match, extraction is assumed to be via remote access. For MySQL hosts, this configures a remote replication Replica (relay) connection.
- --replication-user=awsRedshiftUser
  replication-user=awsRedshiftUser
  For databases that require authentication, the username to use when connecting to the database using the corresponding connection method (native, JDBC, etc.).
- --replication-password=awsRedshiftPass
  replication-password=awsRedshiftPass
  The password to be used when connecting to the database using the corresponding --replication-user.
- --redshift-dbname=dev
  redshift-dbname=dev
  Name of the Redshift database to replicate into
- --batch-enabled=true
  batch-enabled=true
  Should the replicator service use a batch applier
- --batch-load-template=redshift
  batch-load-template=redshift
  Value for the loadBatchTemplate property
- --svc-applier-filters=dropstatementdata
  svc-applier-filters=dropstatementdata
  Replication service applier filters
- --svc-applier-block-commit-interval=30s
  svc-applier-block-commit-interval=30s
  Minimum interval between commits
- --svc-applier-block-commit-size=250000
  svc-applier-block-commit-size=250000
  Applier block commit size (min 1)

If your MySQL source is a Tungsten Cluster, ensure the additional steps below are also included in your applier configuration.

First, prepare the required filter configuration file as follows on the Redshift applier host(s) only:

```
shell> mkdir -p /opt/continuent/share/
shell> cp tungsten-replicator/support/filters-config/convertstringfrommysql.json /opt/continuent/share/
```

Then, include the following parameters in the configuration:

```
property=replicator.stage.remote-to-thl.filters=convertstringfrommysql
property=replicator.filter.convertstringfrommysql.definitionsFile=/opt/continuent/share/convertstringfrommysql.json
```
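Because the second property points at a file that has to be copied by hand, a quick consistency check can catch a path mismatch before installation. The sketch below is an illustration only: `missing_definitions_files` is a hypothetical helper name, and it assumes the properties are written in the INI file exactly in the `property=replicator.filter.<name>.definitionsFile=<path>` form shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Tungsten command): print every definitionsFile
# path referenced in the INI file that does not exist on this host.
missing_definitions_files() {
    local ini_file="$1" line path
    grep -E '^property=replicator\.filter\.[A-Za-z0-9_]+\.definitionsFile=' "$ini_file" |
    while IFS= read -r line; do
        path="${line##*definitionsFile=}"        # keep only the text after "definitionsFile="
        [ -f "$path" ] || printf '%s\n' "$path"
    done
}
```

Running `missing_definitions_files /etc/tungsten/tungsten.ini` prints nothing when every referenced file is in place.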

Once the prerequisites are in place and the installation has been configured, the software can be installed:
```
shell> ./tools/tpm install
```

If the installation process fails, check the /tmp/tungsten-configure.log file for more information about the root cause.
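The log can be long, so filtering it is often the quickest way to find the failure. The following sketch uses only standard grep and tail; `show_install_errors` is a hypothetical helper name, and the keywords it searches for (error, exception, fail) are an assumption about how problems are worded in the log, not a documented format.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the last matching lines (with line numbers) from
# the installation log. The search keywords are an assumption.
show_install_errors() {
    local log_file="${1:-/tmp/tungsten-configure.log}" count="${2:-20}"
    grep -inE 'error|exception|fail' "$log_file" | tail -n "$count"
}
```

With no arguments the function reads /tmp/tungsten-configure.log and shows up to 20 matches; the line numbers make it easy to open the full log at the right place.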

On the host that is loading data into Redshift, create the s3-config-servicename.json file and then copy that file into the share directory within the installed directory on that host. For example:

```
shell> cp s3-config-servicename.json /opt/continuent/share/
```
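The copy step can be wrapped in a small guard so that a missing or empty file is caught before the replicator starts. This is a sketch under stated assumptions: `install_s3_config` is a hypothetical helper name, and it assumes that `servicename` in the file name stands for the replication service name (for example `alpha`), which is not spelled out in this section.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy s3-config-<service>.json into the share directory,
# refusing to continue if the source file is missing or empty.
# Assumption: "servicename" in the file name is the replication service name.
install_s3_config() {
    local service="$1" src_dir="${2:-.}" share_dir="${3:-/opt/continuent/share}"
    local file="s3-config-${service}.json"
    if [ ! -s "${src_dir}/${file}" ]; then
        echo "missing or empty: ${src_dir}/${file}" >&2
        return 1
    fi
    mkdir -p "$share_dir" && cp "${src_dir}/${file}" "${share_dir}/"
}
```

For example, `install_s3_config alpha` copies ./s3-config-alpha.json into /opt/continuent/share/.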

Now the services can be started:

```
shell> replicator start
```

Once the service is configured and running, the service can be monitored as normal using the trepctl command. See Section 4.2.6, “Management and Monitoring of Amazon Redshift Deployments” for more information.
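As a quick first check after startup, the state reported by trepctl status can be extracted in a script. The sketch below assumes the output contains a line of the form `state : ONLINE` (field name, colon, value); `replicator_state` is a hypothetical helper name that only parses text on standard input and does not call trepctl itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `trepctl status`-style text on stdin and print the
# value of the "state" field. The "name : value" layout is an assumption.
replicator_state() {
    awk -F':' '
        $1 ~ /^[ \t]*state[ \t]*$/ {
            gsub(/^[ \t]+|[ \t]+$/, "", $2)   # trim whitespace around the value
            print $2
            exit
        }
    '
}
```

A typical use would be `trepctl status | replicator_state`, which prints the current state of the service, such as ONLINE.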
