Installation of Hadoop replication consists of multiple stages:
Configure the source and target hosts following the prerequisites outlined in Appendix B, Prerequisites, then follow the appropriate steps for the required extractor topology outlined in Chapter 3, Deploying MySQL Extractors.
Install the Applier replicator, which will apply information to the target Hadoop environment.
Once the installation of the Extractor and Applier components has been completed, materialization of tables and views can be performed.
The applier replicator service reads information from the THL of the source and applies this to a local instance of Hadoop.
Installation must take place on a node within the Hadoop cluster. Writing to a remote HDFS filesystem is not currently supported.
Before installing the applier, the extractor configuration needs the following addition. Apply the parameter shown below, update the extractor, and then install the applier.
For Staging Installs:
shell> cd tungsten-replicator-7.1.4-10
shell> ./tools/tpm configure alpha \
--enable-batch-service=true
shell> ./tools/tpm update
For INI Installs:
Add the following to /etc/tungsten/tungsten.ini:
[alpha]
...Existing Replicator Config...
enable-batch-service=true
shell> tpm update
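Before moving on, it can be worth confirming that the setting really landed in the intended service section of the INI file. The following is a minimal sketch only: ini_has_setting is a hypothetical helper written for this example and is not part of tpm, and it assumes each option sits on its own line as key=value with no spaces around the equals sign.

```shell
# Hypothetical helper (not part of tpm): succeeds if the exact line KEY=VALUE
# appears inside the [SECTION] block of an INI file.
# Assumes one option per line, written as key=value without extra spacing.
ini_has_setting() {
    ini_file=$1
    section=$2
    setting=$3
    awk -v section="[$section]" -v setting="$setting" '
        { sub(/[ \t\r]+$/, "") }                  # ignore trailing whitespace
        /^\[/ { in_section = ($0 == section) }    # track the current section
        in_section && $0 == setting { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$ini_file"
}
```

For example, ini_has_setting /etc/tungsten/tungsten.ini alpha enable-batch-service=true returns success only when the option is present under [alpha].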
The applier can now be configured.
Unpack the Tungsten Replicator distribution in the staging directory:
shell> tar zxf tungsten-replicator-7.1.4-10.tar.gz
Change into the staging directory:
shell> cd tungsten-replicator-7.1.4-10
Configure the installation using tpm:
shell> vi /etc/tungsten/tungsten.ini
[defaults]
user=tungsten
install-directory=/opt/continuent
profile-script=~/.bash_profile
skip-validation-check=HostsFileCheck
skip-validation-check=InstallerMasterSlaveCheck
skip-validation-check=DatasourceDBPort
skip-validation-check=DirectDatasourceDBPort
skip-validation-check=ReplicationServicePipelines
rest-api-admin-user=apiuser
rest-api-admin-pass=secret
[alpha]
master=host1
members=host2
property=replicator.datasource.global.csvType=hive
property=replicator.stage.q-to-dbms.blockCommitInterval=1s
property=replicator.stage.q-to-dbms.blockCommitRowCount=1000
replication-password=secret
replication-user=tungsten
batch-enabled=true
batch-load-language=js
batch-load-template=hadoop
datasource-type=file
If you plan to make full use of the REST API (which is enabled by default) you will need to also configure a username and password for API access. This must be done by specifying the following options in your configuration:
rest-api-admin-user=tungsten
rest-api-admin-pass=secret
Once the prerequisites and the configuration of the installation have been completed, the software can be installed:
shell> ./tools/tpm install
If the installation process fails, check the output of the /tmp/tungsten-configure.log file for more information about the root cause.
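When the log is long, a quick filter can help locate the relevant lines. This is a rough sketch rather than a documented procedure: show_install_errors is a hypothetical helper, and the keywords it searches for are a guess at what a failure entry looks like, not a description of the actual log format.

```shell
# Hypothetical helper: print up to the last 20 lines of a tpm log that mention
# a likely failure keyword, each prefixed with its line number.
# The keyword list is an assumption, not a documented log format.
show_install_errors() {
    log_file=${1:-/tmp/tungsten-configure.log}
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 2
    fi
    grep -inE 'error|exception|fail' "$log_file" | tail -n 20
}
```

Called without arguments it reads /tmp/tungsten-configure.log; pass a different path as the first argument to inspect another file. The line numbers make it easier to jump to the surrounding context in an editor.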
Once the service has been installed it can be monitored using the trepctl command. See Section 4.6.4.4, “Management and Monitoring of Hadoop Deployments” for more information. If there are problems during installation, see Section 4.6.4.5, “Troubleshooting Hadoop Replication”.
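For scripted checks, the state can be pulled out of the trepctl status output. The sketch below assumes the status output contains a line of the form "state : ONLINE" (the field name followed by a colon and the value); replicator_state is a hypothetical helper introduced here for illustration.

```shell
# Hypothetical helper: read `trepctl status` output on stdin and print the
# value of the "state" field (for example ONLINE or OFFLINE:ERROR).
# Assumes the field appears on its own line as "state : VALUE".
# Usage: trepctl status | replicator_state
replicator_state() {
    awk '/^[ \t]*state[ \t]*:/ {
        sub(/^[^:]*:[ \t]*/, "")   # drop the field name and the first colon
        sub(/[ \t\r]+$/, "")       # drop trailing whitespace
        print
        exit
    }'
}
```

Only the text up to the first colon is stripped, so a value that itself contains a colon is printed whole.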
If not already completed, follow the schema generation process described in Section 4.6.2.2, “Schema Generation”. This creates the necessary Hive schema and staging schema definitions.
Once the tables have been created through ddlscan you can query the stage tables:
hive> select * from stage_xxx_movies_large limit 10;
OK
I 10 1 57475 All in the Family 1971 Archie Feels Left Out (#4.17)
I 10 2 57476 All in the Family 1971 Archie Finds a Friend (#6.18)
I 10 3 57477 All in the Family 1971 Archie Gets the Business: Part 1 (#8.1)
I 10 4 57478 All in the Family 1971 Archie Gets the Business: Part 2 (#8.2)
I 10 5 57479 All in the Family 1971 Archie Gives Blood (#1.4)
I 10 6 57480 All in the Family 1971 Archie Goes Too Far (#3.17)
I 10 7 57481 All in the Family 1971 Archie in the Cellar (#4.10)
I 10 8 57482 All in the Family 1971 Archie in the Hospital (#3.15)
I 10 9 57483 All in the Family 1971 Archie in the Lock-Up (#2.3)
I 10 10 57484 All in the Family 1971 Archie Is Branded (#3.20)