5.1.1. Preparing for Cassandra Replication

Preparing the hosts for the replication process requires setting some key configuration parameters within the MySQL server to ensure that data is stored and written correctly. On the Cassandra side, the database and schema must be created using the existing schema definition so that the databases and tables exist within Cassandra.

Configure the source and target hosts following the prerequisites outlined in Appendix B, Prerequisites then follow the appropriate steps for the required extractor topology outlined in Chapter 3, Deploying MySQL Extractors for MySQL extraction, or Chapter 4, Deploying Oracle Extractors for Oracle CDC extraction.

Cassandra Host

On the Cassandra host, minimal configuration is required. Since there is often no user or authentication support, the main requirements are that Cassandra has been installd, and the cqlsh command is available.

The Cassandra applier also makes use of a Ruby component, but since Ruby is a requirement for all Tungsten Clustering installations, Ruby should already exist.

  • Download and install the Datastax Cassandra Ruby driver. Ruby Gems can be used for this:

    shell> gem install cassandra-driver
  • You need to create tables within Cassandra according to the databases and tables that need to be replicated; the tables are not automatically created for you. From a Tungsten Clustering deployment directory, the ddlscan command can be used to identify the existing tables, and create table definitions for use within Cassandra.

    To use ddlscan, the template for Cassandra must be specified, along with the user/password information to connect to the source database to collect the schema definitions. The tool should be run from the templates directory.

    The tool will need to be executed twice, the first time generates the live table definitions:

    shell> cd tungsten-replicator-5.3.6-24
    shell> cd tungsten-replicator/support/ddlscan/
    shell> ddlscan -user tungsten -url 'jdbc:mysql:thin://host1:13306/test' -pass password \
        -template ddl-mysql-cassandra.vm -db test
    SQL generated on Wed Jul 12 22:37:41 BST 2017 by ./ddlscan utility of Tungsten
    url = jdbc:mysql:thin://ubuntuheterosrc:3306/test
    user = root
    dbName = test
    CREATE KEYSPACE IF NOT EXISTS "test" WITH REPLICATION = {       'class' : 'SimpleStrategy', 'replication_factor' : 3 } ;
    DROP TABLE "test"."all_mysql_types";
    CREATE TABLE "test"."all_mysql_types"
      "my_id" INT ,
      "my_bit" BOOLEAN /* BIT(1) */ ,
      "my_tinyint" INT ,
      "my_boolean" INT ,
      "my_smallint" INT ,
      "my_mediumint" INT ,
      "my_int" INT ,
      "my_bigint" BIGINT ,
      "my_decimal_10_5" DECIMAL ,
      "my_float" FLOAT ,
      "my_double" DOUBLE ,
      "my_date" DATE ,
      "my_datetime" TIMESTAMP ,
      "my_timestamp" TIMESTAMP ,
      "my_time" TIME ,
      "my_year" DECIMAL ,
      "my_char_10" TEXT ,
      "my_varchar_10" TEXT ,
      "my_tinytext" TEXT ,
      "my_text" TEXT ,
      "my_mediumtext" TEXT ,
      "my_longtext" TEXT ,
      "my_enum_abc" TEXT ,
      "my_set_def" TEXT  
          , primary key ("my_id" ))

    The output should be redirected to a file and then used to create tables within Cassandra:

    shell> ddlscan -user tungsten -url 'jdbc:mysql:thin://host1:13306/test' -pass password \
        -template ddl-mysql-cassandra.vm -db test >test.ddl

    The output of the command should be checked to ensure that the table definitions are correct.

    The file can then be applied to Cassandra:

    shell> cat test.ddl | cqlsh

    This generates the table definitions for live data. The process should be repeated to create the table definitions for the staging data by using te staging template:

    shell> ddlscan -user tungsten -url 'jdbc:mysql:thin://host1:13306/test' -pass password \
        -template ddl-mysql-cassandra-staging.vm -db test >test.ssl-staging

    Then applied to Cassandra:

    shell> cat test.ddl-staging | cqlsh

    The process should be repeated for each database that will be replicated.

Once the preparation of the MySQL and Cassandra databases are ready, you can proceed to installing Tungsten Clustering