6.1.4.1. Setting up the Source Oracle Environment

The primary stage in configuring Oracle to MySQL replication is to configure the Oracle environment and databases ready for use as a data source by the Tungsten Replicator. A script, setupCDC.sh automates some of the processes behind the initial configuration and is responsible for creating the required Change Data Capture tables that will be used to capture the data change information.

Before running setupCDC.sh, the following steps must be completed.

  • Ensure that Oracle is configured to accept dates in YYYY-MM-DD format used by Tungsten Replicator:

    
    SQL> ALTER SYSTEM SET NLS_DATE_FORMAT='YYYY-MM-DD' SCOPE=SPFILE;

    Then restart the database for the change to take effect:

    
    SQL> shutdown immediate
    SQL> startup
  • Create the source user and schema if it does not already exist.

If the installation process fails, check the output of the /tmp/tungsten-configure.log file for more information about the root cause.

Once these steps have been completed, a configuration file must be created that defines the CDC configuration.

Table 6.3. setupCDC.conf Configuration File Parameters

Variable Sample Value Description
service   The name of the service that will be used to process these events. It should match the name of the schema from which data is being read. The name should also match the name of the service that will be created using Tungsten Replicator to extract events from Oracle.
sys_user SYSDBA The name of the SYSDBA user configured. The default (if not specified) is SYSDBA.
sys_pass   The password of the SYSDBA user; you will be prompted for this information if it has not been added.
enable_archivelog 0 If set to true, the Oracle instance will be configured to enable archive logging (required by Oracle CDC), and then the Oracle instance will be restarted.
source_user   The name of the source schema user that will be used to identify the tables used to build the publish tables. This user is created by the setupCDC.sh script.
pub_tablespace 0 By default, the system tablespace is used for holding the publisher tables. Using the system tablespace should only be used during testing, as the tablespace is typically not large enough to hold the required change data. If set to 1, use the created tablespace (matching the value of pub_user) which is assumed to be large enough to hold the change information.
pub_user   The publisher user that will be created to publish the CDC views.
pub_password   The publisher password that will be used when the publisher user is created.
tungsten_user tungsten The subscriber user that will be created to access the CDC views. This will be used as the datasource username within the Tungsten Replicator configuration.
tungsten_pwd password The subscriber password that will be created to access the CDC. This will be used as the datasource username within the Tungsten Replicator configuration. views.
delete_publisher   If set to 1, the publisher user will be deleted before being recreated.
delete_subscriber   If set to 1, the subscriber user will be deleted before being recreated.
cdc_type SYNC_SOURCE Specifies the CDC extraction type to be deployed. Using SYNC_SOURCE uses synchronous capture; HOTLOG_SOURCE uses asynchronous capture.
specific_tables   If set to 1, limits the replication to only use the tables listed in a tungsten.tables file. If set to 0, no file is used and all tables are included.
specific_path   The path of the tungsten.tables file. When using Oracle RAC, the location of the tungsten.tables file must be in a shared location accessible by Oracle RAC. If not specified, the current directory is used.

A sample configuration file is provided in tungsten-replicator/scripts/setupCDC.conf within the distribution directory.

To configure the CDC configuration:

  1. For example, the following configuration would setup CDC for replication from the sales schema (comment lines have been removed for clarity):

    service=SALES
    sys_user=sys
    sys_pass=oracle
    enable_archive_log=0
    export source_user=sales
    pub_tablespace=0
    pub_user=${source_user}_pub
    pub_password=password
    tungsten_user=tungsten
    tungsten_pwd=password
    delete_publisher=0
    delete_subscriber=0
    cdc_type=HOTLOG_SOURCE
    specific_tables=0
    specific_path=
  2. Before running setupCDC.sh, and if you have set the pub_tablespace variable to 1, you must create the tablespace that will be used to hold the CDC data. This needs to be created only once:

    
    shell> sqlplus sys/oracle as sysdba
    SQL> CREATE TABLESPACE "SALES_PUB" DATAFILE '/oracle/SALES_PUB' SIZE 10485760 AUTOEXTEND ON NEXT 
       1048576 MAXSIZE 32767M NOLOGGING ONLINE PERMANENT BLOCKSIZE 8192 EXTENT MANAGEMENT LOCAL AUTOALLOCATE
       DEFAULT NOCOMPRESS SEGMENT SPACE MANAGEMENT AUTO;

    The above SQL statement is all one statement. The tablespace name and data file locations should be modified according to the pub_user values used in the configuration file. Note that the directory specified for the data file must exist, and must be writable by Oracle.

  3. Once the configuration file has been created, run setupCDC.sh with the configuration file (it defaults to setupCDC.conf). The command must be executed within the tungsten-replicator/scripts within the distribution (or installation) directory, as it relies on SQL scripts in that directory to operate:

    shell> cd tungsten-replicator-5.0.0-0/tungsten-replicator/scripts
    shell> ./setupCDC.sh custom.conf
    Using configuration custom.conf
    Configuring CDC for service 'SALES' for Oracle 11. Change Set is 'TUNGSTEN_CS_SALES'
    Removing old CDC installation if any (SYSDBA)
    Done.
    Setup tungsten_load (SYSDBA)
    Done.
    Creating publisher/subscriber and preparing table instantiation (SYSDBA)
    Done.
    Setting up HOTLOG_SOURCE (SALES_PUB)
    Oracle version : 11.2.0.2.0
    Setting Up Asynchronous Data Capture TUNGSTEN_CS_SALES
    Processing SALES.SAMPLE -> 'CT_SAMPLE' : OK
    Enabling change set : TUNGSTEN_CS_SALES
    Dropping view TUNGSTEN_PUBLISHED_COLUMNS
    Dropping view TUNGSTEN_SOURCE_TABLES
    
    PL/SQL procedure successfully completed.
    
    Done.
    adding synonym if needed (tungsten)
    Done.
    Cleaning up (SYSDBA)
    Done.
    Capture started at position 16610205

    The script will report the current CDC archive log position where extraction will start.

    If there are error, the problem with the script and setup will be reported. The problem should be corrected, and the script executed again until it completes successfully.