7.10.2. Seeding Data for Heterogeneous Replication

7.10.2. Seeding Data for Heterogeneous Replication
Prev	^Up	7.10. Migrating and Seeding Data	Next

7.10.2. Seeding Data for Heterogeneous Replication

7.10.2.1. Seeding Data from a Standalone Source
7.10.2.2. Seeding Data from a Cluster, for a Cluster-Extractor Target

Seeding data for heterogeneous targets is a complex process that can have many challenges that all depend on the target and the amount of data requiring seeding.

The steps outlined below come with their own challenges that may prove not to be suitable in your own environment. Therefore, any pre-seeding process needs to be fully understood and evaluated

The following requirements are needed for this process:

A temporary host meeting all required pre-requisites, with an empty MySQL instance matching the same version as the original source, and with the default storage engine set to BLACKHOLE
Enough disk space on the temporary host to hold binary logs and THL logs equivalent to the amount of data being seeded
The ability to extract the data from the source, using mysqldump, or via reverse engineering SQL Statements

The process in summary is as follows:

Configure replicator to extract from the temporary instance
Extract data from the source capturing the binlog position appropriate to the export
Import the data into the temporary instance and allow the replicator to load the target
Re-Configure the replicator to extract from the original source, positioned to start from the co-ordinates noted during the export

7.10.2.1. Seeding Data from a Standalone Source

Step 1: Configure the Temporary Instance

Build an empty MySQL Instance providing that all the pre-requisites are in place. These are outline in Appendix B, Prerequisites
To ensure compatibility, you need to make sure that the version of MySQL used matches the version of MySQL running on the main source.
Once the instance is running, you need to pre-create all of the tables that you will be loading. This must be done manually as you need to ensure that each table is created with the ENGINE=BLACKHOLE option
The use of the BLACKHOLE engine will mean that the data doesn't actually get stored when written to the database, however binary logs are generated and it is these that the replicator requires

Step 2: Configure the Replicator

Follow the steps outlined in Section 3.2, “Deploying a Primary/Replica Topology” to configure an extractor against the Temporary host.
Ensure that the configuration includes the following entries to enable heterogeneous replication, and ensure you qualify the objects (schemas and/or tables) that you want to seed
```
enable-heterogeneous-master=true
svc-extractor-filters=replicate
property=replicator.filter.replicate.do=schema.table
```
Once installed, start the replicator

Step 3: Build the Target Schema

Depending on the target, you may need to pre-create the final objects in your target environment. Use ddlscan to do this now if this is required

Step 4: Configure the Applier

Ensure the applier host meets all the required pre-requisites and then configure the applier appropriate to the target you are applying to.
Follow the appropriate steps in Chapter 4, Deploying Appliers to configure a standalone applier, ensuring that it is configured to connect to the temporary extractor installed in Step 2
Once configured, start the applier. At this stage you should not see any replication traffic since the temporary host will have no data written to it

Step 5: Export the data from the source

We now need to export the data from the source.
Using mysqldump we need to ensure we capture the binlog position and we also need to ensure that the export does NOT contain DB or TABLE DDL
The following example can be used as a template. In this example we are exporting an entire schema. Use the appropriate options if you require only specific tables
```
mysqldump -u root -psecret -B hr --no-create-db --no-create-info --master-data=2 >dump.sql
```

Step 6: Import the Data

Now that we have the data we can import this into the Temporary Instance
As you load the data, you can monitor replication and you should see the data loading into your target envrionment.
Providing you created the tables correctly with the BLACKHOLE engine, you should see that a select count on the tables in the temporary instance should return a row count of zero.
When the load has finished and the applier has completed replication, stop both replicators using the following command:
copy
```
shell> replicator stop
```
We have now finished with the temporary MySQL instance

Step 7: Install Extractor from Main Source Host

Follow the steps outlined in Section 3.2, “Deploying a Primary/Replica Topology” to configure an extractor against the Source host.
Ensure that the configuration includes the following entries to enable heterogeneous replication, and ensure you qualify the objects (schemas and/or tables) as required, for example
```
enable-heterogeneous-master=true
svc-extractor-filters=replicate
property=replicator.filter.replicate.do=schema.table
```
Once installed, start the replicator in an offline state
copy
```
shell> replicator start offline
```

Step 8: Reconfigure the Applier

We now need to reconfigure the applier, but first we need to uninstall the software to ensure we have a clean build and any THL from the pre-load has been cleared
copy
```
shell> cd /opt/continuent/software/tungsten-replicator-7.1.4-10
shell> tools/tpm uninstall --i-am-sure
```
Place the correct configuration into the /etc/tungsten/tungsten.ini file ensuring that start-and-report=false is set and that the applier is now configured to point to the main extractor

Install the software

copy

shell> cd /opt/continuent/software/tungsten-replicator-7.1.4-10
shell> tools/tpm install

Finally start the repicator in an offline state, and issue a reset to be sure the previous tracking schema is clean:
copy
```
shell> replicator start offline
shell> trepctl -service servicename reset -all -y
```

Step 9: Position the Replicator

The final step is to now position the extractor to pick up from the position that the export in step 5 was taken.
Locate the dump file and issue the following command:
copy
```
shell> grep "CHANGE MASTER" dump.sql
-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=847;
```
Taking the sequence value from MASTER_LOG_FILE filename and the MASTER_LOG_POS, issue the following statement on the EXTRACTOR host:
copy
```
shell> trepctl online -from-event 000003:847
```
Once the command has completed, the extractor will be online, you can now bring the applier online
copy
```
shell> trepctl online
```

7.10.2.2. Seeding Data from a Cluster, for a Cluster-Extractor Target

Step 1: Configure the Temporary Instance

Build an empty MySQL Instance providing that all the pre-requisites are in place. These are outline in Appendix B, Prerequisites
To ensure compatibility, you need to make sure that the version of MySQL used matches the version of MySQL running on the main source.
Once the instance is running, you need to pre-create all of the tables that you will be loading. This must be done manually as you need to ensure that each table is created with the ENGINE=BLACKHOLE option
The use of the BLACKHOLE engine will mean that the data doesn't actually get stored when written to the database, however binary logs are generated and it is these that the replicator requires

Step 2: Configure the Replicator

Follow the steps outlined in Section 3.2, “Deploying a Primary/Replica Topology” to configure an extractor against the Temporary host.
Ensure that the configuration includes the following entries to enable heterogeneous replication, and ensure you qualify the objects (schemas and/or tables) that you want to seed
```
enable-heterogeneous-master=true
svc-extractor-filters=replicate
property=replicator.filter.replicate.do=schema.table
```
Once installed, start the replicator

Step 3: Build the Target Schema

Depending on the target, you may need to pre-create the final objects in your target environment. Use ddlscan to do this now if this is required

Step 4: Configure the Applier

In a Cluster-Extractor environment, we will use the same applier after we have seeded the target.

Ensure the applier host meets all the required pre-requisites and then configure the applier appropriate to the target you are applying to.
Follow the appropriate steps in Chapter 4, Deploying Appliers to configure a standalone applier, ensuring that it is configured to connect to the temporary extractor installed in Step 2. At this stage do not follow the Cluster-Extractor setup
Important
When setting the service name for this temporary seeding process, ensure the servicename you choose matches the servicename of the main source cluster that we will later connect to for normal operation
Once configured, start the applier. At this stage you should not see any replication traffic since the temporary host will have no data written to it

Step 5: Export the data from the source

We now need to export the data from the source. There are two ways to do this with a cluster. We can either take an export from the Primary, or we an take an export from a Replica.
Export from a Primary:
- If you are able to export from the Primary, then using mysqldump we need to ensure we capture the binlog position and we also need to ensure that the export does NOT contain DB or TABLE DDL
- The following example can be used as a template. In this example we are exporting an entire schema. Use the appropriate options if you require only specific tables
```
mysqldump -u root -psecret -B hr --no-create-db --no-create-info --master-data=2 >dump.sql
```
Export from a Replica:
- To export from a Replica, we cannot obtain the binlog position correctly as the one we need is specific to the Primary, however, exporting from a Replica means that we can utilise the features of Tungsten Cluster to isolate the node
- First, set the cluster to MAINTENANCE mode, and then SHUN the node that we wish to export from
  copy
```
shell> cctrl
cctrl> set policy maintenance
cctrl> datasource Replicahost shun
```
- Next, we can take the export
```
mysqldump -u root -psecret -B hr --no-create-db --no-create-info >dump.sql
```
- The final step is to capture the current replication position using dsctl
  copy
```
shell> dsctl get -ascmd
dsctl set -seqno 9 -epoch 2 -event-id "mysql-bin.000003:0000000000002608;-1" -source-id "db1"
```
  Make a note of the output from running this command as we will need it later
- Finally, you can re-introduce the node back into the cluster
  copy
```
shell> cctrl
cctrl> datasource Replicahost recover
cctrl> set policy automatic
```

Step 6: Import the Data

Now that we have the data we can import this into the Temporary Instance
As you load the data, you can monitor replication and you should see the data loading into your target envrionment.
Providing you created the tables correctly with the BLACKHOLE engine, you should see that a select count on the tables in the temporary instance should return a row count of zero.
When the load has finished and the applier has completed replication, stop both replicators using the following command:
copy
```
shell> replicator stop
```
We have now finished with the temporary MySQL instance

Step 7: Reconfigure the Applier as a Cluster-Extractor

We now need to reconfigure the applier as a cluster Replica, but first we need to uninstall the software to ensure we have a clean build and any THL from the pre-load has been cleared
copy
```
shell> cd /opt/continuent/software/tungsten-replicator-7.1.4-10
shell> tools/tpm uninstall --i-am-sure
```
Place the correct configuration into the /etc/tungsten/tungsten.ini file ensuring that start-and-report=false is set

Install the software

copy

shell> cd /opt/continuent/software/tungsten-replicator-7.1.4-10
shell> tools/tpm install

Finally start the repicator in an offline state:
copy
```
shell> replicator start offline
```

Step 8: Position the Replicator

The final step is to now position the replicator to pick up from the position that the export in step 5 was taken.
If you took the export from the Replica and already have a dsctl set command, then move onto the next step. If you took an export from the Primary then we need to retrieve the correct positions from the dump file
Locate the dump file and issue the following command:
copy
```
shell> grep "CHANGE MASTER" dump.sql
-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=847;
```
Taking the sequence value from MASTER_LOG_FILE filename and the MASTER_LOG_POS, issue the following statement on the Primary host in the cluster, and copy the resulting dsctl command:
copy
```
shell> tungsten_find_position mysql-bin.000003:847
dsctl set -reset -seqno 9 -epoch 2 -event-id "mysql-bin.000003:0000000000000847;-1" -source-id "db1"
```
Taking the dsctl that you obtained either from the Replica in Step 5 or from the steps just above, run the dsctl command on the applier host ONLY
Once the command has completed, you can now bring the replicator online
copy
```
shell> replicator start offline
```

Prev	Up	Next
7.10.1. Migrating from MySQL Native Replication 'In-Place'	^Level	7.11. Switching Primary Hosts

Continuent Documentation