Seeding data for heterogeneous targets is a complex process with many challenges, all of which depend on the target and the amount of data being seeded.
The steps outlined below come with their own challenges and may not be suitable in your environment. Therefore, any pre-seeding process needs to be fully understood and evaluated before use.
The following requirements are needed for this process:
A temporary host meeting all required pre-requisites, with an empty MySQL instance matching the same version as the original source, and with the default storage engine set to BLACKHOLE
Enough disk space on the temporary host to hold binary logs and THL logs equivalent to the amount of data being seeded
The ability to extract the data from the source, using mysqldump, or via reverse engineering SQL Statements
The process in summary is as follows:
Configure replicator to extract from the temporary instance
Extract data from the source capturing the binlog position appropriate to the export
Import the data into the temporary instance and allow the replicator to load the target
Reconfigure the replicator to extract from the original source, positioned to start from the coordinates noted during the export
Step 1: Configure the Temporary Instance
Build an empty MySQL instance, ensuring that all the pre-requisites are in place. These are outlined in Appendix B, Prerequisites.
To ensure compatibility, you need to make sure that the version of MySQL used matches the version of MySQL running on the main source.
Once the instance is running, you need to pre-create all of the tables that you will be loading. This must be done manually, as you need to ensure that each table is created with the ENGINE=BLACKHOLE option.
The use of the BLACKHOLE engine means that the data is not actually stored when written to the database; however, binary logs are still generated, and it is these that the replicator requires.
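One way to prepare these definitions is to take a schema-only dump from the source and rewrite the storage engine before loading it. This is a sketch only: the host names, credentials, and schema below are placeholders, and the commented mysqldump/mysql invocations show where the rewrite fits into the real workflow.

```shell
# Sketch: rewrite the storage engine in schema DDL to BLACKHOLE before
# loading it into the temporary instance. In practice the input would come
# from a schema-only dump of the source, e.g.:
#   mysqldump -u root -psecret --no-data hr > schema.sql
#   sed -E 's/ENGINE=[A-Za-z]+/ENGINE=BLACKHOLE/' schema.sql | mysql -h temphost -u root -psecret hr
# The rewrite itself, shown on a sample CREATE statement:
printf "CREATE TABLE employees (id INT PRIMARY KEY) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;\n" \
  | sed -E 's/ENGINE=[A-Za-z]+/ENGINE=BLACKHOLE/'
# → CREATE TABLE employees (id INT PRIMARY KEY) ENGINE=BLACKHOLE DEFAULT CHARSET=utf8mb4;
```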
Step 2: Configure the Replicator
Follow the steps outlined in Section 3.2, “Deploying a Primary/Replica Topology” to configure an extractor against the Temporary host.
Ensure that the configuration includes the following entries to enable heterogeneous replication, and ensure you qualify the objects (schemas and/or tables) that you want to seed
enable-heterogeneous-master=true
svc-extractor-filters=replicate
property=replicator.filter.replicate.do=schema.table
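For illustration only, these entries sit alongside the normal extractor options in the tpm configuration. The service name, host names, schema, and credentials below are placeholders, and the exact option set depends on your environment; follow the referenced deployment section for the authoritative configuration.

```ini
[defaults]
user=tungsten
install-directory=/opt/continuent

[alpha]
topology=master-slave
master=temphost
members=temphost
replication-user=tungsten
replication-password=secret
enable-heterogeneous-master=true
svc-extractor-filters=replicate
property=replicator.filter.replicate.do=hr.employees
```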
Once installed, start the replicator
Step 3: Build the Target Schema
Depending on the target, you may need to pre-create the final objects in your target environment. If this is required, use ddlscan to do this now.
Step 4: Configure the Applier
Ensure the applier host meets all the required pre-requisites and then configure the applier appropriate to the target you are applying to.
Follow the appropriate steps in Chapter 4, Deploying Appliers to configure a standalone applier, ensuring that it is configured to connect to the temporary extractor installed in Step 2
Once configured, start the applier. At this stage you should not see any replication traffic since the temporary host will have no data written to it
Step 5: Export the data from the source
We now need to export the data from the source.
Using mysqldump, we need to ensure we capture the binlog position, and we also need to ensure that the export does NOT contain DB or TABLE DDL.
The following example can be used as a template. In this example, we are exporting an entire schema; use the appropriate options if you require only specific tables.
shell> mysqldump -u root -psecret -B hr --no-create-db --no-create-info --master-data=2 >dump.sql
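A quick sanity check on the resulting file is to confirm that it contains the CHANGE MASTER coordinates (written as a comment by --master-data=2) but no DDL. This is a sketch; the here-doc below stands in for a real dump.sql.

```shell
# Create a miniature stand-in for dump.sql (the real file comes from mysqldump).
cat > dump.sql <<'EOF'
-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=847;
INSERT INTO `employees` VALUES (1,'Alice');
EOF
grep -c "CHANGE MASTER" dump.sql    # should report 1
grep -c "^CREATE" dump.sql || true  # should report 0 when the dump is DDL-free
```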
Step 6: Import the Data
Now that we have the data, we can import it into the temporary instance.
As you load the data, you can monitor replication; you should see the data loading into your target environment.
Providing you created the tables correctly with the BLACKHOLE engine, a SELECT COUNT(*) on the tables in the temporary instance should return a row count of zero.
When the load has finished and the applier has completed replication, stop both replicators using the following command:
shell> replicator stop
We have now finished with the temporary MySQL instance
Step 7: Install Extractor from Main Source Host
Follow the steps outlined in Section 3.2, “Deploying a Primary/Replica Topology” to configure an extractor against the Source host.
Ensure that the configuration includes the following entries to enable heterogeneous replication, and ensure you qualify the objects (schemas and/or tables) as required, for example
enable-heterogeneous-master=true
svc-extractor-filters=replicate
property=replicator.filter.replicate.do=schema.table
Once installed, start the replicator in an offline state
shell> replicator start offline
Step 8: Reconfigure the Applier
We now need to reconfigure the applier, but first we need to uninstall the software to ensure we have a clean build and that any THL from the pre-load has been cleared.
shell> cd /opt/continuent/software/tungsten-replicator-7.1.4-10
shell> tools/tpm uninstall --i-am-sure
Place the correct configuration into the /etc/tungsten/tungsten.ini file, ensuring that start-and-report=false is set and that the applier is now configured to point to the main extractor.
Install the software
shell> cd /opt/continuent/software/tungsten-replicator-7.1.4-10
shell> tools/tpm install
Finally, start the replicator in an offline state, and issue a reset to be sure the previous tracking schema is clean:
shell> replicator start offline
shell> trepctl -service servicename reset -all -y
Step 9: Position the Replicator
The final step is to position the extractor to pick up from the point at which the export in Step 5 was taken.
Locate the dump file and issue the following command:
shell> grep "CHANGE MASTER" dump.sql
-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=847;
Taking the sequence value from the MASTER_LOG_FILE filename and the MASTER_LOG_POS, issue the following statement on the EXTRACTOR host:
shell> trepctl online -from-event 000003:847
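If you prefer to script this, the coordinates can be pulled out of the dump file with standard shell tools. This is a sketch; the sample line below stands in for `grep "CHANGE MASTER" dump.sql` against the real export.

```shell
# Derive the trepctl -from-event argument from the CHANGE MASTER comment
# written into the dump by --master-data=2.
LINE="-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=847;"
BINLOG_SEQ=$(printf '%s\n' "$LINE" | sed -E "s/.*MASTER_LOG_FILE='[^']*\.([0-9]+)'.*/\1/")
BINLOG_POS=$(printf '%s\n' "$LINE" | sed -E "s/.*MASTER_LOG_POS=([0-9]+).*/\1/")
EVENT="$BINLOG_SEQ:$BINLOG_POS"
echo "$EVENT"
# → 000003:847
# then, on the extractor host: trepctl online -from-event "$EVENT"
```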
Once the command has completed, the extractor will be online. You can now bring the applier online:
shell> trepctl online
Step 1: Configure the Temporary Instance
Build an empty MySQL instance, ensuring that all the pre-requisites are in place. These are outlined in Appendix B, Prerequisites.
To ensure compatibility, you need to make sure that the version of MySQL used matches the version of MySQL running on the main source.
Once the instance is running, you need to pre-create all of the tables that you will be loading. This must be done manually, as you need to ensure that each table is created with the ENGINE=BLACKHOLE option.
The use of the BLACKHOLE engine means that the data is not actually stored when written to the database; however, binary logs are still generated, and it is these that the replicator requires.
Step 2: Configure the Replicator
Follow the steps outlined in Section 3.2, “Deploying a Primary/Replica Topology” to configure an extractor against the Temporary host.
Ensure that the configuration includes the following entries to enable heterogeneous replication, and ensure you qualify the objects (schemas and/or tables) that you want to seed
enable-heterogeneous-master=true
svc-extractor-filters=replicate
property=replicator.filter.replicate.do=schema.table
Once installed, start the replicator
Step 3: Build the Target Schema
Depending on the target, you may need to pre-create the final objects in your target environment. If this is required, use ddlscan to do this now.
Step 4: Configure the Applier
In a Cluster-Extractor environment, we will use the same applier after we have seeded the target.
Ensure the applier host meets all the required pre-requisites and then configure the applier appropriate to the target you are applying to.
Follow the appropriate steps in Chapter 4, Deploying Appliers to configure a standalone applier, ensuring that it is configured to connect to the temporary extractor installed in Step 2. At this stage, do not follow the Cluster-Extractor setup.
When setting the service name for this temporary seeding process, ensure the servicename you choose matches the servicename of the main source cluster that we will later connect to for normal operation
Once configured, start the applier. At this stage you should not see any replication traffic since the temporary host will have no data written to it
Step 5: Export the data from the source
We now need to export the data from the source. There are two ways to do this with a cluster: we can either take an export from the Primary, or we can take an export from a Replica.
Export from a Primary:
If you are able to export from the Primary, then using mysqldump we need to ensure we capture the binlog position, and we also need to ensure that the export does NOT contain DB or TABLE DDL.
The following example can be used as a template. In this example we are exporting an entire schema. Use the appropriate options if you require only specific tables
shell> mysqldump -u root -psecret -B hr --no-create-db --no-create-info --master-data=2 >dump.sql
Export from a Replica:
To export from a Replica, we cannot obtain the binlog position correctly, as the one we need is specific to the Primary; however, exporting from a Replica means that we can utilise the features of Tungsten Cluster to isolate the node.
First, set the cluster to MAINTENANCE mode, and then SHUN the node that we wish to export from:
shell> cctrl
cctrl> set policy maintenance
cctrl> datasource Replicahost shun
Next, we can take the export
mysqldump -u root -psecret -B hr --no-create-db --no-create-info >dump.sql
The final step is to capture the current replication position using dsctl
shell> dsctl get -ascmd
dsctl set -seqno 9 -epoch 2 -event-id "mysql-bin.000003:0000000000002608;-1" -source-id "db1"
Make a note of the output from running this command as we will need it later
Finally, you can re-introduce the node back into the cluster
shell> cctrl
cctrl> datasource Replicahost recover
cctrl> set policy automatic
Step 6: Import the Data
Now that we have the data, we can import it into the temporary instance.
As you load the data, you can monitor replication; you should see the data loading into your target environment.
Providing you created the tables correctly with the BLACKHOLE engine, a SELECT COUNT(*) on the tables in the temporary instance should return a row count of zero.
When the load has finished and the applier has completed replication, stop both replicators using the following command:
shell> replicator stop
We have now finished with the temporary MySQL instance
Step 7: Reconfigure the Applier as a Cluster-Extractor
We now need to reconfigure the applier as a cluster Replica, but first we need to uninstall the software to ensure we have a clean build and that any THL from the pre-load has been cleared.
shell> cd /opt/continuent/software/tungsten-replicator-7.1.4-10
shell> tools/tpm uninstall --i-am-sure
Place the correct configuration into the /etc/tungsten/tungsten.ini file, ensuring that start-and-report=false is set.
Install the software
shell> cd /opt/continuent/software/tungsten-replicator-7.1.4-10
shell> tools/tpm install
Finally, start the replicator in an offline state:
shell> replicator start offline
Step 8: Position the Replicator
The final step is to position the replicator to pick up from the point at which the export in Step 5 was taken.
If you took the export from the Replica and already have a dsctl set command, then move on to the next step. If you took an export from the Primary, then we need to retrieve the correct positions from the dump file.
Locate the dump file and issue the following command:
shell> grep "CHANGE MASTER" dump.sql
-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=847;
Taking the sequence value from the MASTER_LOG_FILE filename and the MASTER_LOG_POS, issue the following statement on the Primary host in the cluster, and copy the resulting dsctl command:
shell> tungsten_find_position mysql-bin.000003:847
dsctl set -reset -seqno 9 -epoch 2 -event-id "mysql-bin.000003:0000000000000847;-1" -source-id "db1"
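The binlog file and position passed to tungsten_find_position can likewise be extracted with standard shell tools. This is a sketch; the sample line below stands in for `grep "CHANGE MASTER" dump.sql` against the real export.

```shell
# Build the tungsten_find_position argument from the CHANGE MASTER comment.
LINE="-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=847;"
BINLOG_FILE=$(printf '%s\n' "$LINE" | sed -E "s/.*MASTER_LOG_FILE='([^']+)'.*/\1/")
BINLOG_POS=$(printf '%s\n' "$LINE" | sed -E "s/.*MASTER_LOG_POS=([0-9]+).*/\1/")
echo "$BINLOG_FILE:$BINLOG_POS"
# → mysql-bin.000003:847
# then, on the Primary: tungsten_find_position "$BINLOG_FILE:$BINLOG_POS"
```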
Taking the dsctl command that you obtained either from the Replica in Step 5 or from the steps just above, run it on the applier host ONLY.
Once the command has completed, you can now bring the replicator online:
shell> trepctl online