When a mismatch occurs, the replicator service will indicate that there was a problem applying a transaction on the Replica. The replication process stops applying changes to the Replica when the first transaction fails to be applied to the Replica. This prevents multiple-statements from failing
trepctl status... pendingError : Event application failed: seqno=120 fragno=0 message=java.sql.SQLException: » Statement failed on slave but succeeded on master pendingErrorCode : NONE pendingErrorEventId : mysql-bin.000012:0000000000012967;0 pendingErrorSeqno : 120 pendingExceptionMessage: java.sql.SQLException: Statement failed on slave but succeeded on master insert into messages values (0,'Trial message','Jack','Jill',now()) ...
trepsvc.log log file will also
contain the error information about the failed statement. For example:
... INFO | jvm 1 | 2013/06/26 10:14:12 | 2013-06-26 10:14:12,423 [firstcluster - q-to-dbms-0] INFO pipeline.SingleThreadStageTask Performing emergency rollback of applied changes INFO | jvm 1 | 2013/06/26 10:14:12 | 2013-06-26 10:14:12,424 [firstcluster - q-to-dbms-0] INFO pipeline.SingleThreadStageTask Dispatching error event: Event application failed: seqno=120 fragno=0 message=java.sql.SQLException: Statement failed on slave but succeeded on master INFO | jvm 1 | 2013/06/26 10:14:12 | 2013-06-26 10:14:12,424 [firstcluster - pool-2-thread-1] ERROR management.OpenReplicatorManager Received error notification, shutting down services : INFO | jvm 1 | 2013/06/26 10:14:12 | Event application failed: seqno=120 fragno=0 message=java.sql.SQLException: Statement failed on slave but succeeded on master INFO | jvm 1 | 2013/06/26 10:14:12 | insert into messages values (0,'Trial message', 'Jack','Jill',now()) INFO | jvm 1 | 2013/06/26 10:14:12 | com.continuent.tungsten.replicator.applier.ApplierException: java.sql.SQLException: Statement failed on slave but succeeded on master ...
Once the error or problem has been found, the exact nature of the error should be determined so that a resolution can be identified:
Identify the reason for the failure by examining the full error message. Common causes are:
Duplicate primary key
A row or statement is being inserted or updated that already has the same insert ID or would generate the same insert ID for tables that have auto increment enabled. The insert ID can be identified from the output of the transaction using thl. Check the Replica to identify the faulty row. To correct this problem you will either need to skip the transaction or delete the offending row from the Replica dataserver.
The error will normally be identified due to the following error message when viewing the current replicator status, for example:
trepctl status... pendingError : Event application failed: seqno=10 fragno=0 » message=java.sql.SQLException: Statement failed on slave but succeeded on master pendingErrorCode : NONE pendingErrorEventId : mysql-bin.000032:0000000000001872;0 pendingErrorSeqno : 10 pendingExceptionMessage: java.sql.SQLException: Statement failed on slave but succeeded on master insert into myent values (0,'Test Message') ...
The error can be generated when an insert or update has taken place on the Replica rather than on the Primary.
To resolve this issue, check the full THL for the statement that failed. The information is provided in the error message, but full examination of the THL can help with identification of the full issue. For example, to view the THL for the sequence number:
thl list -seqnoSEQ# = 10 / FRAG# = 0 (last frag) - TIME = 2014-01-09 16:47:40.0 - EPOCH# = 1 - EVENTID = mysql-bin.000032:0000000000001872;0 - SOURCEID = host1 - METADATA = [mysql_server_id=1;dbms_type=mysql;service=firstcluster;shard=test] - TYPE = com.continuent.tungsten.replicator.event.ReplDBMSEvent - SQL(0) = SET INSERT_ID = 2 - OPTIONS = [##charset = UTF-8, autocommit = 1, sql_auto_is_null = 0, foreign_key_checks = 1, » unique_checks = 1, sql_mode = '', character_set_client = 33, collation_connection = 33, » collation_server = 8] - SCHEMA = test - SQL(1) = insert into myent values (0,'Test Message')
In this example, an
operation is inserting a new row. The generated insert ID is also
shown (in line 9,
Check the destination database and determine the what the current
value of the corresponding row:
select * from myent where id = 2;+----+---------------+ | id | msg | +----+---------------+ | 2 | Other Message | +----+---------------+ 1 row in set (0.00 sec)
The actual row values are different, which means that either value may be correct. In complex data structures, there may be multiple statements or rows that trigger this error if following data also relies on this value.
For example, if multiple rows have been inserted on the Replica, multiple transactions may be affected. In this scenario, checking multiple sequence numbers from the THL will highlight this information.
Missing table or schema
If a table or database is missing, this should be reported in the detailed error message. For example:
Caused by: java.sql.SQLSyntaxErrorException: Unable to switch to database » 'contacts'Error was: Unknown database 'contacts'
This error can be caused when maintenance has occurred, a table has failed to be initialized properly, or the
Incompatible table or schema
A modified table structure on the Replica can cause application of the transaction to fail if there are missing or different column specifications for the table data.
This particular error can be generated when changes to the table definition have been made, perhaps during a maintenance window.
Check the table definition on the Primary and Replica and ensure they match.
Choose a resolution method:
Depending on the data structure and environment, resolution can take one of the following forms:
Skip the transaction on the Replica
If the data on the Replica is considered correct, or the data in
both tables is the same or similar, the transaction from the
Primary to the Replica can be skipped. This process involves placing
the replicator online and specifying one or more transactions to
be skipped or ignored. At the end of this process, the replicator
should be in the
For more information on skipping single or multiple transactions, see Section 7.5.2, “Skipping Transactions”.
Delete the offending row or rows on the Replica
If the data on the Primary is considered canonical, then the data on the Replica can be removed, and the replicator placed online.
Deleting data on the Replica may cause additional problems if the data is used by other areas of your application, relations to foreign tables.
delete from myent where id = 2;Query OK, 1 row affected (0.01 sec)
Now place the replicator online and check the status:
Restore or reprovision the Replica
If the transaction cannot be skipped, or the data safely deleted or modified, and only a single Replica is affected, a backup of an existing, working, Replica can be taken and restored to the broken Replica.
The tungsten_provision_slave command automates this process. See Section 7.6, “Provision or Reprovision a Replica” for more information on reprovisioning.
To perform a backup and restore, see Section 7.7, “Creating a Backup”, or Section 7.8, “Restoring a Backup”. To reprovision a Replica from the Primary or another Replica, see tungsten_provision_slave.