The following items detail some of the more common problems with replication through to Vertica. Often the underlying issue is related to the data types, the data format, or the number of columns.
If the following is reported by the replicator:
pendingError : Replicator unable to go online due to error » Operation failed: Online operation failed (Unable to prepare plugin: class » name=com.continuent.tungsten.replicator.datasource.DataSourceService » message=[Unable to load driver: com.vertica.jdbc.Driver]) state : OFFLINE:ERROR
The Vertica JDBC driver is missing from the installation. The Vertica
JDBC JAR file must have been placed into the
tungsten-replicator/lib
directory within the release diectory before running tpm
update or tpm install.
The following error:
pendingExceptionMessage: Invalid write to CSV file: name=/opt/continuent/tmp/staging/alpha/staging0/test-msg-1.csv » table=test.msg table_columns=schemaname,schemahash csv_columns=tungsten_opcode,tungsten_seqno, » tungsten_row_id,tungsten_commit_timestamp,nullschemaname,schemahash
Indicates the source THL has been not been marked up correctly. Either
the colnames
filter has not been
enabled, or the
--enable-batch-service
has not been
confifgred during installation. This means that the source THL is not
being populated with the right information, either the full list of
columns, or the column names and primary key information is incorrect.
The configuration should be updated, and then the THL on both the
Extractor and Applier should be recreated by using trepctl
reset.
If you get an error similar to the following:
pendingExceptionMessage: CSV loading failed: schema=test table=msg CSV » file=/opt/continuent/tmp/staging/alpha/staging0/test-msg-1.csv » message=com.continuent.tungsten.replicator.ReplicatorException: Incoming table data » has no primary keys: test.msg » (/opt/continuent/tungsten/tungsten-replicator/appliers/batch/vertica6.js#70)
Either the pkey
filter has not
been enabled, or the source tables on the source database do not
contain primary keys. This means that the source THL is not being
populated with the primary key information from the table which is
requird in order to load into Vertica through the batch mechanism.
The configuration should be updated, and then the THL on both the
Extractor and Applier should be recreated by using trepctl
reset.
The following error indicates that the incoming data could not be loaded into the staging table within Vertica:
pendingError : Stage task failed: q-to-dbms pendingExceptionMessage: CSV loading failed: schema=blog table=article CSV » file=/tmp/staging/alpha/staging0/blog-article-432.csv » message=com.continuent.tungsten.replicator.ReplicatorException: LOAD DATA ROW count does not match: sql=COPY blog.stage_xxx_article » FROM '/tmp/staging/alpha/staging0/blog-article-432.csv' » DIRECT NULL 'null' DELIMITER ',' ENCLOSED BY '"' » expected_copy_rows=3614 rows=2233 ; exceptions are in » /tmp/tungsten_vertica_blog.article.exceptions » (../../tungsten-replicator//samples/scripts/batch/vertica6.js#67)
There are a number of possible reasons for this. The actual reasons
can be found in the exceptions file which is generated, the error
message contains the location. In this example
/tmp/tungsten_vertica_blog.article.exceptions
.
Possible reasons include:
Mismatch in the number of columns in the source file and the target table. Check the source and target tables match, including the four special fields used in all staging tables.
Mismatch in the data types of one or more of the columns in target table. Check the source and target table definitions match, or at least support the corresponding data. For example, the column size, length or format is correct. Loading character data into numeric columns, or floating point values into integer columns for example is not supported.
Badly formatted CSV file. This happens when the incoming data contains newliness or commas or other data that is incompatible with the CSV format. The CSV file should have been kept, the location is also in the error message. Examine the file and check the format. You may need to enable filters to modify and 'clean' the data so that it is more compatible with the CSV format.
Remember that changes to the DDL within the source database are not automatically replicated to Vertica. Changes to the table definitions, additional tables, or additional databases, must all be updated manually within Vertica.
If you get errors similar to:
stage_xxx_access_log does not exist
When loading into Vertica, it means that the staging tables have not created correctly. Check the steps for creating the staging tables using ddlscan in Section 4.3.1, “Preparing for Vertica Deployments”.
Replication may fail if date types contain zero values, which are
legal in MySQL. For example, the timestamp
0000-00-00 00:00:00
is valid in
MySQL. An error reporting a mismatch in the values will be reported
when applying the data into Vertica, for example:
ERROR 2631: Column "time" is of type timestamp but expression is of type int HINT: You will need to rewrite or cast the expression
Or:
ERROR 2992: Date/time field value out of range: "0" HINT: Perhaps you need a different "datestyle" setting
To address this error, use the
zerodate2null
filter, which
translates zero-value dates into a valid NULL value. This can be
enabled by adding the
zerodate2null
filter to the
applier stage when configuring the service using
tpm:
shell> ./tools/tpm update alpha --repl-svc-applier-filters=zerodate2null