Troubleshooting Vertica Installations
The following items detail some of the more common problems with replication through to Vertica. Often the underlying issue is related to the data types, the data format, or the number of columns.
If the following is reported by the replicator:
pendingError : Replicator unable to go online due to errorOperation failed: Online operation failed (Unable to prepare plugin: classname=com.continuent.tungsten.replicator.datasource.DataSourceServicemessage=[Unable to load driver: com.vertica.jdbc.Driver])state : OFFLINE:ERRORThe Vertica JDBC driver is missing from the installation. The Vertica JDBC JAR file must have been placed into the
tungsten-replicator/libdirectory within the release diectory before runningtpm updateortpm install.The following error:
pendingExceptionMessage: Invalid write to CSV file: name=/opt/continuent/tmp/staging/alpha/staging0/test-msg-1.csvtable=test.msg table_columns=schemaname,schemahash csv_columns=tungsten_opcode,tungsten_seqno,tungsten_row_id,tungsten_commit_timestamp,nullschemaname,schemahashIndicates the source THL has been not been marked up correctly. Either the colnames filter has not been enabled, or
enable-batch-servicehas not been configured during installation. This means that the source THL is not being populated with the right information, either the full list of columns, or the column names and primary key information is incorrect. The configuration should be updated, and then the THL on both the Extractor and Applier should be recreated by usingtrepctl restore.If you get an error similar to the following:
pendingExceptionMessage: CSV loading failed: schema=test table=msg CSVfile=/opt/continuent/tmp/staging/alpha/staging0/test-msg-1.csvmessage=com.continuent.tungsten.replicator.ReplicatorException: Incoming table datahas no primary keys: test.msg(/opt/continuent/tungsten/tungsten-replicator/appliers/batch/vertica6.js#70)Either the pkey filter has not been enabled, or the source tables on the source database do not contain primary keys. This means that the source THL is not being populated with the primary key information from the table which is required in order to load into Vertica through the batch mechanism. The configuration should be updated, and then the THL on both the Extractor and Applier should be recreated by using
trepctl restore.The following error indicates that the incoming data could not be loaded into the staging table within Vertica:
pendingError : Stage task failed: q-to-dbmspendingExceptionMessage: CSV loading failed: schema=blog table=article CSVfile=/tmp/staging/alpha/staging0/blog-article-432.csvmessage=com.continuent.tungsten.replicator.ReplicatorException:LOAD DATA ROW count does not match: sql=COPY blog.stage_xxx_articleFROM '/tmp/staging/alpha/staging0/blog-article-432.csv'DIRECT NULL 'null' DELIMITER ',' ENCLOSED BY '"'expected_copy_rows=3614 rows=2233 ; exceptions are in/tmp/tungsten_vertica_blog.article.exceptions(../../tungsten-replicator//samples/scripts/batch/vertica6.js#67)There are a number of possible reasons for this. The actual reasons can be found in the exceptions file which is generated, the error message contains the location. In this example
/tmp/tungsten_vertica_blog.article.exceptions. Possible reasons include:Mismatch in the number of columns in the source file and the target table. Check the source and target tables match, including the four special fields used in all staging tables.
Mismatch in the data types of one or more of the columns in target table. Check the source and target table definitions match, or at least support the corresponding data. For example, the column size, length or format is correct. Loading character data into numeric columns, or floating point values into integer columns for example is not supported.
Badly formatted CSV file. This happens when the incoming data contains newlines or commas or other data that is incompatible with the CSV format. The CSV file should have been kept, the location is also in the error message. Examine the file and check the format. You may need to enable filters to modify and 'clean' the data so that it is more compatible with the CSV format.
Remember that changes to the DDL within the source database are not automatically replicated to Vertica. Changes to the table definitions, additional tables, or additional databases, must all be updated manually within Vertica.
If you get errors similar to:
stage_xxx_access_log does not existWhen loading into Vertica, it means that the staging tables have not created correctly. Check the steps for creating the staging tables using
ddlscanin "Preparing for Vertica Deployments".Replication may fail if date types contain zero values, which are legal in MySQL. For example, the timestamp
0000-00-00 00:00:00is valid in MySQL. An error reporting a mismatch in the values will be reported when applying the data into Vertica, for example:ERROR 2631: Column "time" is of type timestamp but expression is of type intHINT: You will need to rewrite or cast the expressionOr:
ERROR 2992: Date/time field value out of range: "0"HINT: Perhaps you need a different "datestyle" settingTo address this error, use the
zerodate2nullfilter, which translates zero-value dates into a valid NULL value. This can be enabled by adding thezerodate2nullfilter to the applier stage when configuring the service usingtpm:shell> ./tools/tpm update alpha --repl-svc-applier-filters=zerodate2null