4.3.4. Troubleshooting Vertica Installations

The following items detail some of the more common problems with replication through to Vertica. Often the underlying issue is related to the data types, the data format, or the number of columns.

  • If the following is reported by the replicator:

    pendingError           : Replicator unable to go online due to error »
      Operation failed: Online operation failed (Unable to prepare plugin: class »
      name=com.continuent.tungsten.replicator.datasource.DataSourceService »
      message=[Unable to load driver: com.vertica.jdbc.Driver])
    state                  : OFFLINE:ERROR

    The Vertica JDBC driver is missing from the installation. The Vertica JDBC JAR file must have been placed into the tungsten-replicator/lib directory within the release diectory before running tpm update or tpm install.

  • The following error:

    pendingExceptionMessage: Invalid write to CSV file: name=/opt/continuent/tmp/staging/alpha/staging0/test-msg-1.csv »
      table=test.msg table_columns=schemaname,schemahash csv_columns=tungsten_opcode,tungsten_seqno, »
      tungsten_row_id,tungsten_commit_timestamp,nullschemaname,schemahash

    Indicates the source THL has been not been marked up correctly. Either the colnames filter has not been enabled, or the --enable-batch-service has not been confifgred during installation. This means that the source THL is not being populated with the right information, either the full list of columns, or the column names and primary key information is incorrect. The configuration should be updated, and then the THL on both the Extractor and Applier should be recreated by using trepctl reset.

  • If you get an error similar to the following:

    pendingExceptionMessage: CSV loading failed: schema=test table=msg CSV »
     file=/opt/continuent/tmp/staging/alpha/staging0/test-msg-1.csv »
     message=com.continuent.tungsten.replicator.ReplicatorException: Incoming table data »
     has no primary keys: test.msg »
     (/opt/continuent/tungsten/tungsten-replicator/appliers/batch/vertica6.js#70)

    Either the pkey filter has not been enabled, or the source tables on the source database do not contain primary keys. This means that the source THL is not being populated with the primary key information from the table which is requird in order to load into Vertica through the batch mechanism. The configuration should be updated, and then the THL on both the Extractor and Applier should be recreated by using trepctl reset.

  • The following error indicates that the incoming data could not be loaded into the staging table within Vertica:

    pendingError  : Stage task failed: q-to-dbms
    pendingExceptionMessage: CSV loading failed: schema=blog table=article CSV »
      file=/tmp/staging/alpha/staging0/blog-article-432.csv »
      message=com.continuent.tungsten.replicator.ReplicatorException:
      LOAD DATA ROW count does not match: sql=COPY blog.stage_xxx_article »
      FROM '/tmp/staging/alpha/staging0/blog-article-432.csv' »
      DIRECT NULL 'null' DELIMITER ',' ENCLOSED BY '"' »
      expected_copy_rows=3614 rows=2233 ; exceptions are in »
      /tmp/tungsten_vertica_blog.article.exceptions »
      (../../tungsten-replicator//samples/scripts/batch/vertica6.js#67)

    There are a number of possible reasons for this. The actual reasons can be found in the exceptions file which is generated, the error message contains the location. In this example /tmp/tungsten_vertica_blog.article.exceptions. Possible reasons include:

    • Mismatch in the number of columns in the source file and the target table. Check the source and target tables match, including the four special fields used in all staging tables.

    • Mismatch in the data types of one or more of the columns in target table. Check the source and target table definitions match, or at least support the corresponding data. For example, the column size, length or format is correct. Loading character data into numeric columns, or floating point values into integer columns for example is not supported.

    • Badly formatted CSV file. This happens when the incoming data contains newliness or commas or other data that is incompatible with the CSV format. The CSV file should have been kept, the location is also in the error message. Examine the file and check the format. You may need to enable filters to modify and 'clean' the data so that it is more compatible with the CSV format.

  • Remember that changes to the DDL within the source database are not automatically replicated to Vertica. Changes to the table definitions, additional tables, or additional databases, must all be updated manually within Vertica.

  • If you get errors similar to:

    stage_xxx_access_log does not exist

    When loading into Vertica, it means that the staging tables have not created correctly. Check the steps for creating the staging tables using ddlscan in Section 4.3.1, “Preparing for Vertica Deployments”.

  • Replication may fail if date types contain zero values, which are legal in MySQL. For example, the timestamp 0000-00-00 00:00:00 is valid in MySQL. An error reporting a mismatch in the values will be reported when applying the data into Vertica, for example:

    ERROR 2631:  Column "time" is of type timestamp but expression is of type int
    HINT:  You will need to rewrite or cast the expression

    Or:

    ERROR 2992:  Date/time field value out of range: "0"
    HINT:  Perhaps you need a different "datestyle" setting

    To address this error, use the zerodate2null filter, which translates zero-value dates into a valid NULL value. This can be enabled by adding the zerodate2null filter to the applier stage when configuring the service using tpm:

    shell> ./tools/tpm update alpha --repl-svc-applier-filters=zerodate2null