5.6.7. Supported CSV Formats

Tungsten Replicator supports a number of CSV formats, each intended for a specific heterogeneous target. Use the format that matches the target when using the batch loading process, and when generating CSV files for testing or loading in general.

A number of standard types are included, and the type used when generating CSV is selected by the replicator.datasource.global.csvType property. The type that corresponds to the configured target is selected automatically. For example, if you configure a Vertica deployment, the replicator defaults to the Vertica-style CSV format.

Warning

Using the wrong CSV format with a given target may break replication. You should always use the appropriate CSV format for the defined target.

Table 5.1. Supported CSV Formats

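Format   | Field Separator | Record Separator | Escape Sequence | Escaped Characters | Null Policy    | Null Value | Show Headers | Use Quotes | Quote String | Suppressed Characters
---------+-----------------+------------------+-----------------+--------------------+----------------+------------+--------------+------------+--------------+----------------------
hive     | \u0001          | \n               | \\              | \u0001\\           | Use Null Value | \\N        | false        | false      |              | \n\r
mysql    | ,               | \n               | \\              | \\                 | Use Null Value | \\N        | false        | true       | \"           |
oracle   | ,               | \n               | \\              | \\                 | Use Null Value | \\N        | false        | true       | \"           |
vertica  | ,               | \n               | \\              | \\                 | Skip Value     |            | false        | true       | \"           | \n
redshift | ,               | \n               | \"              |                    | Skip Value     |            | false        | true       | \"           | \n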
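
To make the columns concrete, the following Python sketch applies three of the rows above to a sample record. It is an illustrative model only, not the replicator's implementation: CsvFormat, format_field and format_record are hypothetical names. It assumes that the table uses Java-style escapes (so \\ is a single backslash and \" is a single double quote), that each escaped character is prefixed with the escape sequence, that suppressed characters are dropped from the value, and that NULL markers are written without quotes.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class CsvFormat:
    """Hypothetical container for one row of the format table."""
    field_separator: str
    record_separator: str
    escape: str
    escaped_chars: str
    null_value: Optional[str]  # None models the "Skip Value" null policy
    use_quotes: bool
    quote: str
    suppressed_chars: str


# Values copied from the table, read as Java-style escapes (an assumption).
MYSQL = CsvFormat(",", "\n", "\\", "\\", "\\N", True, '"', "")
HIVE = CsvFormat("\u0001", "\n", "\\", "\u0001\\", "\\N", False, "", "\n\r")
VERTICA = CsvFormat(",", "\n", "\\", "\\", None, True, '"', "\n")


def format_field(value: Optional[str], fmt: CsvFormat) -> str:
    """Render one field under the simplified rules described above."""
    if value is None:
        # Use Null Value: emit the marker. Skip Value: emit nothing.
        return fmt.null_value if fmt.null_value is not None else ""
    # Drop suppressed characters, then prefix each escaped character.
    kept = (c for c in value if c not in fmt.suppressed_chars)
    body = "".join(fmt.escape + c if c in fmt.escaped_chars else c for c in kept)
    return fmt.quote + body + fmt.quote if fmt.use_quotes else body


def format_record(values: Sequence[Optional[str]], fmt: CsvFormat) -> str:
    """Join the rendered fields and terminate the record."""
    fields = (format_field(v, fmt) for v in values)
    return fmt.field_separator.join(fields) + fmt.record_separator


if __name__ == "__main__":
    row = ["a\\b", None, "line1\nline2"]
    for name, fmt in (("mysql", MYSQL), ("hive", HIVE), ("vertica", VERTICA)):
        # repr() makes separators and control characters visible.
        print(name, repr(format_record(row, fmt)))
```

Under these assumptions the same row produces visibly different output per format, which is why loading a file written in one format into a target that expects another can fail or silently corrupt values.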

In addition to the standardised types, the replicator.datasource.global.csvType property can be set to custom, in which case the following configurable values are used instead:

  • replicator.datasource.global.csv.fieldSeparator — the character used to separate fields, such as , (comma).

  • replicator.datasource.global.csv.RecordSeparator — the character used to separate records, such as the newline character.

  • replicator.datasource.global.csv.nullValue — the value to use for NULL (empty) values.

  • replicator.datasource.global.csv.useQuotes — whether to use quotes to encapsulate field values (specified using true or false).

  • replicator.datasource.global.csv.useHeaders — whether to include the column headers in the generated CSV (specified using true or false).
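
As a sketch of how these settings fit together, the Python helper below renders a custom format as key=value lines. The property names are the ones listed above; the function name custom_csv_properties, the sample values, and the key=value layout are assumptions for illustration. In particular, the sample values reuse the escape notation from the format table, and whether the replicator's properties file expects exactly that notation should be checked against a working configuration.

```python
PREFIX = "replicator.datasource.global."


def custom_csv_properties(field_separator, record_separator, null_value,
                          use_quotes, use_headers):
    """Return property lines for a custom CSV format (hypothetical helper).

    String values are passed through verbatim; booleans are rendered as
    the lowercase true/false used in the property descriptions.
    """
    settings = {
        "csvType": "custom",  # switches the replicator to the csv.* values
        "csv.fieldSeparator": field_separator,
        "csv.RecordSeparator": record_separator,
        "csv.nullValue": null_value,
        "csv.useQuotes": str(use_quotes).lower(),
        "csv.useHeaders": str(use_headers).lower(),
    }
    return [f"{PREFIX}{key}={value}" for key, value in settings.items()]


if __name__ == "__main__":
    # Illustrative values only: pipe-separated, newline-terminated records.
    for line in custom_csv_properties("|", r"\n", r"\\N", True, False):
        print(line)
```

Setting csvType to custom is included in the output because the csv.* values are only used in that case.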