F.2. Character Sets in Database and Tungsten Cluster

The character sets used by the databases must match those configured for Java and for the Tungsten Replicator wrappers. If they do not match, the information cannot be extracted and viewed correctly.

For example, if you are extracting with the UTF-8 character set, the data must be applied to the target database using the same character set. In addition, Tungsten Replicator should be configured with a matching character set. For installations where replication is between identical database flavours (for example, MySQL to MySQL), no explicit setting should be made. For heterogeneous deployments, the character set should be set explicitly.

When installing and using Tungsten Replicator, be aware of the following points relating to character sets:

  • When installing Tungsten Replicator, use the --java-file-encoding option to tpm to configure the character set.

  • When using the thl command, the character set may need to be explicitly stated to view the content correctly:

    shell> thl list -charset utf8
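As an illustration of the first point, the character set might be set at configuration time as sketched below. The service name alpha is a placeholder, the value UTF8 is an assumption that should be checked against the tpm reference for your release, and the other options your deployment requires are not shown.

```shell
# Sketch only: "alpha" is a hypothetical service name, and UTF8 is an
# assumed value for the option. Run from the staging directory of the
# Tungsten Replicator installation.
./tools/tpm configure alpha --java-file-encoding=UTF8
```

The value chosen here should correspond to the character set used by the source and target databases, and to the one passed to thl when viewing events.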

For more information on setting character sets within your database, see your database provider's documentation for the release you are using.