Skip to main content
Common Reference

The ddlscan Command

The ddlscan command scans the existing schema for a database or table and then generates a schema or file in a target database environment. For example, ddlscan is used in MySQL to Oracle heterogeneous deployments to translate the schema definitions within MySQL to the Oracle format. For more information on heterogeneous deployments, see "Understanding Heterogeneous Deployments".

For example, to generate Oracle DDL from an existing MySQL database:

shell> ddlscan -user tungsten -url 'jdbc:mysql:thin://tr-source:13306/test' -pass password \
-template ddl-mysql-oracle.vm -db test

SQL generated on Thu Sep 11 15:39:06 BST 2019 by ./ddlscan utility of Tungsten

url = jdbc:mysql:thin://tr-source:13306/test
user = tungsten
dbName = test
*/

DROP TABLE test.sales;
CREATE TABLE test.sales
(
id NUMBER(10, 0) NOT NULL,
salesman CHAR,
planet CHAR,
value FLOAT,
PRIMARY KEY (id)
);
Usage
ddlscan [options]

Where:

OptionDescription
-confPath to a static-[svc].properties file to read JDBC connection address and credentials.
-dbDatabase to use (will substitute ${DBNAME} in the URL, if needed)
-helpGenerates the help text of arguments.
-optOption(s) to pass to template. HINT: try -opt help me
-outRender to file (print to stdout if not specified)
-passJDBC Password.
-pathAdd additional search path for loading Velocity templates.
-renameDefinitions file for renaming schemas, tables and columns.
-serviceName of a replication service instead of path to config.
-tableFileNew-line separated definitions file of tables to find.
-tablesComma-separated list of tables to find.
-templateSpecify template file to render.
-urlJDBC connection string (use single quotes to escape)
-userJDBC Username.

ddlscan supports three different methods for execution:

  1. Using an explicit JDBC URL, username and password:

    shell> ddlscan -user tungsten -url 'jdbc:mysql:thin://tr-hadoop1:13306/test' -user user \
    -pass password ...

    This is useful when a deployment has not already been installed.

  2. By specifying an explicit configuration file:

    shell> ddlscan -conf /opt/continuent/tungsten/tungsten-replicator/conf/static-alpha.properties ...
  3. When an existing deployment has been installed, by specifying one of the active services:

    shell> ddlscan -service alpha ...

In addition, the following two options must be specified on the command-line:

  • The template to be used (using the -template option) for the DDL translation must be specified on the command-line. A list of the support templates and their operation are shown below in "Supported Templates and Usage"
  • The -db parameter, which defines the database or schema that should be scanned. All tables are translated unless an explicit list, regex, or table file has been specified.

For example, to translate MySQL DDL to Oracle for all tables within the schema test using the connection to MySQL defined in the service alpha:

shell> ddlscan -service alpha -template ddl-mysql-oracle.vm -db test

ddlscan Options Reference

The details of all the options available to ddlscan are as follows:

-conf

Path to a static-[svc].properties file to read JDBC connection address and credentials.

-db

Database to use (will substitute ${DBNAME} in the URL, if needed)

-help

Generates the help text of arguments.

-opt

An additional option (and variable) which are supplied to be used within the template file. Different template files may support additional options for specifying alternative information, such as schema names, file locations and other values.
shell> ddlscan -service alpha -template ddl-mysql-oracle.vm -db test -opt schemaPrefix mysql_

-out

Sends the generated DDL output to a file, in place of sending it to standard output.

-pass

JDBC Password.

-path

The path to additional Velocity templates to be searched when specifying the template name.

-rename

A list of table renames which will be taken into account when generating target DDL. The format of the table matches the format of the rename filter.

-service

Name of a replication service instead of path to config.

-tableFile

A file containing a list of the files to be extracted. The file should be formatted as comma-separated Values (CSV), only the first column is extracted. For example, a file called tablelist.txt containing the following:
typetwo,Table of type two customer forms
typethree,Table of type three customer forms
Could be used with ddlscan as follows:
shell> ddlscan -service alpha -template ddl-mysql-oracle.vm -db test -tableFile tablelist.txt

-tables

A comma-separated list of the tables to be extracted, for example:
shell> ddlscan -service alpha -template ddl-mysql-oracle.vm -db test -tables typetwo,typethree

-template

Specify template file to render.

-url

JDBC connection string (use single quotes to escape)

-user

JDBC Username.

Supported Templates and Usage

OptionDescription
ddl-check-pkeys.vmReports which tables are without primary key definitions.
ddl-mysql-hive-0.10-staging.vmGenerates DDL from a MySQL host suitable for the staging tables in a Hadoop/Hive Environment.
ddl-mysql-hive-0.10.vmGenerates DDL from a MySQL host suitable for the base tables in a Hadoop/Hive Environment.
ddl-mysql-hive-metadata.vmGenerates metadata as JSON to be used within a Hadoop/Hive Environment.
ddl-mysql-oracle-cdc.vmGenerates Oracle tables with CDC capture information from a MySQL schema.
ddl-mysql-oracle.vmGenerates Oracle schema from a MySQL schema.
ddl-mysql-postgres-staging.vmGenerates DDL from a MySQL host suitable for the base tables in PostgreSQL.
ddl-mysql-postgres.vmGenerates DDL from a MySQL host suitable for the base tables in PostgreSQL.
ddl-mysql-redshift-staging.vmGenerates DDL from a MySQL host suitable for the staging tables in Amazon Redshift.
ddl-mysql-redshift.vmGenerates DDL from a MySQL host suitable for the base tables in Amazon Redshift.
ddl-mysql-vertica-staging.vmGenerates DDL suitable for the staging tables in HP Vertica.
ddl-mysql-vertica.vmGenerates DDL suitable for the base tables in HP Vertica.

ddl-check-pkeys.vm

The ddl-check-pkeys.vm template can be used to check whether specific tables within a schema do not have a primary key:
shell> ddlscan -template ddl-check-pkeys.vm -user tungsten -pass password -db sales -url jdbc:mysql://localhost:13306/sales
/*
SQL generated on Thu Sep 04 10:23:52 BST 2014 by ./ddlscan utility of Tungsten

url = jdbc:mysql://localhost:13306/sales
user = tungsten
dbName = sales
*/

/* ERROR: sales.dummy1 has no primary key! *//*
SQL generated on Thu Sep 04 10:23:52 BST 2014 by ./ddlscan utility of Tungsten

url = jdbc:mysql://localhost:13306/sales
user = tungsten
dbName = sales
*/

/* ERROR: sales.dummy2 has no primary key! *//*
SQL generated on Thu Sep 04 10:23:52 BST 2014 by ./ddlscan utility of Tungsten

url = jdbc:mysql://localhost:13306/sales
user = tungsten
dbName = sales
*/
For certain environments, particularly heterogeneous replication, the lack of primary keys can lead to inefficient replication, or even fail to replicate data at all.

ddl-mysql-hive-0.10-staging.vm

Staging tables within Hive define the original table columns with additional columns to track the operation type, sequence number, timestamp and unique key for each row. For example, the table sales in MySQL:
mysql> describe sales;
+----------+----------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+----------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| salesman | char(20) | YES | | NULL | |
| planet | char(20) | YES | | NULL | |
| value | float | YES | | NULL | |
+----------+----------+------+-----+---------+----------------+
4 rows in set (0.00 sec)
Generates the following Hive-compatible DDL when using this template:
shell> ddlscan -user tungsten -url 'jdbc:mysql://tr-hadoop1:13306/test' -pass password -template ddl-mysql-hive-0.10-staging.vm -db test
--
-- SQL generated on Thu Sep 11 12:31:45 BST 2014 by Tungsten ddlscan utility
--
-- url = jdbc:mysql://tr-hadoop1:13306/test
-- user = tungsten
-- dbName = test
--

DROP TABLE IF EXISTS test.stage_xxx_sales;

CREATE EXTERNAL TABLE test.stage_xxx_sales
(
tungsten_opcode STRING ,
tungsten_seqno INT ,
tungsten_row_id INT ,
tungsten_commit_timestamp TIMESTAMP ,
id INT,
salesman STRING,
planet STRING,
value DOUBLE )
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' ESCAPED BY '\\'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE LOCATION '/user/tungsten/staging/test/sales' ;
Wherever possible, the closest Hive equivalent datatype is used for each source datatype, see ddl-mysql-hive-0.10.vm for more information.

ddl-mysql-hive-0.10.vm

Generates DDL suitable for a carbon-copy form of the table from the MySQL host:
shell> ddlscan -user tungsten -url 'jdbc:mysql://tr-hadoop1:13306/test' -pass password -template ddl-mysql-hive-0.10.vm -db test
--
-- SQL generated on Thu Sep 11 12:57:11 BST 2014 by Tungsten ddlscan utility
--
-- url = jdbc:mysql://tr-hadoop1:13306/test
-- user = tungsten
-- dbName = test
--

DROP TABLE IF EXISTS test.sales;

CREATE TABLE test.sales
(
id INT,
salesman STRING,
planet STRING,
value DOUBLE )
;
Wherever possible, the closest Hive equivalent datatype is used for each source datatype, as follows:
MySQL DatatypeHive Datatype
DATETIMESTRING
TIMESTAMPTIMESTAMP
DATESTRING
YEARINT
TIMESTRING
TINYINTTINYINT
TINYINT UNSIGNEDSMALLINT
SMALLINTSMALLINT
SMALLINT UNSIGNEDINT
MEDIUMINTINT
INTINT
INT UNSIGNEDBIGINT
BIGINTBIGINT
BIGINT UNSIGNEDSTRING
DECIMALSTRING
VARCHARSTRING
CHARSTRING
BINARYBINARY
VARBINARYBINARY
TEXTSTRING
BLOBBINARY
FLOATDOUBLE
DOUBLEDOUBLE
ENUMSTRING
SETSTRING
BITSTRING
The template supports the following optional parameters to change behavior:
  • -opt schemaPrefix
    A prefix to be placed in front of all schemas. For example, if called withschemaPrefix set to mysql_:
    shell> ddlscan ... -opt schemaPrefix mysql_
    The schema name will be prefixed, translating the schema name from sales intomysql_sales.
  • -opt tablePrefix
    A prefix to be placed in front of all schemas. For example, if called withtablePrefix set to mysql_:
    shell> ddlscan ... -opt tablePrefix mysql_
    The table name will be prefixed, translating the tablename from sales intomysql_sales.

ddl-mysql-hive-metadata.vm

The Hadoop tools require information about the schema in JSON format so that the table names and primary key information can be used when materializing data from the staging tables into the base tables. This template generates that information in JSON format:
shell> ddlscan -user tungsten -url 'jdbc:mysql://tr-hadoop1:13306/test' -pass password -template ddl-mysql-hive-metadata.vm -db test

{
"tables": [
{
"schema": "test",
"name": "sales",
"keys": ["id"],
"columns": [
{"name": "id", "type": "INT"},
{"name": "salesman", "type": "STRING"},
{"name": "planet", "type": "STRING"},
{"name": "value", "type": "DOUBLE"}
]
}
]
}

ddl-mysql-oracle-cdc.vm

The ddl-mysql-oracle-cdc.vm template generates identical tables in Oracle, from their MySQL equivalent, but with additional columns for CDC capture. For example:
shell> ddlscan -user tungsten -url 'jdbc:mysql://tr-hadoop1:13306/test' -pass password -template ddl-mysql-oracle-cdc.vm -db test
/*
SQL generated on Thu Sep 11 13:17:05 BST 2014 by ./ddlscan utility of Tungsten

url = jdbc:mysql://tr-hadoop1:13306/test
user = tungsten
dbName = test
*/

DROP TABLE test.sales;
CREATE TABLE test.sales
(
id NUMBER(10, 0) NOT NULL,
salesman CHAR,
planet CHAR,
value FLOAT,
CDC_OP_TYPE VARCHAR(1), /* CDC column */
CDC_TIMESTAMP TIMESTAMP, /* CDC column */
CDC_SEQUENCE_NUMBER NUMBER PRIMARY KEY /* CDC column */
);
For information on the datatypes translated, see ddl-mysql-oracle.vm.

ddl-mysql-oracle.vm

When translating MySQL tables to Oracle compatible schema, the following datatypes are migrated to their closest Oracle equivalent:
MySQL DatatypeOracle Datatype
INTNUMBER(10, 0)
BIGINTNUMBER(19, 0)
TINYINTNUMBER(3, 0)
SMALLINTNUMBER(5, 0)
MEDIUMINTNUMBER(7, 0)
DECIMAL(x,y)NUMBER(x, y)
FLOATFLOAT
CHAR(n)CHAR(n)
VARCHAR(n)VARCHAR2(n) (n < 2000), CLOB (n > 2000)
DATEDATE
DATETIMEDATE
TIMESTAMPDATE
TEXTCLOB
BLOBBLOB
ENUM(...)VARCHAR(255)
ENUM(...)VARCHAR(4000)
BIT(1)NUMBER(1)
The following additional transformations happen automatically:
  • Table names are translated to uppercase.
  • Column names are translated to uppercase.
  • If a column name is a reserved word in Oracle, then the column name has an underscore character appended (for example, TABLE becomes TABLE_).
In addition to the above translations, errors will be raised for the following conditions:
  • If the table name starts with a number.
  • If the table name exceeds 30 characters in length.
  • If the table name is a reserved word in Oracle.
Warnings will be raised for the following conditions:
  • If the column or column name started with a number.
  • If the column name exceeds 30 characters in length, the column name will be truncated.
  • If the column name is a reserved word in Oracle.

ddl-mysql-postgres-staging.vm

The ddl-mysql-postgres-staging.vm template generates DDL for PostgreSQL staging tables. These include the full table definition, in addition to four columns used to define the staging data, including the operation code, sequence number and unique row ID. For example:
shell> ddlscan -user tungsten -url 'jdbc:mysql://tr-pgsql:13306/hr' -pass password -template ddl-mysql-postgres-staging.vm -db hr
/*
SQL generated on Thu Sep 11 14:20:14 BST 2014 by ./ddlscan utility of Tungsten

url = jdbc:mysql://tr-pgsql:13306/hr
user = tungsten
dbName = hr
*/
CREATE SCHEMA hr;

DROP TABLE hr.stage_xxx_countries;
CREATE TABLE hr.stage_xxx_countries
(
tungsten_opcode CHAR(2),
tungsten_seqno BIGINT,
tungsten_row_id BIGINT,
tungsten_commit_timestamp TIMESTAMP,
country_id CHAR(2),
country_name VARCHAR(40),
region_id INT,
PRIMARY KEY (country_id)
);

ddl-mysql-postgres.vm

The ddl-mysql-postgres.vm template generates DDL for generating tables within a PostgreSQL database from an existing MySQL database schema. For example:
shell> ddlscan -user tungsten -url 'jdbc:mysql://tr-pgsql:13306/hr' -pass password -template ddl-mysql-postgres.vm -db hr
/*
SQL generated on Thu Sep 11 14:20:14 BST 2014 by ./ddlscan utility of Tungsten

url = jdbc:mysql://tr-pgsql:13306/hr
user = tungsten
dbName = hr
*/
CREATE SCHEMA hr;

DROP TABLE hr.countries;
CREATE TABLE hr.countries
(
country_id CHAR(2),
country_name VARCHAR(40),
region_id INT,
PRIMARY KEY (country_id)
);

ddl-mysql-redshift-staging.vm

The ddl-mysql-redshift-staging.vm template generates DDL for Amazon Redshift tables from MySQL schemas. For example:
CREATE TABLE test.stage_xxx_all_mysql_types
(
tungsten_opcode CHAR(2),
tungsten_seqno INT,
tungsten_row_id INT,
tungsten_commit_timestamp TIMESTAMP,
my_id INT,
my_bit BOOLEAN /* BIT(1) */,
my_tinyint SMALLINT /* TINYINT(4) */,
my_boolean SMALLINT /* TINYINT(1) */,
my_smallint SMALLINT,
my_mediumint INT /* MEDIUMINT(9) */,
my_int INT,
my_bigint BIGINT,
my_decimal_10_5 DECIMAL(10,5),
my_float FLOAT,
my_double DOUBLE PRECISION /* DOUBLE */,
my_date DATE,
my_datetime DATETIME,
my_timestamp TIMESTAMP,
my_time VARCHAR(17) /* WARN: no pure TIME type in Redshift */,
my_year YEAR(4) /* ERROR: unrecognized (type=0, length=0) */,
my_char_10 CHAR(10),
my_varchar_10 VARCHAR(40) /* VARCHAR(10) */,
my_tinytext VARCHAR(65535) /* WARN: MySQL TINYTEXT translated to max VARCHAR */,
my_text VARCHAR(65535) /* WARN: MySQL TEXT translated to max VARCHAR */,
my_mediumtext VARCHAR(65535) /* WARN: MySQL MEDIUMTEXT translated to max VARCHAR */,
my_longtext VARCHAR(65535) /* WARN: MySQL LONGTEXT translated to max VARCHAR */,
my_enum_abc VARCHAR(1) /* ENUM('A','B','C') */,
my_set_def VARCHAR(65535) /* SET('D','E','F') */,
PRIMARY KEY (tungsten_opcode, tungsten_seqno, tungsten_row_id)
);
The actual translation of datatypes is identical to that found in ddl-mysql-redshift.vm.

ddl-mysql-redshift.vm

The ddl-mysql-redshift.vm template generates DDL for Amazon Redshift tables from MySQL schemas. For example:
CREATE TABLE test.all_mysql_types
(
my_id INT,
my_bit BOOLEAN /* BIT(1) */,
my_tinyint SMALLINT /* TINYINT(4) */,
my_boolean SMALLINT /* TINYINT(1) */,
my_smallint SMALLINT,
my_mediumint INT /* MEDIUMINT(9) */,
my_int INT,
my_bigint BIGINT,
my_decimal_10_5 DECIMAL(10,5),
my_float FLOAT,
my_double DOUBLE PRECISION /* DOUBLE */,
my_date DATE,
my_datetime DATETIME,
my_timestamp TIMESTAMP,
my_time VARCHAR(17) /* WARN: no pure TIME type in Redshift */,
my_year YEAR(4) /* ERROR: unrecognized (type=0, length=0) */,
my_char_10 CHAR(10),
my_varchar_10 VARCHAR(40) /* VARCHAR(10) */,
my_tinytext VARCHAR(65535) /* WARN: MySQL TINYTEXT translated to max VARCHAR */,
my_text VARCHAR(65535) /* WARN: MySQL TEXT translated to max VARCHAR */,
my_mediumtext VARCHAR(65535) /* WARN: MySQL MEDIUMTEXT translated to max VARCHAR */,
my_longtext VARCHAR(65535) /* WARN: MySQL LONGTEXT translated to max VARCHAR */,
my_enum_abc VARCHAR(1) /* ENUM('A','B','C') */,
my_set_def VARCHAR(65535) /* SET('D','E','F') */,
PRIMARY KEY (my_id)
);
Columns are translated as follows:
MySQL DatatypeRedshift Datatype
BIGINTBIGINT
BINARYBINARY, CHAR
BIT(1)BOOLEAN
BITCHAR
BLOBVARBINARY, VARCHAR
CHARCHAR
DATEDATE
DATETIMEDATETIME
DECIMALDECIMAL
DOUBLEDOUBLE PRECISION
ENUMVARCHAR
FLOATFLOAT
INTINT
LONGBLOBVARBINARY, CHAR
LONGTEXTVARCHAR
MEDIUMBLOBVARBINARY, CHAR
MEDIUMINTINT
MEDIUMTEXTVARCHAR
SETVARCHAR
SMALLINTSMALLINT
TEXTVARCHAR
TIMEVARCHAR
TIMESTAMPTIMESTAMP
TINYBLOBVARBINARY, CHAR
TINYINTSMALLINT
TINYTEXTVARCHAR
VARBINARYVARBINARY, CHAR
VARCHARVARCHAR
In addition to these explicit changes, the following other considerations are taken into account:
  • When translating the DDL for CHAR and VARCHAR columns, the actual column size is increased by a factor of four. This is because Redshift tables always stored data using 32-bit UTF characters and column sizes are in bytes, not characters. Therefore a CHAR(20) column is created as CHAR(80) within Redshift.
  • TEXT columns are converted to a Redshift VARCHAR of 65535 in length (the maximum allowed).
  • BLOB columns are converted to a Redshift VARBINARY of 65000 in length (the maximum allowed).
  • BIT columns with a size of 1 are converted to Redshift BOOLEAN columns, larger sizes are converted toCHAR columns of 64 bytes in length.
  • TIME columns are converted to a Redshift VARCHAR of 17 bytes in length since no explicitTIME type exists.

ddl-mysql-vertica-staging.vm

The ddl-mysql-vertica-staging.vm template generates DDL for HP Vertica staging tables. These include the full table definition, in addition to four columns used to define the staging data, including the operation code, sequence number and unique row ID. For example:
shell> ddlscan -user tungsten -url 'jdbc:mysql://tr-vert:13306/hr' -pass password -template ddl-mysql-vertica-staging.vm -db hr
/*
SQL generated on Wed Mar 25 14:35:40 UTC 2026 by ./ddlscan utility of Tungsten

url = jdbc:mysql://tr-vert:13306/hr
user = tungsten
dbName = hr
*/
CREATE SCHEMA "hr";

DROP TABLE "hr"."stage_xxx_countries";

CREATE TABLE "hr"."stage_xxx_countries"
(
"tungsten_opcode" CHAR(1) ,
"tungsten_seqno" INT ,
"tungsten_row_id" INT ,
"tungsten_commit_timestamp" TIMESTAMP ,
"country_id" CHAR(2) ,
"country_name" VARCHAR(40) ,
"region_id" INT ) ORDER BY "tungsten_seqno", "tungsten_row_id";

ddl-mysql-vertica.vm

The ddl-mysql-vertica.vm template generates DDL for generating tables within an HP Vertica database from an existing MySQL database schema. For example:
shell> ddlscan -user tungsten -url 'jdbc:mysql://tr-vert1:13306/hr' -pass password -template ddl-mysql-vertica.vm -db hr
/*
SQL generated on Wed Mar 25 14:39:49 UTC 2026 by ./ddlscan utility of Tungsten

url = jdbc:mysql://tr-vert1:13306/test
user = tungsten
dbName = hr
*/
CREATE SCHEMA "hr";

DROP TABLE "hr"."countries";

CREATE TABLE "hr"."countries"
(
"country_id" CHAR(2) ,
"country_name" VARCHAR(40) ,
"region_id" INT )
ORDER BY "country_id"
;
Because Vertica does not explicitly support primary keys, a default projection for the key order is created based on the primary key of the source table.
The templates translates different datatypes as follows:
MySQL DatatypeVertica Datatype
DATETIMEDATETIME
TIMESTAMPTIMESTAMP
DATEDATE
TIMETIME
TINYINTTINYINT
SMALLINTSMALLINT
MEDIUMINTINT
INTINT
BIGINTINT
VARCHARVARCHAR
CHARCHAR
BINARYBINARY
VARBINARYVARBINARY
TEXT, TINYTEXT, MEDIUMTEXT, LONGTEXTVARCHAR(65000)
BLOB, TINYBLOB, MEDIUMBLOB, LONGBLOBVARBINARY(65000)
FLOATFLOAT
DOUBLEDOUBLE PRECISION
ENUMVARCHAR
SETVARCHAR(4000)
BIT(1)BOOLEAN
BITCHAR(64)
In addition, the following considerations should be taken into account:
  • DECIMAL MySQL type is not supported.
  • TEXT types in MySQL are converted to a VARCHAR in Vertica of the maximum supported size.
  • BLOB types in MySQL are converted to a VARBINARY in Vertica of the maximum supported size.
  • SET types in MySQL are converted to a VARCHAR in Vertica of 4000 characters, designed to work in tandem with the settostring filter.
  • ENUM types in MySQL are converted to a VARCHAR in Vertica of the size of the longestENUM value, designed to work in tandem with the enumtostring filter.