Skip to main content
Tungsten Clustering

Making Schema Changes

Similar to the maintenance procedure, schema changes to an underlying dataserver may need to be performed on dataservers that are not part of an active dataservice. Although many schema changes, such as the addition, removal or modification of an existing table definition will be correctly replicated to Replicas, other operations, such as creating new indexes, adding/removing columns or migrating table data between table definitions, is best performed individually on each dataserver while it has been temporarily taken out of the dataservice.

As with all maintenance operations it is advisable to have fully tested your DDL in a staging environment. In some cases, the impact of DDL change is minimal and therefore can safely be applied to the Primary node and allowing the change to be replicated down to the Replicas.

In situations where the overhead of the DDL change would cause an outage to your application through table locking, use the rolling maintenance procedure below which is specific for DDL changes.

The basic process comprises of a number of steps, these are as follows:

  1. If the DDL adds or removes columns, then enable the colnames and dropcolumn filters
  2. If the DDL adds or removes tables, and you do not want to simply apply to the Primary and allow replication to handle it, then enable the replicate filters
  3. Perform schema changes following the process summarised in the table below
  4. Optionally, remove the filters enabled in the first step
note

The first two steps listed above are optional for Multi-Site/Active-Active topologies, the process will still work without this step however this may cause cross-site replication to go into an error state until the DDL changes have been applied to all hosts.

Therefore, within Multi-Site/Active-Active environments, instead of using the colnames and dropcolumn filters mentioned above, you may choose to temporarily stop the cross site replicators. Doing so will reduce the risk of errors in the replication flow whilst the maintenance is in progress, however you will introduce latency between clusters for the duration of the time the replicators are offline.

For Composite Active/Active topologies, it is not possible to isolate the cross-site replictions in the same way, therefore these steps are strongly advised as the resulting error state of the cross-site replicator may prevent a successful switch between Primary and Replica nodes.

Enable filters for column changes

The use of the colnames and dropcolumn filters allow you to make changes to the structure of tables, without impacting the flow of replication.

important

During these schema changes, and whilst the filters are in place, applications MUST be forwards and backwards compatible, but MUST NOT alter data within any columns that are filtered out from replication until after the process has been completed on all hosts, and the filters disabled. Data changes to filtered columns will cause data drift and inconsistencies, resulting in potentially unexpected behaviour

  • To enable the filters, first create a file called schemachange.json in a directory accessible by the OS user that the software is running as.

    Typically, this will be the tungsten user and the default location for this file will be /opt/continuent/share

    The file needs to contain a JSON block outlining ALL the columns being added and removed from all tables affected by the changes.

    In the example below, we are removing the column, operator_code and adding operator_desc to the system_operators table and adding the column action_date to the system_actions table:

    [
    {
    "schema": "ops",
    "table": "system_operators",
    "columns": ["operator_code","operator_desc"]
    },
    {
    "schema": "ops",
    "table": "system_actions",
    "columns": ["action_date"]
    }
    ]
  • Place your cluster into MAINTENANCE mode

    shell> cctrl
    cctrl> set policy maintenance
    Note

    If running a Composite Active/Passive or Composite Active/Active topology, issue the command at the top global level to place all clusters in MAINTENANCE, or execute individually within each cluster

  • Next, enable the filters within your configuration by adding the following two parameters to the tungsten.ini (if running in INI method) to the [defaults] section on EVERY cluster node:

    svc-extractor-filters=colnames,dropcolumn
    property=replicator.filter.dropcolumn.definitionsFile=/opt/continuent/share/schemachange.json

    Followed by tpm update to apply the changes:

  • Or, if running as a staging install:

    shell> cd staging_dir
    shell> tools/tpm update alpha \
    --svc-extractor-filters=colnames,dropcolumn \
    --property=replicator.filter.dropcolumn.definitionsFile=/opt/continuent/share/schemachange.json
  • Monitor replication to ensure there are no errors before continuing

Enable filters for adding/removing tables

The optional use of the replicate filter allows you to add and remove tables without impacting the flow of replication.

important

During these schema changes, and whilst the filter is in place, applications MUST be forwards and backwards compatible, but MUST NOT modify data in any new tables until after the process has been completed on all hosts, and the filters disabled. Data changes to filtered tables will cause data drift and inconsistencies, resulting in potentially unexpected behaviour.

  • Place your cluster into MAINTENANCE mode.

    shell> cctrl
    cctrl> set policy maintenance
    Note

    If running a Composite Active/Passive or Composite Active/Active topology, issue the command at the top global level to place all clusters in MAINTENANCE, or individually within each cluster.

  • Next, enable the filters within your configuration by adding the following two parameters to the tungsten.ini (if running in INI method) to the [defaults] section on EVERY cluster node.

    In this example we plan to ADD the table system_actions and REMOVE the table system_operations, both within the ops schema:

    svc-extractor-filters=replicate
    property=replicator.filter.replicate.ignore=ops.system_actions,ops.system_operations

    Followed by tpm update to apply the changes

  • Or, if running as a staging install:

    shell> cd staging_dir
    shell> tools/tpm update alpha \
    --svc-extractor-filters=replicate \
    --property=replicator.filter.replicate.ignore=ops.system_actions,ops.system_operations
  • Monitor replication to ensure there are no errors before continuing.

Apply DDL Changes

  • Follow the steps outlined in the table below to make the DDL changes to all nodes, in all clusters.

  • If filtering columns, once all the changes have been complete, edit the schemachange.json to contain an empty document:

    shell> echo "[]" > /opt/continuent/share/schemachange.json

    Then, restart the replicators:

    shell> replicator restart
  • If filtering tables, repeat the process of adding the replicate filter removing any tables from the ignore parameter that you have ADDED to your database.

  • You can optionally fully remove the filters if you wish by removing the entries from the configuration and re-running tpm update however it is also perfectly fine to leave them in place. There is a potentially small CPU overhead in very busy clusters by having the filters in place, but otherwise should not have any impact.

    It is advisable to monitor the system usage and make the decision based on your own business needs.

    Note

    Leaving the colnames filter in place will increase the size of THL on disk, therefore if disk space is limited in your environment, it would be advisable to remove these filters.

The following method assumes a schema update on the entire dataservice by modifying the schema on the Replicas first. The schema shows three datasources being updated in sequence, Replicas first, then the Primary.

StepDescriptionCommandhost1host2host3
1Initial statePrimaryReplicaReplica
2Set the Replica host2 offlinetrepctl -host host2 offlinePrimaryOfflineReplica
3Connect to dataserver for host2 and update schema

set sql_log_bin=0; then run ddl statements

PrimaryOfflineReplica
4Set the Replica onlinetrepctl -host host2 onlinePrimaryReplicaReplica
5Ensure the Replica (host2) has caught uptrepctl -host host2 statusPrimaryReplicaReplica
6Set the Replica host3 offlinetrepctl -host host3 offlinePrimaryReplicaOffline
7Connect to dataserver for host3 and update schema

set sql_log_bin=0; then run ddl statements

PrimaryReplicaOffline
8Set the Replica (host3) onlinetrepctl -host host3 onlinePrimaryReplicaReplica
9Ensure the Replica (host3) has caught uptrepctl -host host3 statusPrimaryReplicaReplica
10Switch Primary to host2switch to host2ReplicaPrimaryReplica
11Set the Replica host1 offlinetrepctl -host host1 offlineOfflinePrimaryReplica
12Connect to dataserver for host1 and update schema

set sql_log_bin=0; then run ddl statements

OfflinePrimaryReplica
13Set the Replica host1 onlinetrepctl -host host1 onlineReplicaPrimaryReplica
14Ensure the Replica (host1) has caught uptrepctl -host host1 statusPrimaryReplicaReplica
15Switch Primary back to host1 (Optional)switch to host1PrimaryReplicaReplica
note

With any schema change to a database, the database performance should be monitored to ensure that the change is not affecting the overall dataservice performance.