9.19.5. thl purge Command

The thl purge command deletes events from the THL files, selected by sequence number.

thl purge
[-low # | -from # ] [-high # | -to # ]
[-y ] [-no-checksum ]

Warning

Purging all data requires that the THL information either be recreated from the source table, or reloaded from the Primary replicator.

The purge command deletes the THL data according to the following rules:

  • Without any specification, a purge command will delete all of the stored THL information.

  • When only -high (or -to) is specified, delete all the THL data up to and including the specified sequence number.

  • When only -low (-from) is specified, delete all the THL data from and including the specified sequence number.

  • With a range specification, using one or both of the -low and -high options (or the -from and -to options), the range of sequences will be purged. The rules are the same as for the list command, enabling purge from the start to a sequence, from a sequence to the end, or all the sequences within a given range. The ranges must be on the boundary of one or more log files. It is not possible to delete THL data from the middle of a given file.

For example, consider the following list of THL files provided by thl index:

shell> thl index
LogIndexEntry thl.data.0000000377(5802:5821)
LogIndexEntry thl.data.0000000378(5822:5841)
LogIndexEntry thl.data.0000000379(5842:5861)
LogIndexEntry thl.data.0000000380(5862:5881)
LogIndexEntry thl.data.0000000381(5882:5901)
LogIndexEntry thl.data.0000000382(5902:5921)
LogIndexEntry thl.data.0000000383(5922:5941)
LogIndexEntry thl.data.0000000384(5942:5961)
LogIndexEntry thl.data.0000000385(5962:5981)
LogIndexEntry thl.data.0000000386(5982:6001)
LogIndexEntry thl.data.0000000387(6002:6021)
LogIndexEntry thl.data.0000000388(6022:6041)
LogIndexEntry thl.data.0000000389(6042:6061)
LogIndexEntry thl.data.0000000390(6062:6081)
LogIndexEntry thl.data.0000000391(6082:6101)
LogIndexEntry thl.data.0000000392(6102:6121)
LogIndexEntry thl.data.0000000393(6122:6141)
LogIndexEntry thl.data.0000000394(6142:6161)
LogIndexEntry thl.data.0000000395(6162:6181)
LogIndexEntry thl.data.0000000396(6182:6201)
LogIndexEntry thl.data.0000000397(6202:6221)
LogIndexEntry thl.data.0000000398(6222:6241)
LogIndexEntry thl.data.0000000399(6242:6261)
LogIndexEntry thl.data.0000000400(6262:6266)

The above shows a range of THL sequences from 5802 to 6266.

To delete all of the THL from the start of the list, sequence number 5802, to 6021 (inclusive), use the -high option (or the -to option) to specify the highest sequence number to be removed (6021):

shell> thl purge -to 6021
WARNING: The purge command will break replication if you delete all
events or delete events that have not reached all slaves.
Are you sure you wish to delete these events [y/N]?
y
Deleting events where SEQ# <=6021
2025-02-20 16:31:36,235 [ - main] INFO  thl.THLManagerCtrl Transactions deleted

Running thl index again shows that sequence numbers 6022 to 6266 are still available:

shell> thl index
LogIndexEntry thl.data.0000000388(6022:6041)
LogIndexEntry thl.data.0000000389(6042:6061)
LogIndexEntry thl.data.0000000390(6062:6081)
LogIndexEntry thl.data.0000000391(6082:6101)
LogIndexEntry thl.data.0000000392(6102:6121)
LogIndexEntry thl.data.0000000393(6122:6141)
LogIndexEntry thl.data.0000000394(6142:6161)
LogIndexEntry thl.data.0000000395(6162:6181)
LogIndexEntry thl.data.0000000396(6182:6201)
LogIndexEntry thl.data.0000000397(6202:6221)
LogIndexEntry thl.data.0000000398(6222:6241)
LogIndexEntry thl.data.0000000399(6242:6261)
LogIndexEntry thl.data.0000000400(6262:6266)

To delete the last two THL files, pass the sequence number at the start of the first of those files, 6242, to the -low option (or the -from option):

shell> thl purge -from 6242 -y
WARNING: The purge command will break replication if you delete all
events or delete events that have not reached all slaves.
Deleting events where SEQ# >= 6242
2025-02-20 16:40:42,463 [ - main] INFO  thl.THLManagerCtrl Transactions deleted

Running thl index confirms that the sequences have been removed:

shell> thl index
LogIndexEntry thl.data.0000000388(6022:6041)
LogIndexEntry thl.data.0000000389(6042:6061)
LogIndexEntry thl.data.0000000390(6062:6081)
LogIndexEntry thl.data.0000000391(6082:6101)
LogIndexEntry thl.data.0000000392(6102:6121)
LogIndexEntry thl.data.0000000393(6122:6141)
LogIndexEntry thl.data.0000000394(6142:6161)
LogIndexEntry thl.data.0000000395(6162:6181)
LogIndexEntry thl.data.0000000396(6182:6201)
LogIndexEntry thl.data.0000000397(6202:6221)
LogIndexEntry thl.data.0000000398(6222:6241)

The confirmation prompt can be bypassed with the -y option, which proceeds with the operation without asking for confirmation.

The optional -no-checksum argument ignores the checksum information on events, which is useful if a checksum is corrupt.

When purging, the THL files must be writable, so the replicator must be either offline or stopped while the purge operation is performed.

A purge operation may fail for the following reasons:

  • Fatal error: The disk log is not writable and cannot be purged.

    The replicator is currently running and not in the OFFLINE state. Use trepctl offline to release the write lock on the THL files.

  • Fatal error: Deletion range invalid; must include one or both log end points: low seqno=0 high seqno=1000

    An invalid sequence number or range was provided. The purge operation will refuse to purge events that do not exist in the THL files or that do not match a valid file boundary, i.e. the low figure must match the start of one file and the high figure the end of a file. Use thl index to determine the valid ranges.
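
The boundary rule described above can be checked before purge is run. The following is a minimal sketch, not part of thl: check_purge_range is a hypothetical helper that assumes thl index output in the format shown earlier in this section, and it only mirrors the rule as stated here (low matches the start of a file, high matches the end of a file). The thl command itself remains the authority on what it accepts.

```shell
#!/usr/bin/env bash
# Sketch only: check_purge_range is a hypothetical helper, not part of thl.
# It reads `thl index` output on stdin, in the format shown in this section:
#   LogIndexEntry thl.data.0000000377(5802:5821)
# Arguments: LOW HIGH. Pass "-" to leave one side of the range open.
# Returns 0 if LOW is the start of a file and HIGH is the end of a file.
check_purge_range() {
  local low="$1" high="$2"
  awk -v low="$low" -v high="$high" '
    /^LogIndexEntry / {
      # Keep only the "start:end" text between the parentheses.
      s = $0
      sub(/^.*\(/, "", s)
      sub(/\).*$/, "", s)
      split(s, r, ":")
      if (r[1] == low)  low_ok = 1
      if (r[2] == high) high_ok = 1
    }
    END {
      if (low  == "-") low_ok = 1
      if (high == "-") high_ok = 1
      exit !(low_ok && high_ok)
    }'
}

# Possible use, with the replicator offline:
#   thl index | check_purge_range - 6021 && thl purge -to 6021
```

With the index listed earlier, 6021 is the end of thl.data.0000000387 and 6242 is the start of thl.data.0000000399, so both examples in this section pass the check, while a value such as 6030 does not.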

9.19.5.1. Purging THL Log Information on a Replica

Warning

Purging the THL on a Replica node can potentially remove information that has not yet been applied to the database. Please check and ensure that the THL data that you are purging has been applied to the database before continuing.

The THL files can be explicitly purged to recover disk space, but you should ensure that the sequence number currently applied to the database is not purged, and that additional hosts are not reading the THL information.

To purge the logs on a Replica node:

  1. Determine the highest sequence number from the THL that you want to delete. To purge the logs up until the latest sequence number, you can use trepctl to determine the highest applied sequence number:

    shell> trepctl services
    Processing services command...
    NAME              VALUE
    ----              -----
    appliedLastSeqno: 3672
    appliedLatency  : 331.0
    role            : slave
    serviceName     : alpha
    serviceType     : local
    started         : true
    state           : ONLINE
    Finished services command...
  2. Shun the Replica datasource and put the replicator into the offline state using cctrl:

    shell> cctrl
    [LOGICAL] /alpha > datasource host1 shun
    [LOGICAL] /alpha > replicator host1 offline

    Important

    NEVER Shun the Primary datasource!

  3. Use the thl command to purge the logs up to the specified transaction sequence number. You will be prompted to confirm the operation:

    shell> thl purge -high 3670
    WARNING: The purge command will break replication if you delete all events or »
       delete events that have not reached all slaves.
    Are you sure you wish to delete these events [y/N]?
    y
    Deleting events where SEQ# <=3670
    2024-04-16 14:09:42,384 [ - main] INFO  thl.THLManagerCtrl Transactions deleted
  4. Recover the host back into the cluster:

    shell> cctrl
    [LOGICAL] /alpha > datasource host1 recover

You can now check the current THL file information:

shell> thl index
LogIndexEntry thl.data.0000000024(3240:3672)
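
The choice of sequence number in step 1 can be scripted. The sketch below is illustrative only: applied_seqno and safe_high_boundary are hypothetical helpers, not Tungsten commands, and they assume the trepctl services and thl index output formats shown in this section. The idea is to pick the highest file end point that is strictly below the applied sequence number, so that the currently applied event is never purged.

```shell
#!/usr/bin/env bash
# Sketch only: both functions are hypothetical helpers, not part of Tungsten.

# Print the first appliedLastSeqno value found in `trepctl services`
# output read from stdin.
applied_seqno() {
  awk -F':' '
    /^[ \t]*appliedLastSeqno/ {
      v = $2
      gsub(/[ \t]/, "", v)   # strip padding around the value
      print v
      exit
    }'
}

# Print the highest file end point in `thl index` output (stdin) that is
# strictly below the sequence number given as $1. Prints nothing if no
# file ends below that number.
safe_high_boundary() {
  awk -v limit="$1" '
    BEGIN { best = -1 }
    /^LogIndexEntry / {
      s = $0
      sub(/^.*:/, "", s)     # drop everything up to the colon
      sub(/\).*$/, "", s)    # drop the closing parenthesis
      last = s + 0
      if (last < limit + 0 && last > best) best = last
    }
    END { if (best >= 0) print best }'
}

# Possible use on the Replica:
#   applied=$(trepctl services | applied_seqno)
#   thl index | safe_high_boundary "$applied"
```

If the THL consists of a single file that still contains the applied sequence number, such as (3240:3672) with appliedLastSeqno 3672, the helper prints nothing because no file ends below the applied event.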

9.19.5.2. Purging THL Log Information on a Primary

Warning

Purging the THL on a Primary node can potentially remove information that has not yet been applied to the Replica databases. Please check and ensure that the THL data that you are purging has been applied to the database on all Replicas before continuing.

Important

If the situation allows, it may be better to switch the Primary role to a current, up-to-date Replica, and then purge the THL on the old Primary host by following the Replica steps in Section 9.19.5.1, “Purging THL Log Information on a Replica”.

Warning

Follow the steps below with great caution. Failure to follow best practices will leave Replicas unable to apply transactions, forcing a full re-provisioning. For the re-provisioning steps, please see Section 6.6.1.2, “Provision or Reprovision a Replica”.

The THL files can be explicitly purged to recover disk space, but you should ensure that the sequence number currently applied to the database is not purged, and that additional hosts are not reading the THL information.

To purge the logs on a Primary node:

  1. Determine the highest sequence number from the THL that you want to delete. To purge the logs up until the latest sequence number, you can use trepctl to determine the highest applied sequence number:

    shell> trepctl services
    Processing services command...
    NAME              VALUE
    ----              -----
    appliedLastSeqno: 3675
    appliedLatency  : 0.835
    role            : master
    serviceName     : alpha
    serviceType     : local
    started         : true
    state           : ONLINE
    Finished services command...
  2. Set the cluster to MAINTENANCE mode and put the replicator into the offline state using cctrl:

    shell> cctrl
    [LOGICAL] /alpha > set policy maintenance
    [LOGICAL] /alpha > replicator host1 offline
  3. Use the thl command to purge the logs up to the specified transaction sequence number. You will be prompted to confirm the operation:

    shell> thl purge -high 3670
    WARNING: The purge command will break replication if you delete all events or »
       delete events that have not reached all slaves.
    Are you sure you wish to delete these events [y/N]?
    y
    Deleting events where SEQ# <=3670
    2013-04-16 14:09:42,384 [ - main] INFO  thl.THLManagerCtrl Transactions deleted
  4. Set the cluster to AUTOMATIC mode and put the replicator online using cctrl:

    shell> cctrl
    [LOGICAL] /alpha > replicator host1 online
    [LOGICAL] /alpha > set policy automatic

You can now check the current THL file information:

shell> thl index
LogIndexEntry thl.data.0000000024(3240:3672)
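
Because a purge on the Primary must not remove events that any Replica still needs, one way to pick a safe upper bound is to take the lowest appliedLastSeqno reported across all Replicas. The sketch below is an assumption-laden illustration: min_applied_seqno is a hypothetical helper, and it expects the trepctl services output of each Replica, in the format shown above, to be concatenated on stdin. How that output is collected from each host is left to the operator.

```shell
#!/usr/bin/env bash
# Sketch only: min_applied_seqno is a hypothetical helper, not part of Tungsten.
# Reads the concatenated `trepctl services` output of several hosts from
# stdin and prints the lowest appliedLastSeqno found. Prints nothing if
# no appliedLastSeqno line is present.
min_applied_seqno() {
  awk -F':' '
    /^[ \t]*appliedLastSeqno/ {
      v = $2
      gsub(/[ \t]/, "", v)   # strip padding around the value
      v += 0                 # force numeric comparison
      if (!seen || v < min) { min = v; seen = 1 }
    }
    END { if (seen) print min }'
}

# Possible use, where replica-services.txt is a hypothetical file holding
# the output gathered from every Replica:
#   min_applied_seqno < replica-services.txt
```

Any -high value passed to thl purge on the Primary should then stay below the number printed, and should still be confirmed against thl index.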