5.5.8. Disk vs. Memory Parallel Queues

Channels receive transactions through a special type of queue, known as a parallel queue. Tungsten offers two implementations of parallel queues. They differ in performance and in the requirements they place on hosts that run parallel apply. Select the queue type with the --svc-parallelization-type option.
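
As a minimal sketch of how the option might be passed, the helper below only builds and prints the tpm command line for a given queue type; it does not run tpm. The function name queue_type_cmd is hypothetical, the service name alpha is taken from the examples on this page, and the values disk and memory are the option names given in the headings below.

```shell
# Hypothetical helper: print (do not run) the tpm command that selects a queue type.
# Usage: queue_type_cmd <disk|memory> <service-name>
queue_type_cmd() {
  case "$1" in
    disk|memory)
      printf './tools/tpm configure %s --svc-parallelization-type=%s\n' "$2" "$1"
      ;;
    *)
      # Only the two queue types named in this section are accepted.
      echo "unknown queue type: $1" >&2
      return 1
      ;;
  esac
}
```

For example, queue_type_cmd disk alpha prints the configure command for the alpha service. Review the printed command before running it, and take the replicator offline cleanly first, as the warning below requires.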

Warning

Do not change the parallel queue type without setting the replicator offline cleanly. See the procedure later in this page for more information.

Disk Parallel Queue (disk option)

A disk parallel queue uses a set of independent threads to read from the Transaction History Log and feed short in-memory queues used by channels. Disk queues have the advantage that they minimize memory required by Java. They also allow channels to operate some distance apart, which improves throughput. For instance, one channel may apply a transaction that committed 2 minutes before the transaction another channel is applying. This separation keeps a single slow transaction from blocking all channels.

Disk queues minimize memory consumption of the Java VM, but to function efficiently they require pages from the Operating System page cache. This is because each channel reads from the Transaction History Log independently. As long as the channels are close together, the storage pages tend to be present in the Operating System page cache for all threads but the first, resulting in very fast reads. If channels become widely separated, for example due to a high maxOfflineInterval value, or if the host has insufficient free memory, disk queues may operate slowly or impact other processes that require memory.
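
One way to see how much memory is free and how much the page cache currently holds is sketched below. It assumes a Linux host, where /proc/meminfo reports MemFree and Cached in kB; the function name page_cache_summary is hypothetical and the sketch is not part of Tungsten.

```shell
# Hypothetical helper: report free memory and page cache size in MB.
# Reads a meminfo-style file; defaults to /proc/meminfo (Linux only).
page_cache_summary() {
  awk '
    /^MemFree:/ { free_kb = $2 }     # memory not used for anything
    /^Cached:/  { cached_kb = $2 }   # page cache (does not match SwapCached)
    END { printf "free_mb=%d cached_mb=%d\n", free_kb / 1024, cached_kb / 1024 }
  ' "${1:-/proc/meminfo}"
}
```

Running page_cache_summary on the host that performs parallel apply gives a quick indication of whether there is room for Transaction History Log pages to stay cached. This page does not state a threshold, so interpret the numbers against your own workload.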

Memory Parallel Queue (memory option)

A memory parallel queue uses a set of in-memory queues to hold transactions. One stage reads from the Transaction History Log and distributes transactions across the queues. Each channel reads from one of the queues. In-memory queues have the advantage that they do not need extra threads to operate, and hence reduce the amount of CPU processing required by the replicator.

When you use in-memory queues, you must set the maxSize property on the queue to a relatively large value. This value sets the total number of transaction fragments that may be in the parallel queue at any given time. If the queue reaches this limit, it does not accept further transaction fragments until existing fragments are processed. For best performance a value of 10,000 or greater is often necessary.

The following example shows how to set the maxSize property after installation. This value can be changed at any time and does not require the replicator to go offline cleanly:

Staging method:

shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-replicator-6.1.25-6

shell> echo The staging USER is `tpm query staging| cut -d: -f1 | cut -d@ -f1`
The staging USER is tungsten

shell> echo The staging HOST is `tpm query staging| cut -d: -f1 | cut -d@ -f2`
The staging HOST is db1

shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-replicator-6.1.25-6

shell> ssh {STAGING_USER}@{STAGING_HOST}
shell> cd {STAGING_DIRECTORY}

shell> ./tools/tpm configure alpha \
    --property=replicator.store.parallel-queue.maxSize=10000

shell> ./tools/tpm update
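
The staging user, host, and directory above are extracted with three separate tpm calls piped through cut. As an alternative sketch, the output can be captured once and split with POSIX parameter expansion. The function name parse_staging is hypothetical, and the sketch assumes tpm query staging prints a single user@host:/path line, as in the sample output above.

```shell
# Hypothetical helper: split "user@host:/path" into its three parts.
# Sets STAGING_USER, STAGING_HOST and STAGING_DIRECTORY in the current shell.
parse_staging() {
  staging=$1
  STAGING_USER=${staging%%@*}      # text before the first "@"
  rest=${staging#*@}               # text after the first "@"
  STAGING_HOST=${rest%%:*}         # text before the first ":"
  STAGING_DIRECTORY=${rest#*:}     # text after the first ":"
}

# Intended use on a host where tpm is installed (not run here):
#   parse_staging "$(tpm query staging)"
#   ssh "${STAGING_USER}@${STAGING_HOST}"
#   cd "${STAGING_DIRECTORY}"
```

This avoids calling tpm three times and keeps the values available as shell variables for the steps that follow.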

INI method:

[alpha]
...
property=replicator.store.parallel-queue.maxSize=10000

Run the tpm command to update the software with the INI-based configuration:

shell> tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-replicator-6.1.25-6

shell> echo The staging DIRECTORY is `tpm query staging| cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-replicator-6.1.25-6

shell> cd {STAGING_DIRECTORY}

shell> ./tools/tpm update

For information about making updates when using an INI file, please see Section 9.4.4, “Configuration Changes with an INI file”.

You may need to increase the Java VM heap size when you increase the parallel queue maximum size. Use the --java-mem-size option on the tpm command for this purpose or edit the Replicator wrapper.conf file directly.
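
A rough way to reason about the extra heap is to multiply the queue's maximum size by an average fragment size. The sketch below does that arithmetic; the function name estimate_queue_mb and the 64 KB average fragment size are assumptions for illustration only, not figures from Tungsten, and the result ignores Java object overhead.

```shell
# Hypothetical helper: approximate memory held by a full parallel queue, in MB.
# Usage: estimate_queue_mb <maxSize> <average fragment size in KB>
estimate_queue_mb() {
  max_size=$1
  avg_fragment_kb=$2   # assumed average; measure it on your own workload
  # Round up to the next whole MB.
  echo $(( (max_size * avg_fragment_kb + 1023) / 1024 ))
}

estimate_queue_mb 10000 64   # prints 625
```

Treat the result as a lower bound to add to the heap the replicator already needs, and confirm the units expected by --java-mem-size in the tpm reference before applying a value.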

Warning

Memory queues are not recommended for production use at this time. Use disk queues.
