Release Notes 1.5.4 is a maintenance release that adds important bug fixes to the Tungsten 1.5.3 release currently in use by most Tungsten customers. It contains the following key improvements:
Introduces quorum into Tungsten clusters to help avoid split brain problems due to network partitions. Cluster members vote whenever a node becomes unresponsive and only continue operating if they are in the majority. This feature greatly reduces the chances of multiple live masters.
Enables automatic restart of managers after network hangs that disrupt communications between managers. This feature enables clusters to ride out transient problems with physical hosts such as storage becoming inaccessible or high CPU usage that would otherwise cause cluster members to lose contact with each other, thereby causing application outages or manager non-responsiveness.
Adds "witness-only managers" which replace the previous witness hosts. Witness-only managers participate in quorum computation but do not manage a DBMS. This feature allows 2 node clusters to operate reliably across Amazon availability zones and any environment where managers run on separate networks.
Numerous minor improvements to cluster configuration files to eliminate and/or document product settings for simpler and more reliable operation.
Continuent recommends that customers who are awaiting specific fixes for 1.5.3 release consider upgrade to Release Notes 1.5.4 as soon as it is generally available. All other customers should consider upgrade to Release Notes 2.0.1 as soon as it is convenient. In addition, we recommend all new projects start out with version 2.0.1.
The following changes have been made to Continuent Tungsten and may affect existing scripts and integration tools. Any scripts or environment which make use of these tools should check and update for the new configuration:
If a manager leaves a group due to a brief outage, and does not restart, it remains stranded from the rest of the group but 'thinks' it's still a part of the group. This contributed to the main cause of hanging/restarts during operations.