8.4.3. Switch and Failover Steps for Composite Services

Switch and failover operations for composite services are handled by a different set of high-level methods than those used for local (physical) dataservices.

The following steps are executed, in exactly this order, during a switch or failover operation in a composite service:

  1. SWITCH ONLY: make sure that there is an online composite primary.

  2. FAILOVER ONLY: identify the failed primary. NOTE: TUNGSTEN WILL NOT ALLOW ANY CONNECTIONS TO A FAILED PRIMARY. APPLICATIONS WILL APPEAR TO HANG.

  3. If a target datasource is passed in, ensure that it exists and that it is not the current primary.

  4. If a target is not passed in, evaluate each composite replica: examine the relay in that replica's physical service, and select the relay with the highest stored sequence number.

  5. Save the current policy mode, then set the policy mode for the composite service to maintenance. This also puts all physical services into maintenance mode.

  6. SWITCH ONLY: put the composite primary datasource into the offline state. This has the effect of also putting the physical service's primary datasource into the offline state.

  7. SWITCH ONLY: AT THIS POINT NO MORE CONNECTIONS TO THE COMPOSITE/PHYSICAL PRIMARY ARE POSSIBLE. APPLICATIONS WILL APPEAR TO HANG.

  8. FAILOVER ONLY: shun the composite primary datasource. If the physical service for the failed composite primary is available, this will have the effect of shunning the physical primary datasource as well.

  9. Put the composite target datasource into the offline state. This has the effect of also putting the physical relay on the target site into the offline state.

  10. Purge transactions on the current physical primary if it's available.

  11. SWITCH ONLY: Do a flush operation, as in the physical service switch operation, and wait until the flushed transaction is present on the target relay replicator. 

  12. SWITCH ONLY: AT THIS POINT ALL TRANSACTIONS THAT WERE COMMITTED ON THE PRIMARY ARE AVAILABLE ON THE TARGET.

  13. SWITCH ONLY: Put the replicator for the physical source offline.

  14. FAILOVER ONLY: In a multi-site composite service, another relay in the service may have successfully replicated more transactions than the relay chosen to be the new primary. In that case, the new primary must catch up from that relay; otherwise, when the system goes online, the relay that is further ahead will fail to go online because its seqno is higher than the new primary's. Check for this case and, if necessary, catch up from the 'intermediate' replicator.

  15. FAILOVER ONLY: AT THIS POINT ALL TRANSACTIONS AVAILABLE FROM THE FAILED PRIMARY ARE NOW COMMITTED ON THE TARGET.

  16. Put the target composite datasource offline. This has the effect of putting the target's physical relay offline as well.

  17. Put the replicator for the target's physical relay offline. 

  18. Change the role of the target's physical relay replicator to primary.

  19. SWITCH ONLY: Change the role of the source's physical primary datasource to relay.

  20. Change the role of the target's physical service relay datasource to primary.

  21. Put the replicator for the new physical primary online.

  22. Put the datasource for the new physical primary online.

  23. SWITCH ONLY: put the replicator for the previous primary datasource into the online state as a relay replicator.

  24. SWITCH ONLY: put the source physical primary datasource into the offline state.

  25. SWITCH ONLY: set the source physical primary datasource as a relay.

  26. SWITCH ONLY: put the new physical relay datasource from the source into the online state.

  27. SWITCH ONLY: set the source composite datasource role from composite primary to composite replica.

  28. Put the target composite datasource offline.

  29. Set the target composite datasource from a composite replica to a composite primary.

  30. Put the target composite datasource online. This has the effect of also putting the new physical primary into the online state.

  31. AT THIS POINT THE COMPOSITE AND PHYSICAL PRIMARY ARE ONLINE AND CONNECTION REQUESTS WILL BE ACCEPTED.

  32. SWITCH ONLY: put the source composite datasource, which is now a replica, into the online state. This has the effect of setting the physical relay datasource into the online state as well.

  33. Reconfigure all of the remaining composite replicas to point at the new primary. This process involves the following steps:

    1. Identify the physical relay for each replica's physical service.

    2. Put the relay replicator offline.

    3. Set the role of the replicator to relay.

    4. Put the relay replicator into the online state.

  34. If the policy mode saved in step 5 was not maintenance, put the composite cluster back into that saved mode.
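
The decision points in this procedure can be illustrated with a short Python sketch: target validation and selection (steps 3 and 4), the failover catch-up check (step 14), and saving and restoring the policy mode (steps 5 and 34). This is an illustration only, not Tungsten code. The CompositeDatasource class, the function names, and the dictionary used to hold the policy mode are all hypothetical, and the sketch assumes each site exposes the highest sequence number stored on its physical relay as a single integer.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class CompositeDatasource:
    """Hypothetical model of one site in a composite service (not a Tungsten type)."""
    name: str
    role: str          # composite role: "primary" or "replica"
    relay_seqno: int   # highest sequence number stored on the site's physical relay


def choose_target(sites: List[CompositeDatasource],
                  requested: Optional[str] = None) -> CompositeDatasource:
    """Steps 3 and 4: validate an explicit target, or pick the most advanced replica."""
    by_name = {site.name: site for site in sites}
    if requested is not None:
        if requested not in by_name:
            raise ValueError(f"target {requested!r} does not exist")
        if by_name[requested].role == "primary":
            raise ValueError(f"target {requested!r} is the current primary")
        return by_name[requested]
    replicas = [site for site in sites if site.role == "replica"]
    if not replicas:
        raise ValueError("no composite replica to promote")
    return max(replicas, key=lambda site: site.relay_seqno)


def find_catch_up_source(sites: List[CompositeDatasource],
                         target: CompositeDatasource) -> Optional[CompositeDatasource]:
    """Step 14: return the furthest-ahead relay if any relay is ahead of the target."""
    ahead = [site for site in sites
             if site.role == "replica" and site.relay_seqno > target.relay_seqno]
    return max(ahead, key=lambda site: site.relay_seqno) if ahead else None


def run_in_maintenance(cluster: Dict[str, str],
                       operation: Callable[[], None]) -> None:
    """Steps 5 and 34: save the policy mode, run in maintenance, then restore it."""
    saved_mode = cluster["policy"]
    cluster["policy"] = "maintenance"
    operation()  # error handling is omitted; the procedure above does not describe it
    if saved_mode != "maintenance":
        cluster["policy"] = saved_mode


# Example: three sites, no explicit target requested.
sites = [
    CompositeDatasource("east", "primary", 100),
    CompositeDatasource("west", "replica", 95),
    CompositeDatasource("north", "replica", 98),
]
print(choose_target(sites).name)  # prints "north"
```

If "west" were requested explicitly during a failover, find_catch_up_source(sites, west) would return "north", which corresponds to the 'intermediate' replicator case in step 14.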