Synchronized Operating State (Erl, Naserpour)
How can the availability and reliability of virtual servers be ensured when high availability and clustering technology is unavailable?
Problem
A cloud consumer may be prevented from utilizing high availability and clustering technology for its virtual servers or operating systems, thereby making them more vulnerable to failure.
Solution
A composite failover system is created that does not rely on clustering or high availability features but instead uses heartbeat messages to synchronize virtual servers.
Application
The heartbeat messages are processed by a specialized service agent and are exchanged between hypervisors, the hypervisor and virtual server, and the hypervisor and VIM.
Mechanisms
Cloud Storage Device, Failover System, Hypervisor, Resource Replication, State Management Database, Virtual Server
Compound Patterns
Burst In, Burst Out to Private Cloud, Burst Out to Public Cloud, Elastic Environment, Infrastructure-as-a-Service (IaaS), Multitenant Environment, Platform-as-a-Service (PaaS), Private Cloud, Public Cloud, Resilient Environment, Software-as-a-Service (SaaS)
Problem
Technical restrictions, licensing restrictions, or other reasons may prevent a cloud consumer from taking advantage of clustering and high availability technology and products. This can seriously jeopardize the availability and scalability of its cloud services and applications.
Solution
A system composed of a set of mechanisms and relying on heartbeat messages is established to emulate select features of clustering and high availability IT resources.
Figure 1 - Special heartbeat agents are employed to monitor heartbeat messages exchanged between the servers.
Application
Heartbeat messages are processed by heartbeat monitor agents and are exchanged between:
- hypervisors
- each hypervisor and each virtual server
- each hypervisor and the central VIM
If an operating system is placed on a physical server, it needs to be converted into a virtual server prior to the issuance of heartbeat messages.
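The heartbeat exchange listed above can be illustrated with a minimal sketch. It assumes that a monitor agent records the time of the last heartbeat seen on each link and treats a link as failed once it has been silent for longer than a timeout. The class name, the link names, and the 3-second timeout are hypothetical; the pattern itself does not prescribe them.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Hypothetical monitor agent: tracks the last heartbeat seen per link."""
    timeout_s: float  # assumed threshold; the pattern does not specify one
    last_seen: dict = field(default_factory=dict)

    def record(self, sender: str, receiver: str, now: float) -> None:
        # A heartbeat arrived on the sender -> receiver link at time `now`.
        self.last_seen[(sender, receiver)] = now

    def silent_links(self, now: float) -> list:
        # Links whose most recent heartbeat is older than the timeout.
        return sorted(
            link for link, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


monitor = HeartbeatMonitor(timeout_s=3.0)
# The three kinds of exchange named in the list above.
monitor.record("hypervisor-a", "hypervisor-b", now=0.0)
monitor.record("virtual-server-1", "hypervisor-a", now=0.0)
monitor.record("hypervisor-a", "vim", now=0.0)
# Later, only the hypervisor links keep reporting; the virtual server is silent.
monitor.record("hypervisor-a", "hypervisor-b", now=4.0)
monitor.record("hypervisor-a", "vim", now=4.0)
print(monitor.silent_links(now=5.0))
```

Passing the clock value in explicitly keeps the sketch deterministic; a real agent would presumably read a monotonic clock instead.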
Figure 2 - The cloud architecture resulting from the application of this pattern.
- A virtual server is created from the physical server.
- The hypervisor proceeds to host the virtual server.
- The primary virtual server is equipped with fault tolerance and maintains a synchronized state via the use of heartbeat messages.
- The secondary server that shares the synchronized state is available in case the primary virtual server fails.
The application/service monitoring station monitors the servers and cloud services. In the event of failure, this station attempts recovery based on sequential pre-defined policies. If the primary server’s operating system fails, procedures are in place to avoid downtime.
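As a rough illustration of recovery based on sequential pre-defined policies, the sketch below tries each policy in order and stops at the first one that reports success. The policy names and their outcomes are invented for the example; the pattern does not say which policies a monitoring station actually uses.

```python
from typing import Callable, List, Optional, Tuple

# A policy is a (name, action) pair; the action returns True if recovery succeeded.
Policy = Tuple[str, Callable[[], bool]]


def attempt_recovery(policies: List[Policy]) -> Optional[str]:
    """Run the policies in order and return the name of the first that succeeds."""
    for name, action in policies:
        if action():
            return name
    return None  # every policy failed


attempted: List[str] = []


def restart_service() -> bool:
    # Hypothetical first policy; simulated here as unsuccessful.
    attempted.append("restart_service")
    return False


def fail_over_to_secondary() -> bool:
    # Hypothetical second policy; simulated here as successful.
    attempted.append("fail_over_to_secondary")
    return True


def notify_operator() -> bool:
    # Hypothetical last resort; not reached in this run.
    attempted.append("notify_operator")
    return True


recovered_by = attempt_recovery([
    ("restart_service", restart_service),
    ("fail_over_to_secondary", fail_over_to_secondary),
    ("notify_operator", notify_operator),
])
print(recovered_by, attempted)
```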
Figure 3 - When the primary virtual server fails, along with its hosted cloud service, heartbeat messages are no longer transmitted. As a result, the hypervisor recognizes the failure and switches activity to the secondary virtual server, which maintains the synchronized state. After the failed primary virtual server is back online, the hypervisor creates a new secondary for the new primary and saves it in a synchronized, non-active state.
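The failover sequence in Figure 3 can be sketched as a small state model. This is one reading of the caption rather than a definitive implementation: state is represented as a plain dictionary, synchronization as a copy, and all class, method, and server names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VirtualServer:
    name: str
    state: dict
    active: bool = False


class FailoverPair:
    """Hypothetical model of a primary with a synchronized, non-active secondary."""

    def __init__(self, primary: VirtualServer, secondary: VirtualServer) -> None:
        primary.active = True
        secondary.active = False
        self.primary = primary
        self.secondary: Optional[VirtualServer] = secondary

    def synchronize(self) -> None:
        # Copy the primary's state to the secondary (stand-in for real replication).
        if self.secondary is not None:
            self.secondary.state = dict(self.primary.state)

    def on_heartbeat_lost(self) -> VirtualServer:
        # Heartbeats stopped: switch activity to the synchronized secondary.
        if self.secondary is None:
            raise RuntimeError("no secondary available for failover")
        failed = self.primary
        failed.active = False
        self.secondary.active = True
        self.primary, self.secondary = self.secondary, None
        return failed

    def create_secondary(self, name: str) -> None:
        # Create a new secondary for the new primary, synchronized and non-active.
        self.secondary = VirtualServer(name, dict(self.primary.state), active=False)


pair = FailoverPair(VirtualServer("vs-a", {"sessions": 3}), VirtualServer("vs-b", {}))
pair.synchronize()
pair.on_heartbeat_lost()            # vs-a fails, vs-b takes over
pair.primary.state["sessions"] = 4  # the new primary keeps serving
pair.create_secondary("vs-c")       # new standby for the new primary
print(pair.primary.name, pair.secondary.name, pair.secondary.state)
```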
NIST Reference Architecture Mapping
This pattern relates to the highlighted parts of the NIST reference architecture, as follows:
This pattern is covered in CCP Module 5: Advanced Cloud Architecture.
For more information regarding the Cloud Certified Professional (CCP) curriculum, visit www.cloudschool.com.
Arcitura IT Certified Professionals (AITCP)
