Non-Disruptive Service Relocation (Erl, Naserpour)
How can cloud service activity be temporarily or permanently relocated without causing service interruption?
Problem: There are circumstances under which redirecting cloud service activity or relocating an entire cloud service implementation is required or preferable. However, diverting service activity or relocating a cloud service implementation can cause an outage, thereby disrupting the availability of the cloud service.
Solution: A system can be established whereby cloud service redirection or relocation is carried out at runtime by temporarily creating a duplicate implementation before the original implementation is deactivated or removed.
Application: Virtualization technology is used by the system to enable the duplication and migration of the cloud service implementation across different locations in real time.
Mechanisms: Cloud Storage Device, Cloud Usage Monitor, Hypervisor, Pay-Per-Use Monitor, Resource Replication, SLA Management System, SLA Monitor, Virtual Server
Compound Patterns: Burst In, Burst Out to Private Cloud, Burst Out to Public Cloud, Elastic Environment, Infrastructure-as-a-Service (IaaS), Multitenant Environment, Platform-as-a-Service (PaaS), Private Cloud, Public Cloud, Resilient Environment, Software-as-a-Service (SaaS)
A cloud service can become unavailable due to a number of reasons, such as:
- The cloud service encounters more runtime usage demand than it has processing capacity to handle.
- The cloud service implementation needs to undergo a maintenance update that mandates a temporary outage.
- The cloud service implementation needs to be permanently migrated to a new physical server host.
Cloud service consumer requests are rejected if a cloud service becomes unavailable, which can potentially result in exception conditions. Rendering the cloud service temporarily unavailable to cloud consumers is not preferred even if the outage is planned.
A system is established by which a pre-defined event triggers the duplication or migration of a cloud service implementation at runtime, thereby avoiding any disruption in service for cloud consumers.
As an alternative to scaling cloud services in or out with redundant implementations, cloud service activity can be temporarily diverted to another hosting environment at runtime by adding a duplicate implementation on a new host. Cloud service consumer requests can similarly be redirected to a duplicate implementation for the duration of a maintenance outage on the original implementation. The relocation of the cloud service implementation and its cloud service activity can also be permanent, to accommodate cloud service migrations to new physical server hosts.
A key aspect of the underlying architecture is that the system ensures that the new cloud service implementation is successfully receiving and responding to cloud service consumer requests before the original cloud service implementation is deactivated or removed.
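The ordering described above (verify the duplicate, redirect requests, and only then deactivate the original) can be illustrated with a minimal sketch. The `Implementation`, `Router`, and `relocate` names are hypothetical and do not come from the pattern or from any particular product; the probe requests stand in for whatever health check a real system would use.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Implementation:
    """Hypothetical stand-in for one cloud service implementation on a host."""
    host: str
    active: bool = True

    def handle(self, request: str) -> str:
        if not self.active:
            raise RuntimeError(f"implementation on {self.host} is deactivated")
        return f"{self.host} handled {request}"


@dataclass
class Router:
    """Hypothetical request router that points at exactly one implementation."""
    target: Implementation

    def send(self, request: str) -> str:
        return self.target.handle(request)


def relocate(router: Router, duplicate: Implementation, probes: List[str]) -> bool:
    """Cut over to `duplicate` only after it has answered every probe request."""
    original = router.target
    try:
        for probe in probes:
            duplicate.handle(probe)
    except RuntimeError:
        # The duplicate is not ready, so the original keeps serving requests.
        return False
    router.target = duplicate  # redirect consumer requests first
    original.active = False    # deactivate the original last
    return True
```

If the duplicate fails a probe, `relocate` returns `False` and leaves the router untouched, so consumers never see a window in which neither implementation is serving.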
A common approach is to employ the live VM migration component to move the entire virtual server instance hosting the cloud service. The automated scaling listener and/or the load balancer mechanisms can be used to trigger a temporary redirection of cloud service consumer requests in response to scaling and workload distribution requirements. In this case, either mechanism can contact the virtual infrastructure manager (VIM) to initiate the live VM migration process.
Figure 1 - An example of a scaling-based application of the Non-Disruptive Service Relocation pattern (Part 1).
Figure 2 - An example of a scaling-based application of the Non-Disruptive Service Relocation pattern (Part 2).
Figure 3 - An example of a scaling-based application of the Non-Disruptive Service Relocation pattern (Part 3).
- The automated scaling listener monitors the workload for a cloud service.
- As the workload increases, a predefined threshold within the cloud service is reached.
- The automated scaling listener signals the VIM to initiate the relocation.
- The VIM signals both the origin and destination hypervisors to carry out a runtime relocation via the use of a live VM migration program (not shown).
- A second copy of the virtual server and its hosted cloud service is created via the destination hypervisor on Physical Server B.
- The state of both virtual server instances is synchronized.
- The first virtual server instance is removed from Physical Server A after it is confirmed that cloud service consumer requests are being successfully exchanged with the cloud service on Physical Server B.
- Cloud service consumer requests are sent only to the cloud service on Physical Server B from here on.
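The steps above can be sketched as a small in-memory simulation. This is an illustration under stated assumptions, not a real VIM or hypervisor API: the class names, the `observe` and `live_migrate` methods, and the numeric workload threshold are all hypothetical, and the state comparison stands in for confirming that requests are being exchanged successfully with the new instance.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class VirtualServer:
    name: str
    state: Dict[str, int] = field(default_factory=dict)


@dataclass
class Hypervisor:
    host: str
    servers: Dict[str, VirtualServer] = field(default_factory=dict)


class VIM:
    """Hypothetical virtual infrastructure manager coordinating two hypervisors."""

    def __init__(self, origin: Hypervisor, destination: Hypervisor) -> None:
        self.origin = origin
        self.destination = destination
        self.log: List[str] = []

    def live_migrate(self, name: str) -> str:
        """Relocate the named virtual server and return the host now serving it."""
        source = self.origin.servers[name]
        # Create a second copy via the destination hypervisor.
        copy = VirtualServer(name)
        self.destination.servers[name] = copy
        self.log.append("copy created")
        # Synchronize the state of both instances.
        copy.state = dict(source.state)
        self.log.append("state synchronized")
        # Confirm the copy before removing the original (simplified check).
        if copy.state != source.state:
            del self.destination.servers[name]
            return self.origin.host
        del self.origin.servers[name]
        self.log.append("origin removed")
        return self.destination.host


class AutomatedScalingListener:
    """Hypothetical listener that signals the VIM once a threshold is reached."""

    def __init__(self, vim: VIM, threshold: int) -> None:
        self.vim = vim
        self.threshold = threshold

    def observe(self, name: str, workload: int) -> Optional[str]:
        if workload < self.threshold:
            return None  # below the threshold: keep monitoring
        return self.vim.live_migrate(name)
```

A listener built with `threshold=100` returns `None` for a workload of 40 and returns the destination host name once the workload reaches 100.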
Depending on the location of the virtual server's disks and configuration, this migration can happen in one of two ways:
- If the virtual server disks are stored on a local storage device or on non-shared remote storage devices attached to the source host, then a copy of the virtual server disks is created on the destination host (either on a local or remote shared/non-shared storage device). After the copy has been created, both virtual server instances are synchronized and virtual server files are subsequently removed from the origin host.
- If the virtual server's files are stored on a remote storage device shared between the origin and destination hosts, there is no need to create a copy of the virtual server disks. In this case, the ownership of the virtual server is simply transferred from the origin to the destination physical server host, and the virtual server's state is automatically synchronized.
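The choice between the two migration paths can be expressed as a simple decision function. The sketch below is illustrative only: the `DiskLocation` enum and `migration_plan` function are hypothetical names, and `REMOTE_SHARED` is assumed to mean storage shared between the origin and destination hosts.

```python
from enum import Enum
from typing import List


class DiskLocation(Enum):
    LOCAL = "local"
    REMOTE_NON_SHARED = "remote non-shared"
    REMOTE_SHARED = "remote shared"  # assumed shared by origin and destination


def migration_plan(disks: DiskLocation) -> List[str]:
    """Return the ordered migration actions for the given disk location."""
    if disks is DiskLocation.REMOTE_SHARED:
        # Both hosts already see the disks, so no disk copy is needed.
        return [
            "transfer ownership to destination host",
            "synchronize state",
        ]
    # Local or non-shared remote storage: the disks must be copied first.
    return [
        "copy disks to destination host",
        "synchronize instances",
        "remove files from origin host",
    ]
```

The shared-storage path is shorter because the disk copy, usually the most time-consuming part of a migration, is skipped.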
Note that this pattern conflicts with, and cannot be applied together with, the Direct I/O Access pattern. A virtual server with direct I/O access is locked into its physical server host and cannot be moved to other hosts in this fashion.
Furthermore, the Persistent Virtual Network Configurations pattern may need to be applied in support of this pattern so that by moving the virtual server, its defined network configuration is not inadvertently lost, which would prevent cloud service consumers from being able to connect to the virtual server.
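These two constraints can be checked before a relocation is attempted. The following is a minimal sketch with hypothetical names (`VirtualServerProfile`, `relocation_issues`); it treats direct I/O access as a hard blocker and a non-persistent network configuration as a warning, mirroring the "cannot" and "may need to" wording above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class VirtualServerProfile:
    """Hypothetical summary of the properties relevant to relocation."""
    direct_io_access: bool
    persistent_network_config: bool


def relocation_issues(profile: VirtualServerProfile) -> Tuple[List[str], List[str]]:
    """Return (blockers, warnings) for relocating a virtual server."""
    blockers: List[str] = []
    warnings: List[str] = []
    if profile.direct_io_access:
        blockers.append("direct I/O access locks the virtual server to its host")
    if not profile.persistent_network_config:
        warnings.append("network configuration may be lost after the move")
    return blockers, warnings
```

A relocation would proceed only when the blocker list is empty; warnings indicate that the Persistent Virtual Network Configurations pattern should be considered first.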
NIST Reference Architecture Mapping
This pattern relates to the highlighted parts of the NIST reference architecture, as follows: