Dynamic Scalability (Erl, Naserpour)
How can IT resources be scaled automatically in response to fluctuating demand?
Problem: It is challenging to equip an IT resource to match its processing requirements. If the demand for the IT resource is below its capacity, then it is under-utilized and if the demand is above its capacity it is over-utilized or unable to meet the demand.
Solution: The IT resource can be integrated with a reactive cloud architecture capable of automatically scaling it horizontally or vertically in response to fluctuating demand.
Application: Dynamic horizontal scaling can be enabled via the use of pools of identical IT resources and components capable of dispersing and retracting workloads across each pool. Dynamic vertical scaling can be enabled via technology capable of swapping IT resource components at runtime.
Mechanisms: Automated Scaling Listener, Cloud Storage Device, Cloud Usage Monitor, Hypervisor, Pay-Per-Use Monitor, Resource Replication, Virtual Server
Compound Patterns: Burst In, Burst Out to Private Cloud, Burst Out to Public Cloud, Elastic Environment, Infrastructure-as-a-Service (IaaS), Multitenant Environment, Platform-as-a-Service (PaaS), Private Cloud, Public Cloud, Resilient Environment, Software-as-a-Service (SaaS)
Manually preparing or extending IT resources in response to workload fluctuations is time-intensive and unacceptably inefficient. Determining when to add new IT resources to satisfy anticipated workload peaks is often speculative and generally risky. The additional IT resources can either remain underutilized (making them a failed financial investment) or fail to alleviate runtime performance and reliability problems when demand exceeds even their added capacity.
Figure 1 - A non-dynamic cloud architecture in which vertical scaling is carried out in response to usage fluctuations (Part 1).
Figure 2 - A non-dynamic cloud architecture in which vertical scaling is carried out in response to usage fluctuations (Part 2).
- The cloud provider offers cloud services to cloud consumers.
- Cloud consumers can scale the cloud services, as needed.
- Over time, the number of cloud consumers increases.
- The cloud provider's virtual server is overwhelmed by the increased workload.
- The cloud provider brings a new, higher-capacity server online to handle an increased workload.
- Because the required IT resources are not organized for sharing and are unprepared for allocation, the virtual server must have the operating system, required applications and cloud services installed after being created.
- Once the new server is ready, the old server is taken offline.
- Now service requests are redirected to the new server.
- After the peak usage period has ended, the number of cloud consumers and service requests naturally decreases.
- Without a process for recovering underutilized IT resources, the new server's sizable workload capacity will not be fully utilized.
A system of predefined scaling conditions that trigger the dynamic allocation of IT resources can be introduced. The IT resources are allocated from resource pools to allow for variable utilization as dictated by demand fluctuations. Unneeded IT resources are efficiently reclaimed without requiring manual interaction.
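As a minimal sketch of this idea, the following Python snippet models a pool of identical resources and a single scaling condition that allocates from the pool as demand rises and reclaims resources as it falls. The `ResourcePool` class, the `instances_needed` function, the resource names and the capacity figure of 100 requests per second are all illustrative assumptions, not part of the pattern definition.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class ResourcePool:
    """A pool of identical, pre-provisioned IT resources (hypothetical model)."""
    idle: List[str]
    active: List[str] = field(default_factory=list)

    def scale_to(self, target: int) -> None:
        """Allocate or reclaim resources until `target` of them are active."""
        # Never drop below one resource or exceed what the pool holds.
        target = max(1, min(target, len(self.idle) + len(self.active)))
        while len(self.active) < target:
            self.active.append(self.idle.pop())   # allocate from the pool
        while len(self.active) > target:
            self.idle.append(self.active.pop())   # reclaim an unneeded resource


def instances_needed(requests_per_sec: float, capacity_per_instance: float) -> int:
    """Scaling condition: enough instances to keep each at or below capacity."""
    return max(1, math.ceil(requests_per_sec / capacity_per_instance))


pool = ResourcePool(idle=["vm-1", "vm-2", "vm-3", "vm-4"])
history = []
for demand in [50, 180, 320, 90]:   # assumed demand samples, requests per second
    pool.scale_to(instances_needed(demand, capacity_per_instance=100))
    history.append(len(pool.active))

print(history)  # active instance count after each demand sample
```

No manual step is involved in either direction: the same condition that grows the active set also returns resources to the pool once demand drops.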
Figure 3 - A sample dynamic scaling process.
The fundamental Dynamic Scalability pattern primarily relies on the application of the Resource Pool pattern and the implementation of the automated scaling listener.
The automated scaling listener is configured with workload thresholds that determine when new IT resources need to be included in the workload processing. The automated scaling listener can further be provided with logic that allows it to verify the extent of additional IT resources a given cloud consumer is entitled to, based on its leasing arrangement with the cloud provider.
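The entitlement check can be illustrated with a short Python sketch. It assumes a hypothetical `Lease` record holding the maximum number of instances a cloud consumer may use and a per-instance load threshold; real leasing arrangements and listener configurations will differ.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Lease:
    """Hypothetical leasing arrangement between cloud consumer and provider."""
    max_instances: int


def instances_to_add(load_per_instance: float, threshold: float,
                     active: int, lease: Lease) -> int:
    """Return how many extra instances may be requested, capped by the lease."""
    if load_per_instance <= threshold:
        return 0  # workload threshold not exceeded, nothing to add

    # Instances required to bring the per-instance load back to the threshold.
    total_load = load_per_instance * active
    wanted = math.ceil(total_load / threshold) - active

    # The consumer is only entitled to what the lease still allows.
    entitled = lease.max_instances - active
    return max(0, min(wanted, entitled))


lease = Lease(max_instances=5)
print(instances_to_add(120, threshold=80, active=2, lease=lease))
print(instances_to_add(300, threshold=80, active=2, lease=lease))
```

In the second call the workload would justify six more instances, but the assumed lease leaves room for only three, so the request is capped.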
The following types of dynamic scaling are common:
- Dynamic Horizontal Scaling - In this type of dynamic scaling the number of IT resource instances is scaled to handle fluctuating workloads. The automated scaling listener monitors requests and, if scaling is required, signals a resource replication mechanism to initiate the duplication of the IT resources, as per requirements and permissions. (The example at the end of this section demonstrates this type of scaling.)
- Dynamic Vertical Scaling - This type of scaling occurs when there is a need to increase the processing capacity of a single IT resource. For instance, if a virtual server is being overloaded, it can dynamically have its memory increased or it may have a processing core added.
- Dynamic Relocation - The IT resource is relocated to a higher capacity host. For example, there may be a need to move a cloud service database from a tape-based SAN storage device with 4 Gbps I/O capacity to another disk-based SAN storage device with 8 Gbps I/O capacity.
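The difference between the three types can be sketched on a toy data model. The `VirtualServer` record, its fields and the three helper functions below are hypothetical names chosen for illustration; they only show what changes in each case (instance count, capacity of one instance, or hosting location).

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class VirtualServer:
    """Toy model of a virtual server (all fields are illustrative)."""
    name: str
    host: str
    memory_gb: int
    cores: int


def scale_horizontally(servers: List[VirtualServer], count: int) -> List[VirtualServer]:
    """Horizontal: add `count` replicas of the first server."""
    template = servers[0]
    start = len(servers) + 1
    copies = [replace(template, name=f"{template.name}-{i}")
              for i in range(start, start + count)]
    return servers + copies


def scale_vertically(server: VirtualServer, extra_memory_gb: int = 0,
                     extra_cores: int = 0) -> VirtualServer:
    """Vertical: raise the capacity of a single server."""
    return replace(server,
                   memory_gb=server.memory_gb + extra_memory_gb,
                   cores=server.cores + extra_cores)


def relocate(server: VirtualServer, new_host: str) -> VirtualServer:
    """Relocation: move the same server to a (higher-capacity) host."""
    return replace(server, host=new_host)


web = VirtualServer(name="web", host="host-a", memory_gb=8, cores=2)
fleet = scale_horizontally([web], count=2)
bigger = scale_vertically(web, extra_memory_gb=8, extra_cores=1)
moved = relocate(web, new_host="host-b")
print([s.name for s in fleet], bigger.memory_gb, bigger.cores, moved.host)
```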
The following example demonstrates dynamic horizontal scaling.
Figure 4 - An example of a dynamic scaling architecture involving an automated scaling mechanism.
- Cloud service consumers are sending requests to a cloud service.
- The automated scaling listener monitors the cloud service to determine if pre-defined capacity thresholds are being exceeded.
- The number of service requests coming from cloud service consumers further increases.
- The workload exceeds the performance thresholds configured in the automated scaling listener, which determines the next course of action based on a pre-defined scaling policy.
- If the cloud service implementation is deemed eligible for additional scaling, the automated scaling listener initiates the scaling process.
- The automated scaling listener sends a signal to the resource replication mechanism.
- The resource replication mechanism then creates more instances of the cloud service.
- Now that the increased workload is accommodated, the automated scaling listener resumes monitoring and continues to remove or add IT resources as required.
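The steps above can be approximated with a small simulation. The class names mirror the mechanisms named in the example, but their methods, the threshold of 100 requests per instance and the limit of three instances are assumptions made for this sketch; each signal here adds exactly one instance, which is one possible policy among many.

```python
class ResourceReplication:
    """Creates additional instances of the cloud service when signaled."""

    def __init__(self) -> None:
        self.instances = 1

    def replicate(self) -> None:
        self.instances += 1


class ScalingPolicy:
    """Pre-defined policy: decides whether further scaling is allowed."""

    def __init__(self, max_instances: int) -> None:
        self.max_instances = max_instances

    def is_eligible(self, current_instances: int) -> bool:
        return current_instances < self.max_instances


class AutomatedScalingListener:
    """Monitors the workload and signals replication when a threshold is exceeded."""

    def __init__(self, threshold: float, policy: ScalingPolicy,
                 replication: ResourceReplication) -> None:
        self.threshold = threshold
        self.policy = policy
        self.replication = replication

    def observe(self, requests: float) -> bool:
        """Process one workload sample; return True if a signal was sent."""
        per_instance = requests / self.replication.instances
        if per_instance <= self.threshold:
            return False                      # threshold not exceeded
        if not self.policy.is_eligible(self.replication.instances):
            return False                      # not eligible for more scaling
        self.replication.replicate()          # signal the replication mechanism
        return True


replication = ResourceReplication()
listener = AutomatedScalingListener(threshold=100,
                                    policy=ScalingPolicy(max_instances=3),
                                    replication=replication)
signals = [listener.observe(r) for r in [80, 150, 150, 400, 400, 400]]
print(signals, replication.instances)
```

Once the policy limit is reached, the listener keeps monitoring but sends no further signals, even though the per-instance workload still exceeds the threshold.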
NIST Reference Architecture Mapping
This pattern relates to the highlighted parts of the NIST reference architecture, as follows: