Service Load Balancing (Erl, Naserpour)
How can a cloud service accommodate increasing workloads?
Problem
A single cloud service implementation has a finite capacity, which leads to runtime exceptions, failures, and performance degradation when its processing thresholds are exceeded.
Solution
Redundant deployments of the cloud service are created and a load balancing system is added to dynamically distribute workloads across cloud service implementations.
Application
The duplicate cloud service implementations are organized into a resource pool. The load balancer is either positioned as an external component or built-in, allowing the hosting servers to balance workloads among themselves.
Compound Patterns
Burst In, Burst Out to Private Cloud, Burst Out to Public Cloud, Cloud Balancing, Elastic Environment, Infrastructure-as-a-Service (IaaS), Multitenant Environment, Platform-as-a-Service (PaaS), Private Cloud, Public Cloud, Resilient Environment, Software-as-a-Service (SaaS)
Regardless of the processing capacity of its immediate hosting environment, a cloud service architecture may inherently be limited in its ability to accommodate high volumes of concurrent cloud service consumer requests. The cloud service’s processing restrictions may be such that it is unable to leverage underlying cloud-based IT resources that normally support dynamic scalability. For example, the processing restrictions may originate from its architectural design, the complexity of its application logic, or inhibitive programming algorithms it is required to carry out at runtime.
Regardless of the origin of its limitations, such a cloud service may be forced to reject cloud service consumer requests when its processing capacity thresholds are reached.
Figure 1 - A single cloud service implementation reaches its runtime processing capacity and consequently rejects subsequent cloud service consumer requests.
Redundant implementations of the cloud service are created, each located on a different hosting server. A load balancer is utilized to intercept cloud service consumer requests in order to evenly distribute them across the multiple cloud service implementations.
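The interception step can be sketched as a simple round-robin dispatcher. This is a minimal illustration, not part of the pattern text: the endpoint URLs and the `route_request` function are hypothetical, and a production load balancer would also track health and capacity.

```python
from itertools import cycle

# Hypothetical pool of redundant cloud service implementations,
# each deployed on a different hosting server.
SERVICE_POOL = [
    "http://host-a/cloud-service",
    "http://host-b/cloud-service",
    "http://host-c/cloud-service",
]

_next_endpoint = cycle(SERVICE_POOL)

def route_request(request: dict) -> tuple[str, dict]:
    """Intercept a consumer request and forward it to the next
    implementation in round-robin order, evenly spreading load."""
    endpoint = next(_next_endpoint)
    return endpoint, request

# Six consecutive requests are distributed evenly across the three
# implementations: a, b, c, a, b, c.
targets = [route_request({"id": i})[0] for i in range(6)]
```

Round-robin is only one distribution strategy; weighted or least-connections schemes serve the same role of keeping any single implementation from exceeding its processing thresholds.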
Figure 2 - The load balancing agent intercepts messages sent by cloud service consumers (1) and forwards the messages at runtime to the virtual servers so that the workload processing is horizontally scaled (2).
Depending on the anticipated workload and the processing capacity of hosting server environments, multiple instances of each cloud service implementation may be generated in order to establish pools of cloud services that can more efficiently respond to high volumes of concurrent requests.
The load balancer may be positioned independently from the cloud services and their hosting servers (as shown in Figure 2), or it may be built into the application or server environment. In the latter case, a primary server with the load balancing logic can communicate with neighboring servers to balance the workload (as shown in Figure 3).
Figure 3 - Cloud consumer requests are sent to Cloud Service A on Virtual Server A (1). The cloud service implementation includes built-in load balancing logic that is capable of distributing requests to the neighboring Cloud Service A implementations on Virtual Servers B and C (2).
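The built-in variant in Figure 3 can be sketched as follows, assuming a hypothetical `PrimaryServer` class in which the server receiving requests also participates in processing while rotating work through its neighbors; the class and its rotation policy are illustrative, not a prescribed implementation.

```python
class PrimaryServer:
    """Sketch of built-in load balancing logic: the primary server
    receives all consumer requests and distributes them across itself
    and its neighboring servers."""

    def __init__(self, name: str, neighbors: list[str]):
        # The primary handles requests alongside its neighbors.
        self.members = [name] + neighbors
        self._turn = 0

    def dispatch(self, request: dict) -> str:
        """Assign the request to the next member in rotation and
        return that member's name."""
        member = self.members[self._turn % len(self.members)]
        self._turn += 1
        return member

# Virtual Server A balances work with Virtual Servers B and C.
server_a = PrimaryServer("Virtual Server A",
                         ["Virtual Server B", "Virtual Server C"])
handled_by = [server_a.dispatch({"id": i}) for i in range(3)]
```

The trade-off against the external load balancer of Figure 2 is that the primary server becomes both a processing node and a routing point, so its own capacity limits the whole group.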
For this pattern to be applied, a server group needs to be created and configured so that its members can be associated with the load balancer. The paths by which cloud service consumer requests are sent through the load balancer need to be set, and the load balancer needs to be configured to evaluate each cloud service implementation's capacity on a regular basis.
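The capacity evaluation step can be illustrated with a small sketch. The member names, capacity figures, and the `least_loaded` selection policy below are assumptions for illustration; the pattern itself only requires that capacity be assessed regularly, not this particular strategy.

```python
# Hypothetical snapshot of the server group: total request slots per
# member, and the load measured at the most recent evaluation cycle.
capacities = {"host-a": 100, "host-b": 100, "host-c": 100}
current_load = {"host-a": 72, "host-b": 31, "host-c": 58}

def least_loaded(capacities: dict, load: dict) -> str:
    """Return the group member with the most spare capacity, so the
    next request goes where thresholds are furthest from being hit."""
    return max(capacities, key=lambda m: capacities[m] - load[m])

# host-b has 69 free slots versus 28 and 42, so it receives the
# next request.
target = least_loaded(capacities, current_load)
```

Re-running this evaluation on a schedule lets the balancer adapt as individual implementations approach their processing thresholds.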
NIST Reference Architecture Mapping
This pattern relates to the highlighted parts of the NIST reference architecture, as follows: