A common approach to horizontal scaling is to balance a workload across two or more IT resources to increase performance and capacity beyond what a single IT resource can provide. The load balancer mechanism is a runtime agent whose logic is fundamentally based on this premise.
Beyond simple division-of-labor algorithms (Figure 1), load balancers can perform a range of specialized runtime workload distribution functions, such as:
- Asymmetric Distribution - Issuing larger workloads to IT resources with higher processing capacities.
- Workload Prioritization - Scheduling, queuing, discarding, and distributing workloads according to their priority levels.
- Content-Aware Distribution - Distributing requests to different IT resources as dictated by the content of each request.
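As a rough illustration of asymmetric distribution, one common technique is weighted round-robin, in which each IT resource receives workloads in proportion to a configured capacity weight. The following sketch uses hypothetical server names and weights, and stands in for whatever scheduling logic a real load balancer would apply:

```python
import itertools

# Hypothetical capacities: relative processing weights per server.
# "vm-a" is assumed to have roughly three times the capacity of "vm-b".
SERVERS = {"vm-a": 3, "vm-b": 1}

def weighted_round_robin(servers):
    """Yield server names in proportion to their capacity weights."""
    expanded = [name for name, weight in servers.items() for _ in range(weight)]
    return itertools.cycle(expanded)

# Assign eight incoming workload requests across the two servers.
rr = weighted_round_robin(SERVERS)
assignments = [next(rr) for _ in range(8)]
# "vm-a" receives three requests for every one sent to "vm-b".
```

Production load balancers typically combine such weighting with live health and utilization data rather than static weights alone.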
A load balancer is programmed or configured with a set of performance and quality-of-service rules and parameters with the general objectives of optimizing IT resource usage, avoiding overloads, and maximizing throughput.
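The rule-driven configuration described above can be sketched as a simple content-aware routing table. All pool and request-type names here are hypothetical; a real load balancer would evaluate far richer rules (headers, URLs, payload attributes) against its configured parameters:

```python
# Hypothetical routing rules: map request content types to resource pools.
RULES = {"image": "gpu-pool", "report": "batch-pool"}
DEFAULT_POOL = "general-pool"

def route(request):
    """Select a resource pool based on the content of the request."""
    return RULES.get(request.get("type"), DEFAULT_POOL)

route({"type": "image", "payload": b"..."})   # routed to the GPU pool
route({"type": "telemetry"})                  # falls back to the default pool
```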
Load balancer mechanisms can exist as:
- Multi-Layer Network Switch
- Dedicated Load Balancer Hardware Appliance
- Dedicated Software-Based System (common in server operating systems)
- Service Agent (usually controlled by cloud management software)
The load balancer is typically located on the communication path between the IT resources generating the workload and the IT resources performing the workload processing. This mechanism can be designed as a transparent agent that remains hidden from the workload requestors, or as a proxy component that abstracts the IT resources performing the workload from their requestors.
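The proxy design can be sketched as follows: requestors address only the load balancer, which forwards each request to one of the hidden workload processors. The class and service names are illustrative, and a simple round-robin rotation stands in for the balancer's actual distribution logic:

```python
class Worker:
    """Hypothetical workload processor hidden behind the proxy."""
    def __init__(self, name):
        self.name = name

    def process(self, request):
        return f"{self.name} handled {request}"

class LoadBalancerProxy:
    """Requestors see only this component, never the workers behind it."""
    def __init__(self, workers):
        self._workers = list(workers)
        self._next = 0

    def handle(self, request):
        # Rotate through the hidden workers (simple round-robin).
        worker = self._workers[self._next % len(self._workers)]
        self._next += 1
        return worker.process(request)

lb = LoadBalancerProxy([Worker("svc-1"), Worker("svc-2")])
results = [lb.handle(f"req-{i}") for i in range(4)]
# Requests alternate between svc-1 and svc-2, invisibly to the requestor.
```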
Figure 1 - A load balancer implemented as a service agent transparently distributes incoming workload request messages across two redundant cloud service implementations. This, in turn, maximizes performance for the cloud service consumers.