Understanding Load Balancing Algorithms & Techniques
A load balancer is software or a hardware device that keeps traffic evenly distributed among the servers, preventing any single server from becoming overloaded.
What are Load-Balancing Algorithms?
The algorithm is the predefined set of rules or logic used to distribute traffic among the servers.
There are two types of load-balancing algorithms. Dynamic load-balancing algorithms consider the current state of each server before distributing the load. Static load-balancing algorithms distribute traffic without making any such adjustments: some send equally divided traffic to every server in a group, while others send traffic in a specified order or at random.
Load Balancer & Distribution of Client Traffic Across Servers
Load balancing refers to the efficient division of incoming network traffic across a group of backend servers. This group is also called a server farm or server pool.
Today's high-traffic websites serve thousands or even millions of simultaneous user requests and must return the correct text, images, video, or application data quickly and reliably. To meet these high volumes cost-effectively, best practice is to scale out by adding more servers.
Read our Blog “Load Balancing Microservices Architecture Performance”.
A load balancer works like a “traffic cop” sitting in front of the servers, routing client requests across all servers capable of fulfilling them quickly and making full use of available capacity, so that no one server is overworked — overloading a single server degrades performance. If one server goes down, the load balancer redirects traffic to the remaining online servers. If a new server is added to the group, the load balancer automatically starts sending requests to it.
Functions of the Load Balancer:
- Distributes network load efficiently across different connected servers
- Ensures high availability/reliability by sending requests only to online servers
- Allows flexibility to add or remove servers according to the demand
Load Balancing Techniques
Round Robin Method
This is a simple and commonly-used load balancing algorithm. Client requests are distributed to the application servers in a simple rotation method. With three application servers, the first request is sent to the first server in the list, the second one to the second server, the third one to the third server, and so on.
This method is the most appropriate one for predictable user request streams that are spread across a server farm where members have approximately equal processing capabilities and available network bandwidth and storage.
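As a sketch, the rotation can be implemented with a simple cycle. The server names below are placeholders, not part of the original article:

```python
from itertools import cycle

# Hypothetical server names, for illustration only.
servers = ["server1", "server2", "server3"]

def round_robin(servers):
    """Yield servers in a simple, endless rotation."""
    return cycle(servers)

rotation = round_robin(servers)
# The first five requests land on servers 1, 2, 3, 1, 2 in turn.
assignments = [next(rotation) for _ in range(5)]
```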
Weighted Round Robin Method
This is similar to the round-robin method but adds the ability to spread the network load according to the relative capacity of each server. It is used when incoming user requests must be diverted across servers with different capabilities or available resources. The administrator assigns each server a weight indicating its relative traffic-handling capability, and requests are distributed in proportion to those weights.
If server #1 is twice as powerful as server #2 and server #3, the first one is provisioned with a higher weight and the other two are assigned the same, lower, weight. If there are 5 sequential client requests, the first two are directed to server #1, the third to server #2, the fourth to server #3 and the fifth would once again be diverted to server #1, and so on.
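One simple way to sketch this is to expand each server into the rotation schedule in proportion to its weight. The names and weights below are hypothetical, mirroring the 2:1:1 example above:

```python
def weighted_round_robin(weights):
    """Rotate through servers, repeating each one `weight` times per cycle.

    `weights` maps server name -> integer weight (insertion order is kept).
    """
    schedule = [server for server, w in weights.items() for _ in range(w)]
    i = 0
    while True:
        yield schedule[i % len(schedule)]
        i += 1

# Server #1 is twice as powerful as the other two.
weights = {"server1": 2, "server2": 1, "server3": 1}
rr = weighted_round_robin(weights)
# First five requests: server1, server1, server2, server3, server1.
first_five = [next(rr) for _ in range(5)]
```

Production balancers often use a smoother interleaving (e.g. a smooth weighted round robin) so that requests to the heavier server are not all sent back-to-back.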
Also read: “Service Discovery in Microservices systems”
Least Connection Method
This is a dynamic load-balancing algorithm in which client requests are distributed to the server with the fewest active connections at the instant the request is received. Even among servers with similar specifications, one server can become overloaded under simple rotation when connections are long-lived; the least connection method avoids this by taking the active connection load into account. It is suited to incoming requests with varying connection times, where the servers are relatively similar in processing power and resources.
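A minimal sketch, assuming the balancer tracks a count of active connections per server (names and counts below are illustrative):

```python
def least_connection(active_connections):
    """Pick the server with the fewest active connections right now."""
    return min(active_connections, key=active_connections.get)

# Snapshot of active connections at the instant a request arrives.
active_connections = {"app1": 4, "app2": 2, "app3": 7}
target = least_connection(active_connections)  # "app2"
```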
Weighted Least Connection Method
This method builds on the least connection method for cases where the application servers' characteristics differ. The administrator assigns a weight to each server based on its relative processing power and resources within the farm. Load-balancing decisions then depend on both active connections and the assigned server weights: for example, if two servers have the same lowest number of connections, the server with the higher weight is chosen.
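One common formulation (one of several in use) normalizes each server's active connections by its weight, so a higher-weight server tolerates proportionally more connections. The servers and numbers below are hypothetical:

```python
def weighted_least_connection(active, weights):
    """Pick the server with the lowest connections-per-weight ratio."""
    return min(active, key=lambda s: active[s] / weights[s])

active = {"app1": 4, "app2": 4, "app3": 10}
weights = {"app1": 1, "app2": 2, "app3": 3}
# Ratios: app1 -> 4.0, app2 -> 2.0, app3 -> 3.33..., so app2 is chosen.
target = weighted_least_connection(active, weights)
```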
Resource Based/Adaptive Method
This method makes decisions as per status indicators retrieved from the backend servers. The status indicator is determined by a customized program/agent that runs on each server. Each server is queried regularly for status information and the dynamic weight of the server is set appropriately.
Here, the load balancing method essentially performs a detailed health check on the server. This fits any situation where health information is required from each server to make decisions regarding load-balancing. This method is suitable for an application where the workload is not constant and detailed application performance determines the server's health.
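As a sketch of the idea, suppose the agent on each server reports CPU load as a fraction between 0.0 and 1.0; the balancer could derive a dynamic weight from it. The mapping formula and server names here are assumptions for illustration, not a standard:

```python
def dynamic_weight(cpu_load, max_weight=100):
    """Map an agent-reported CPU load (0.0-1.0) to a server weight.

    A lightly loaded server gets a high weight; a server is never
    weighted below 1, so it stays eligible for some traffic.
    """
    return max(1, round(max_weight * (1.0 - cpu_load)))

# Hypothetical agent reports: fraction of CPU in use per server.
reports = {"app1": 0.20, "app2": 0.85}
weights = {server: dynamic_weight(load) for server, load in reports.items()}
```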
Resource Based (SDN Adaptive) Method
SDN adaptive is a method that uses knowledge from Layers 2, 3, 4 and 7 and requires input from an SDN controller. This helps to make optimized traffic distribution decisions. Information about the server status, application (running on the servers) status, network infrastructure health, and congestion level, all play a part in the load-balancing decision. This is suited for deployments including an SDN controller.
Fixed Weighting Load Balancing Method
Here, the administrator assigns a weight to a server according to its relative traffic-handling capability. The server with the highest weight will get all the traffic. If that server fails, all the traffic is directed to the server with the next highest weight. This method suits the type of workloads when a single server can handle all incoming requests, and when you have one or more spare servers that can handle the load if the active server fails.
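A minimal sketch of this active/standby failover behavior, with hypothetical server names and health flags:

```python
def fixed_weight_target(weights, healthy):
    """Send all traffic to the highest-weight server that is still up."""
    candidates = [s for s in weights if healthy[s]]
    return max(candidates, key=weights.get)

weights = {"primary": 100, "standby": 50}
target_up = fixed_weight_target(weights, {"primary": True, "standby": True})
# If the primary fails, all traffic moves to the next-highest weight.
target_failover = fixed_weight_target(weights, {"primary": False, "standby": True})
```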
Weighted Response Time Method
In this method, the server’s response time is used to calculate a server's weight. The server that responds the fastest receives the next user request. This is good for scenarios where the application response time is very important.
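Assuming the balancer keeps a recent response-time measurement per server (the values below are made up), the selection itself is straightforward:

```python
def fastest_server(response_times_ms):
    """Route the next request to the server with the lowest response time."""
    return min(response_times_ms, key=response_times_ms.get)

# Hypothetical recent response-time measurements, in milliseconds.
response_times_ms = {"app1": 120.0, "app2": 45.0, "app3": 200.0}
target = fastest_server(response_times_ms)  # "app2"
```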
Source IP Hash Method
This method uses the source and destination IP addresses of the user request to generate a unique hash key, which is used to allocate the client to a specific server. Because the key can be regenerated from the same addresses if the session is broken, the request is directed back to the server it was using previously. This method is used when a client needs to return to the same server for every successive connection.
URL Hash Method
This method is similar to source IP hashing, except that the hash is computed from the URL in the client request, so requests for a specific URL are always sent to the same backend server.
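Both hash methods can be sketched the same way: hash a key (the client's source IP for one, the request URL for the other) and take it modulo the number of servers. A minimal, non-production sketch with hypothetical server names:

```python
import hashlib

def hash_pick(key, servers):
    """Map a key (source IP or URL) deterministically to one server."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

servers = ["app1", "app2", "app3"]
# The same source IP (or URL) always lands on the same server.
by_ip = hash_pick("203.0.113.7", servers)
by_url = hash_pick("/images/logo.png", servers)
```

Note that a plain modulo remaps most keys whenever the server list changes; production balancers often use consistent hashing for this reason.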
The choice of load balancing technique depends on your requirements.
How SayOne can Help You
At SayOne, our integrated teams of developers service our clients with web and mobile applications that are fully aligned with the future of the business or organization. We design, develop and implement applications using Agile and DevOps methodologies. Our system model focuses on programs that are resilient, fortified, and highly reliable.
The most widely used load-balancing algorithm is the round-robin method, thanks to its simplicity: requests are delivered to the servers in a simple rotation.
Load balancing’s ultimate goal is to distribute network traffic evenly and prevent the failures that can occur when a specific resource is overloaded. It helps improve the availability of applications, websites, databases, and other resources.