Akhil Sundar · March 30, 2023 · 7 min read
A load balancer is software or a hardware device that keeps traffic evenly distributed among a group of servers, preventing any single server from becoming overloaded.
Traffic is distributed among the servers according to a predefined set of rules or logic, and this set of rules constitutes the load-balancing algorithm.
There are two broad types of load-balancing algorithms. Dynamic algorithms consider the current state of each server before the load is distributed. Static algorithms distribute traffic without making any such adjustments: some send equally divided traffic to every server in a group, while others send it in a specified order or at random.
Load balancing refers to the efficient division of incoming network traffic across a group of backend servers. This group is also called a server farm or server pool.
Today's high-traffic websites serve thousands or even millions of simultaneous user requests, returning the correct text, images, video, or application data quickly and reliably. To meet these volumes by scaling up cost-effectively, best practice is to add more servers.
Read our Blog “Load Balancing Microservices Architecture Performance”.
A load balancer works like a "traffic cop" sitting in front of the servers: it routes client requests across all servers capable of fulfilling them, using capacity efficiently so that no single server is overworked, since overloading one server degrades performance. If a server goes down, the load balancer redirects its traffic to the remaining online servers, and when a new server is added to the group, the load balancer automatically starts sending requests to it.
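The behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design; the pool API and server names are invented for the example:

```python
class ServerPool:
    """Minimal sketch of a load balancer front-end (illustrative API).

    Healthy servers receive requests in rotation; servers marked offline
    are skipped, and newly added servers join the rotation automatically.
    """

    def __init__(self, servers):
        self.servers = list(servers)  # backend identifiers
        self.down = set()             # servers currently marked offline
        self._idx = 0

    def add_server(self, server):
        self.servers.append(server)

    def mark_down(self, server):
        self.down.add(server)

    def mark_up(self, server):
        self.down.discard(server)

    def route(self):
        """Return the next healthy server, skipping offline ones."""
        for _ in range(len(self.servers)):
            server = self.servers[self._idx % len(self.servers)]
            self._idx += 1
            if server not in self.down:
                return server
        raise RuntimeError("no healthy servers available")
```

For example, after `pool.mark_down("a")`, subsequent `pool.route()` calls only return the servers that remain online.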
Round robin is a simple and commonly used load-balancing algorithm. Client requests are distributed to the application servers in simple rotation: with three application servers, the first request is sent to the first server in the list, the second to the second server, the third to the third server, and so on.
This method is most appropriate for predictable streams of user requests spread across a server farm whose members have approximately equal processing capability, network bandwidth, and storage.
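The rotation described above amounts to cycling through the server list endlessly; a minimal sketch in Python (server names are illustrative):

```python
from itertools import cycle

# Hypothetical backend names for illustration.
servers = ["server1", "server2", "server3"]
rotation = cycle(servers)  # endless round-robin iterator

def next_server():
    """Each call hands the next request to the next server in rotation."""
    return next(rotation)
```

Four successive calls yield server1, server2, server3, and then server1 again.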
Weighted round robin is similar to the round-robin method but adds the ability to spread the network load according to the relative capacity of each server. It is used when incoming user requests are diverted across servers with differing capabilities or available resources. The administrator assigns each server a specific weight indicating its relative traffic-handling capability, and servers are chosen accordingly.
If server #1 is twice as powerful as server #2 and server #3, the first one is provisioned with a higher weight and the other two are assigned the same, lower, weight. If there are 5 sequential client requests, the first two are directed to server #1, the third to server #2, the fourth to server #3 and the fifth would once again be diverted to server #1, and so on.
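One simple (if naive) way to realize the example above is to repeat each server in the rotation proportionally to its weight. The weights below are illustrative, matching the twice-as-powerful server #1 from the article:

```python
from itertools import cycle

# Server #1 is twice as powerful, so it gets weight 2 (illustrative values).
weights = {"server1": 2, "server2": 1, "server3": 1}

# Expand each server into the schedule as many times as its weight,
# then rotate through the expanded schedule endlessly.
schedule = [server for server, w in weights.items() for _ in range(w)]
rotation = cycle(schedule)

def next_server():
    return next(rotation)
```

Five successive requests go to server1, server1, server2, server3, and server1 again, exactly as in the example.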
Also read: “Service Discovery in Microservices systems”
Least connections is a dynamic load-balancing algorithm in which each client request is sent to the server with the fewest active connections at the instant the request is received. Unlike round robin, under which a server can become overloaded when connections are long-lived, this method takes the active connection load into account. It is suited to incoming requests with varying connection times, where the servers are relatively similar in processing power and resources.
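The selection rule reduces to a minimum over the current connection counts. A minimal sketch, assuming the balancer tracks active connections per server (the counts below are illustrative):

```python
# Hypothetical snapshot of active connections per server.
active = {"server1": 12, "server2": 5, "server3": 9}

def least_connections(active):
    """Pick the server with the fewest active connections right now."""
    return min(active, key=active.get)
```

With the snapshot above, the next request goes to server2.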
Weighted least connections builds on the least connections method for farms where the application servers' characteristics differ. The administrator assigns each server a weight based on its relative processing power and resources. Load-balancing decisions then depend on both the active connections and the assigned weights: for example, if two servers have the same, lowest number of active connections, the one with the higher weight is chosen.
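One common formulation (an assumption here, not stated in the article) is to minimize active connections divided by weight, which naturally breaks a tie on connections in favor of the heavier server:

```python
# Illustrative state: active connections and administrator-assigned weights.
active = {"server1": 4, "server2": 4, "server3": 9}
weights = {"server1": 3, "server2": 1, "server3": 1}

def weighted_least_connections(active, weights):
    """Minimize connections-per-weight, so with equal connection counts
    the server with the higher weight is chosen."""
    return min(active, key=lambda s: active[s] / weights[s])
```

Here server1 and server2 tie at 4 connections, but server1's higher weight gives it the smaller ratio, so it receives the request.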
Agent-based adaptive load balancing makes decisions according to status indicators retrieved from the backend servers. The status indicator is determined by a customized program or agent running on each server; each server is queried regularly for status information, and its dynamic weight is set accordingly.
Here, the load-balancing method essentially performs a detailed health check on each server. It fits any situation where per-server health information is needed to make load-balancing decisions, and suits applications where the workload fluctuates and detailed application performance determines a server's health.
SDN adaptive is a method that combines knowledge from Layers 2, 3, 4, and 7 with input from an SDN (software-defined networking) controller to make optimized traffic-distribution decisions. Information about server status, the status of the applications running on the servers, the health of the network infrastructure, and the level of congestion all play a part in the load-balancing decision. This method is suited to deployments that include an SDN controller.
In the fixed weighting method, the administrator assigns each server a weight according to its relative traffic-handling capability. The server with the highest weight receives all the traffic; if that server fails, all traffic is directed to the server with the next-highest weight. This method suits workloads where a single server can handle all incoming requests and one or more spare servers stand ready to take over if the active server fails.
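The active/standby behavior can be sketched as picking the highest-weight server among those currently healthy (server names and weights are illustrative):

```python
# Illustrative weights: the primary has the highest weight, standbys lower.
weights = {"primary": 100, "standby1": 50, "standby2": 10}

def fixed_weighting(weights, healthy):
    """All traffic goes to the highest-weight healthy server; if it fails,
    the server with the next-highest weight takes over."""
    candidates = [s for s in weights if s in healthy]
    if not candidates:
        raise RuntimeError("no healthy servers available")
    return max(candidates, key=weights.get)
```

While the primary is healthy it receives everything; once it drops out of the healthy set, standby1 takes over.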
In the weighted response time method, each server's response time is used to calculate its weight: the server that responds fastest receives the next user request. This is a good fit for scenarios where application response time is paramount.
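Assuming the balancer records the latest measured response time per server (for instance from health-check probes), the selection is simply a minimum over those measurements. The figures below are made up for illustration:

```python
# Illustrative probe results: last measured response time per server (ms).
response_ms = {"server1": 42.0, "server2": 17.5, "server3": 63.2}

def fastest_server(response_ms):
    """Send the next request to the server that responded fastest."""
    return min(response_ms, key=response_ms.get)
```

With the figures above, server2 receives the next request.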
The source IP hash method uses the source and destination IP addresses of the user request to generate a unique hash key, which allocates the client to a specific server. Because the key can be regenerated if the session breaks, the reconnecting client is directed to the same server it was using previously. This method is used when a client must return to the same server for every successive connection.
URL hashing is similar to source IP hashing, except that the hash is computed from the URL in the client request, so requests for a specific URL are always sent to the same backend server.
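Both hashing variants reduce to the same mechanism: derive a deterministic key, hash it, and map the hash onto a server index, so the same key always lands on the same backend. A minimal sketch (server names and the key formats are illustrative):

```python
import hashlib

servers = ["server1", "server2", "server3"]  # illustrative pool

def pick_by_hash(key, servers):
    """Hash the key and map it onto a server index deterministically."""
    digest = hashlib.sha256(key.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]

def source_ip_hash(src_ip, dst_ip, servers):
    # Key built from the source and destination IP addresses.
    return pick_by_hash(f"{src_ip}->{dst_ip}", servers)

def url_hash(url, servers):
    # Key built from the request URL instead.
    return pick_by_hash(url, servers)
```

Repeated calls with the same IP pair (or the same URL) always return the same server, which is what gives these methods their stickiness.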
The choice of load balancing technique depends on your requirements.
At SayOne, our integrated teams of developers build web and mobile applications that are fully aligned with the future of your business or organization. We design, develop, and implement applications using Agile and DevOps methodologies, focusing on programs that are resilient, fortified, and highly reliable.