Load Balancing Basics
If you’ve done much work in Operations, you’ve probably encountered a load balancer. Whether it is a dedicated network device or a piece of software, it sits between clients and pools of servers, spreading the incoming traffic among them to achieve a greater scale than any one server could handle alone. Perhaps the most obvious use case is web servers. A popular website might get many millions of hits every day. There’s no way that one server, even a very expensive one, could stand up to that. Instead, many inexpensive servers are placed behind the load balancer and the requests are spread evenly among them. In a well-written web application, any server can handle any request, so this process is transparent to the user. They simply browse your site as they normally would, with no hint that each page they view might be returned by a different server.
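To make "spread evenly" concrete, here is a minimal sketch of round-robin selection, one common way to distribute requests. The server names and the make_round_robin helper are made up for illustration; real load balancers offer this and several other algorithms.

```python
from itertools import cycle


def make_round_robin(servers):
    """Return a function that hands out the next backend on each call."""
    if not servers:
        raise ValueError("pool must contain at least one server")
    rotation = cycle(list(servers))

    def next_server():
        return next(rotation)

    return next_server


# Hypothetical pool of three web servers.
pick = make_round_robin(["web1", "web2", "web3"])

# Six incoming requests are assigned to the servers in turn.
assignments = [pick() for _ in range(6)]
print(assignments)
```

Each server ends up with the same share of requests, which is why this only works well when any server can handle any request.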
There are other benefits, too. Hardware fails, software has bugs, and human operators make mistakes. These are facts of life in Ops, but load balancers can help. If you “overbuild” your pool with extra servers, your service can survive losing several machines with no impact on the user. Likewise, you could take servers down one at a time for security patching or upgrades. Or you could deploy a new build of your application to only 5% of your servers as a smoke test or “canary” for catastrophic failures before rolling it out site-wide.
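As a rough sketch of how a pool survives a lost machine, the class below skips servers that have been marked down. It is an illustration under simple assumptions, not how any particular product works: the Pool class and its method names are invented here, and a real load balancer would mark servers up or down based on periodic health checks rather than manual calls.

```python
class Pool:
    """Round-robin pool that skips servers marked as unhealthy."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._index = 0

    def mark_down(self, server):
        # Called when a server fails or is pulled for maintenance.
        self.healthy.discard(server)

    def mark_up(self, server):
        # Called when a known server is ready to take traffic again.
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        # Look at each server at most once per call, skipping unhealthy ones.
        for _ in range(len(self.servers)):
            candidate = self.servers[self._index]
            self._index = (self._index + 1) % len(self.servers)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers in pool")


pool = Pool(["web1", "web2", "web3"])
pool.mark_down("web2")  # simulate a failure or a patching window

served = [pool.next_server() for _ in range(4)]
print(served)
```

Requests keep flowing to the remaining servers, and bringing web2 back is a single mark_up call once it is patched.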
If your app needs 5 web servers to handle your peak workload, and you have 6 in the pool, you have one server’s worth of headroom for failure. This is known as “N + 1” redundancy, and is the bare minimum you should strive for when supporting any production service. Whether you want even more spare capacity depends on the marginal cost of each additional server vs. the expense of an outage. In the age of virtual machines, these extra boxes may be very cheap indeed.
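The arithmetic is simple enough to sketch. The function name, the request rates, and the per-server capacity below are assumptions chosen to reproduce the 5 + 1 example; measure your own numbers before sizing a real pool.

```python
import math


def servers_needed(peak_load, per_server_capacity, spares=1):
    """Servers required for peak load (N) plus spare machines."""
    base = math.ceil(peak_load / per_server_capacity)
    return base + spares


# Hypothetical figures: 5,000 requests/sec at peak, 1,000 per server.
n_plus_1 = servers_needed(5000, 1000)            # N + 1
n_plus_2 = servers_needed(5000, 1000, spares=2)  # N + 2

print(n_plus_1, n_plus_2)
```

Raising spares is the knob to turn when an outage costs more than an extra server does.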
There are many options available for load balancing, both hardware and software. On the hardware side, some popular (and often extremely expensive) names are F5 BIG-IP, Citrix NetScaler, and Coyote Point. In software, the best known is probably HAProxy, although nginx and Apache have some limited load balancing features, too. And if you’re a cloud native, Amazon’s Elastic Load Balancing (ELB) service is waiting for you.
Load Balancing Internal Services
Load balancing public services is important. However, there are likely many internal services that are equally crucial to your app’s uptime. These are sometimes overlooked. I certainly didn’t think of them as candidates for load balancing at first. But to your users, an outage is an outage. It doesn’t matter whether it was because of a failure on a public web server or an internal DNS server. They needed you, and you were down.
Some examples of services you might load balance are DNS, SMTP for email, Elasticsearch queries, and database reads. These might be able to run on a single machine from a sheer horsepower perspective, but load balancing them still gives you the advantages of redundancy, guarding against failure and allowing for maintenance.
You might even apply these techniques to your company’s internal or enterprise IT systems. If employees need to authenticate against an LDAP directory to do their jobs, it would be wise to load balance several servers to ensure business doesn’t grind to a halt with one failed hard drive.
Load balancing is a powerful tool for improving the performance, resiliency and operability of your services. It’s used as a matter of course on public systems, but give a thought to what it can do for your lower-profile ones, too.
That’s not to say that it’s a cure-all. Some services just aren’t suited to it, such as database writes (without special software designed for multiple masters) or batch jobs that pull their work from a central queue. Other applications might not be “stateless” and may misbehave if the user is routed to a different server on each request. As always, use the right tool for the job!
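For applications that are not stateless, one common workaround is session affinity, or “sticky” routing, where the same client always lands on the same server. Here is a minimal sketch that hashes a client identifier. The sticky_server function and the client and server names are hypothetical, and real load balancers typically use cookies or source IP addresses for this.

```python
import hashlib


def sticky_server(client_id, servers):
    """Map a client to one server, the same one every time."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    # Turn the first 8 bytes of the hash into an index into the pool.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["web1", "web2", "web3"]

first = sticky_server("alice-session-123", servers)
second = sticky_server("alice-session-123", servers)
print(first == second)  # the same client maps to the same server
```

The trade-off is that adding or removing a server changes the mapping for many clients, so stickiness softens the problem rather than removing it.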