A survey of techniques for scaling busyish websites for non-experts.
- Shane Hansen (@shanemhansen)
- Walmartlabs Edge Engineering
- Performance connoisseur
- Engineers need to understand options to make good choices.
- Load balancing is rapidly changing and adapting to evolving protocols
- It’s interesting
- Distributing requests over multiple endpoints to increase the reliability or performance of a service.
- Inline load balancers filter and transform traffic, so they often quickly grow features around
- observability
- routing
- OOB (out-of-band) load balancers make a routing decision and then leave the request flow
- Some companies do these things
- Nobody would learn anything if I just said “use Dyn”, “use Cloudflare”, or “use ELB”
- This is utah go, not utah people-who-write-checks-to-vendors
- Links to Go libraries are collected at the end
Making a website scale, step by step.
Let’s say that one day you build a website. What if it starts making money or getting popular?
digraph G {
Browser->WebServer [label=" <-- SPOF"];
}
digraph G {
Browser->Nginx [label="<--SPOF"];
Nginx->WebServer1;
Nginx->WebServer2;
WebServer1;
}
If your web app is truly stateless, round robin is a solid choice that’s hard to mess up.
If it’s not stateless, make it stateless.
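A minimal sketch of round-robin selection, assuming a fixed list of backends. The `RoundRobin` type and the backend addresses are made up for illustration; they are not taken from nginx or any library.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// RoundRobin hands out backends in a fixed rotation.
// It is a hypothetical helper for this sketch, not a library type.
type RoundRobin struct {
	next     uint64 // kept first so 64-bit atomics stay aligned on 32-bit platforms
	backends []string
}

// NewRoundRobin builds a picker over a non-empty list of backends.
func NewRoundRobin(backends ...string) *RoundRobin {
	return &RoundRobin{backends: backends}
}

// Pick returns the next backend in the rotation. It is safe for concurrent use.
func (r *RoundRobin) Pick() string {
	n := atomic.AddUint64(&r.next, 1) - 1
	return r.backends[n%uint64(len(r.backends))]
}

func main() {
	// Placeholder addresses standing in for WebServer1 and WebServer2.
	rr := NewRoundRobin("10.0.0.1:8080", "10.0.0.2:8080")
	for i := 0; i < 4; i++ {
		fmt.Println(rr.Pick())
	}
}
```

In a real proxy, something like `Pick` would probably be called once per request, for example from the `Director` of an `httputil.ReverseProxy`. Because no per-client state is kept, this only works well when the app really is stateless.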
Two major types of health check: inline and out-of-band (OOB)
- OOB: app-specific health endpoint hit on an interval
- Inline: utilize observed info from real requests to select backends
Benefits and drawbacks to both.
- Typically HTTP request to endpoint with 2xx status code
- Extended validation of body supported
- Endpoint can return a health “number” for advanced algos
- Timeouts, configured retry policies
Pros:
- Rolling app upgrades possible
- Capacity can be easily increased
Cons:
- Limited by the performance of the software load balancer (ballpark: tens of thousands of requests/s)
- Your site is still bottlenecked by the uptime of a single VM.
What to do when your company is relying on a single nginx instance.
Pros:
- DNS infrastructure is quite scalable
- Possible to scale out and upgrade load balancers
Cons:
- DNS resolvers are quirky, hard to guarantee distribution
- Broken software with bad caches
- Quirky protocol: hard to do priorities, weighting, etc.
digraph G {
browser->DNS [label="recursive resolver"];
browser->nginxa;
nginxa->DNS;
nginxb->DNS;
nginxa->appa;
nginxa->appb;
nginxb->appa;
nginxb->appb;
}
Strategies
- Can be implemented in hardware
- Can be implemented in kernel
- Order of magnitude more throughput in appliances that do so
- Can be implemented by pushing state/routing to the edge
digraph G {
router1 [label="> 10 Gbps router"];
router2 [label="> 10 Gbps router"];
router1->lb1;
router1->lb2;
router2->lb1;
router2->lb2 [label="H(src-addr,dst-addr,proto)%N"];
}
- Extremely fast, can be implemented in hardware
- persistence: flow always goes to same node
- Avoids hot spots due to the “NATed university effect” (many clients behind one source address)
- Implementation of n-tuple hashing
- Provide multiple routes to destination.
- Deterministic route selection
- Optimize by only using lb to select backend, not to proxy.
digraph G {
browser->lb1;
lb1->app1;
app1->browser;
}
- Examples so far mostly spray traffic everywhere
- How to build truly global systems
- Border Gateway Protocol
- Allows autonomous systems to advertise the IP address space they can route
digraph G {
Browser->routera [label="send packet to x.x.x.x"];
routerb->routera [label="I can route packets for x.x.x.x"];
routera->routerb [label="here's a packet"];
subgraph cluster_0 {
label="POP";
bgpd->routerb [label="I can route packets for x.x.x.x"];
routerb->server [label="here's a packet"];
}
}
- Allow multiple POPs to advertise the same IP space
- Routers choose “best route” where best is a function of:
- network hops
- whether or not packet will leave ISP’s network
- stuff
digraph G {
subgraph cluster_0 {
label="POPA"
bgpda;
appa;
}
subgraph cluster_1 {
label="POPB"
bgpdb;
appb;
}
bgpda->EdgeRouterA [label="gimme x.x.x.0/24"];
bgpdb->EdgeRouterB [label="gimme x.x.x.0/24"];
EdgeRouterA->appa [label="Send to x.x.x.x"];
Browser->RouterC [label="Send to x.x.x.x"];
RouterC->EdgeRouterA [label="Send to x.x.x.x"];
RouterC->RouterD;
RouterD->EdgeRouterB;
EdgeRouterA->RouterC;
}
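As a toy model only, the “best route” choice can be sketched as a comparison over the two factors listed above. The `Route` type and the ordering (stay on the ISP’s network first, then fewest hops) are assumptions made for illustration. Real BGP route selection has many more steps and is driven by operator policy.

```go
package main

import "fmt"

// Route is one advertised path to the anycast prefix (hypothetical type).
type Route struct {
	Via   string // next-hop router name
	Hops  int    // network hops to the destination
	OnNet bool   // true if the packet never leaves the local ISP's network
}

// better reports whether a should be preferred over b in this toy model.
func better(a, b Route) bool {
	if a.OnNet != b.OnNet {
		return a.OnNet // assumed preference: stay on-net when possible
	}
	return a.Hops < b.Hops
}

// Best returns the preferred route, or false if there are none.
func Best(routes []Route) (Route, bool) {
	if len(routes) == 0 {
		return Route{}, false
	}
	best := routes[0]
	for _, r := range routes[1:] {
		if better(r, best) {
			best = r
		}
	}
	return best, true
}

func main() {
	// RouterC's view in the diagram: POPA is one hop away, POPB is two (via RouterD).
	routes := []Route{
		{Via: "RouterD", Hops: 2},
		{Via: "EdgeRouterA", Hops: 1},
	}
	best, _ := Best(routes)
	fmt.Println("RouterC forwards via", best.Via)
}
```

Every router makes this choice independently, which is why two clients sending to the same address can end up in different POPs.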
- Attempt to direct clients to closest physical server
- May not follow network topology
- Relies on datasets of questionable accuracy
- First hop latency unless anycasting DNS
- Benefit: DNS is safest proto to anycast
- Good idea if you want unicast TCP for long lived connections
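A hedged sketch of the “closest physical server” step, assuming the client’s coordinates have already been looked up from a GeoIP dataset (the part with questionable accuracy). The `POP` type, the `Closest` function, and the POP names and coordinates are made up for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// POP is a point of presence with approximate coordinates in degrees (hypothetical type).
type POP struct {
	Name     string
	Lat, Lon float64
}

// haversineKm returns the great-circle distance between two points in kilometers.
func haversineKm(lat1, lon1, lat2, lon2 float64) float64 {
	const earthRadiusKm = 6371.0
	rad := func(deg float64) float64 { return deg * math.Pi / 180 }
	dLat := rad(lat2 - lat1)
	dLon := rad(lon2 - lon1)
	a := math.Sin(dLat/2)*math.Sin(dLat/2) +
		math.Cos(rad(lat1))*math.Cos(rad(lat2))*math.Sin(dLon/2)*math.Sin(dLon/2)
	return 2 * earthRadiusKm * math.Asin(math.Sqrt(a))
}

// Closest returns the POP physically nearest to the client, or false if pops is empty.
func Closest(pops []POP, lat, lon float64) (POP, bool) {
	if len(pops) == 0 {
		return POP{}, false
	}
	best, bestDist := pops[0], math.Inf(1)
	for _, p := range pops {
		if d := haversineKm(lat, lon, p.Lat, p.Lon); d < bestDist {
			best, bestDist = p, d
		}
	}
	return best, true
}

func main() {
	pops := []POP{
		{Name: "slc", Lat: 40.76, Lon: -111.89},
		{Name: "ams", Lat: 52.37, Lon: 4.90},
	}
	// Client coordinates would come from a GeoIP lookup; these are for Denver.
	p, _ := Closest(pops, 39.74, -104.99)
	fmt.Println("answer with address of", p.Name)
}
```

Physical distance is only a proxy for latency, so the nearest POP on the map may still be several networks away.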
Example setup for multi-pop anycast
digraph G {
label="Globally available services";
Browser;
DNS;
subgraph cluster_0 {
label="POPA";
bgpa [label="BGPD"];
popaicircuit1 [label="Internet link 1"];
popaicircuit2 [label="Internet link 2"];
iploadbalancera1 [label="IP load balancer 1"];
iploadbalancera2 [label="IP load balancer 2"];
popaicircuit1->iploadbalancera1;
popaicircuit1->iploadbalancera2;
popaicircuit2->iploadbalancera1;
popaicircuit2->iploadbalancera2;
slba1 [label="SLB"];
slba2 [label="SLB"];
slba3 [label="SLB"];
slba4 [label="SLB"];
iploadbalancera1->slba1;
iploadbalancera1->slba2;
iploadbalancera1->slba3;
iploadbalancera1->slba4;
iploadbalancera2->slba1;
iploadbalancera2->slba2;
iploadbalancera2->slba3;
iploadbalancera2->slba4;
slba1->app1;
slba1->app2;
slba1->app3;
slba1->app4;
slba2->app1;
slba2->app2;
slba2->app3;
slba2->app4;
}
Browser->DNS [label="returns anycast"];
Browser->Router;
bgpa->Router;
Router->popaicircuit1;
}
- zero copy resource passing (technically one packet is copied)
- Approach used internally at Walmartlabs and Cloudflare (based on a GopherCon talk)
- HTTP/1.1 and TLS/SNI
- Assumes shared kernel
title Direct Container Return
activate Browser
Browser -> LoadBalancer: connect()/accept()
activate LoadBalancer
LoadBalancer -> Container: sendmsg()/recvmsg()
activate Container
deactivate LoadBalancer
Browser <- Container: direct
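The descriptor hand-off in the diagram can be sketched with `SCM_RIGHTS` ancillary data over a Unix socket, which only works when both processes share a kernel. This is an illustrative sketch for Unix-like systems, not the Walmartlabs or Cloudflare implementation. `sendFD`, `recvFD`, and `passThroughSocketpair` are hypothetical names, and a pipe stands in for the accepted browser connection.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"syscall"
)

// sendFD passes fd to the peer of sock as SCM_RIGHTS ancillary data.
func sendFD(sock, fd int) error {
	// One byte of ordinary payload accompanies the control message.
	return syscall.Sendmsg(sock, []byte{0}, syscall.UnixRights(fd), nil, 0)
}

// recvFD receives a single descriptor sent by sendFD.
func recvFD(sock int) (int, error) {
	buf := make([]byte, 1)
	oob := make([]byte, syscall.CmsgSpace(4)) // room for one 32-bit descriptor
	_, oobn, _, _, err := syscall.Recvmsg(sock, buf, oob, 0)
	if err != nil {
		return -1, err
	}
	msgs, err := syscall.ParseSocketControlMessage(oob[:oobn])
	if err != nil {
		return -1, err
	}
	if len(msgs) != 1 {
		return -1, fmt.Errorf("expected 1 control message, got %d", len(msgs))
	}
	fds, err := syscall.ParseUnixRights(&msgs[0])
	if err != nil {
		return -1, err
	}
	if len(fds) != 1 {
		return -1, fmt.Errorf("expected 1 descriptor, got %d", len(fds))
	}
	return fds[0], nil
}

// passThroughSocketpair plays both roles in one process: the "load balancer"
// end sends a descriptor, the "container" end receives it and reads from it.
func passThroughSocketpair() (string, error) {
	pair, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return "", err
	}
	defer syscall.Close(pair[0])
	defer syscall.Close(pair[1])

	r, w, err := os.Pipe() // stand-in for an accepted TCP connection
	if err != nil {
		return "", err
	}
	defer r.Close()
	defer w.Close()
	if _, err := w.WriteString("hello"); err != nil {
		return "", err
	}

	if err := sendFD(pair[0], int(r.Fd())); err != nil {
		return "", err
	}
	fd, err := recvFD(pair[1])
	if err != nil {
		return "", err
	}
	passed := os.NewFile(uintptr(fd), "passed")
	defer passed.Close()

	data := make([]byte, 5)
	if _, err := io.ReadFull(passed, data); err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	got, err := passThroughSocketpair()
	fmt.Println(got, err)
}
```

After the hand-off, the receiver reads and writes the connection directly, so the load balancer can close its copy and drop out of the data path.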
- Utilize Edge capabilities internally via sidecar
- service discovery
- yada yada
- What comes next?
- Protocol optimizations: upgrade to HTTP/2? QUIC?
- Eliminate moving parts
- Turn logical microservices into functions at runtime where possible
- Use REST concepts to facilitate pervasive caching.
- Make networked services the fallback, calling local functions is the optimization.
- Seesaw (LVS): https://github.com/google/seesaw
- metallb (BGP): https://github.com/google/metallb
- BGP: https://osrg.github.io/gobgp/
- DNS: https://github.com/miekg/dns
- GoBPF: https://github.com/iovisor/gobpf
- Cilium: https://github.com/cilium/cilium
- OSS repo for fd passing load balancer: TBD