Just documenting docs, articles, and discussion related to gRPC and load balancing.
https://github.com/grpc/grpc/blob/master/doc/load-balancing.md
Seems gRPC prefers thin client-side load balancing, where a client gets a list of backend servers and a load-balancing policy from a "load balancer" and then performs client-side load balancing based on that information. However, this could still be combined with traditional load-balancing approaches in cloud deployments.
https://groups.google.com/forum/#!topic/grpc-io/8s7UHY_Q1po
gRPC "works" in AWS. That is, you can run gRPC services on EC2 nodes and have them connect to other nodes, and everything is fine. If you are using AWS for easy access to hardware, all is fine. What doesn't work is ELB (aka CLB) and ALB. Neither of these supports HTTP/2 (h2c) in the way that gRPC needs. ELBs work in TCP mode, but you give up useful health checking and the join-shortest-queue behaviour that makes normal HTTP-mode ELBs good. It also means you may experience problems with how well balanced your cluster is, since only individual client connections are balanced rather than individual requests to the backend. If a single client is generating a lot of requests, they will all go to the same backend rather than being balanced across your available instances. This also means that ECS doesn't really work properly, since it only supports the use of ELB and ALB load balancers. If your requirements are not too demanding, TCP-mode ELBs do work, and you can definitely ship stuff that way. It's just not ideal and has some fairly major problems as your request rates and general system complexity increase.
I use gRPC on AWS and it works great. However, I don't believe ALBs support trailers in the HTTP/2 spec, so that won't work. Something may have changed since the last time I looked, but don't count on an HTTP/2 ALB working. I believe it's HTTP/2 to clients of the ELB but HTTP/1.1 to your backend servers.
Alternatively, use ELB in TCP (Layer 4) mode but put your own HTTP/2-compliant proxy behind it (Envoy, nghttpx, Linkerd, Traefik, ...). I know Lyft does this in production with Envoy.
https://forums.aws.amazon.com/thread.jspa?messageID=749377
We're trying to get the Application Load Balancer cooperating with some ECS-hosted gRPC services. So far it's failing; poking at the server a bit, it looks like requests are arriving from the load balancer as HTTP/1.1, while the gRPC server expects HTTP/2. The info on the load balancer suggests it supports HTTP/2, but does that only apply to the client side?
Hi. Yes, the requests are sent from the load balancer to the targets as HTTP/1.1. For more information, see http://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-listeners.html#listener-configuration.
https://groups.google.com/forum/#!topic/grpc-io/rgJ7QyecPoY
We sort of have this situation, since we use Google App Engine, and its load balancer and URLFetch service only support HTTP/1.1. We used the PRPC implementation described here, which maps simple unary gRPC requests onto an HTTP/1.1 protocol: http://nodir.io/post/138899670556/prpc. We used the Go implementation from the Chrome tools repository and wrote our own client and server, which were relatively simple but absolutely do not support all of gRPC's features. The "better" approach might be to look at the grpc-web work, and possibly just run the grpcwebproxy. See: https://github.com/improbable-eng/grpc-web. I think that will also have the problem that if your clients aren't Go or JavaScript, you will need to implement the protocol yourself.
We normally recommend using a proxy that supports HTTP/2 to the backend, like nghttpx and derivatives (Envoy, Istio). If that's not possible, then the solutions tend to involve something that looks like grpc-web. If the proxy you are already using supports HTTP/1.1 trailers, it should be possible to use nghttpx to up-convert back to HTTP/2, but I've not tried that out.
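To make the proxy recommendation concrete, here is a minimal sketch of an Envoy (v3 API) config that terminates HTTP/2 from clients and, crucially, speaks HTTP/2 to the gRPC upstream so frames and trailers pass through intact. Ports, cluster names, and the backend hostname are placeholders, not from the posts above.

```yaml
# Sketch only: listener accepting gRPC, cluster dialing backends over h2c.
static_resources:
  listeners:
  - name: grpc_listener
    address:
      socket_address: { address: 0.0.0.0, port_value: 8080 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_grpc
          codec_type: AUTO
          route_config:
            virtual_hosts:
            - name: backend
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: grpc_backend }
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  clusters:
  - name: grpc_backend
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    # The key setting: force HTTP/2 on the upstream connection.
    typed_extension_protocol_options:
      envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
        "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
        explicit_http_config:
          http2_protocol_options: {}
    load_assignment:
      cluster_name: grpc_backend
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: grpc-backend.internal, port_value: 50051 }
```

With this in front of the backends, an ELB/NLB in TCP mode only needs to deliver bytes to Envoy; per-request balancing happens at the proxy.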
HTTP load-balancing on gRPC services
Using Envoy to Load Balance gRPC Traffic
gRPC Load Balancing with Nginx
gRPC Load Balancing using Kubernetes and Linkerd
gRPC Load Balancing on Kubernetes
gRPC Load Balancing inside Kubernetes
How To Create Load Balancer For GRPC On AWS
New – Application Load Balancer Support for End-to-End HTTP/2 and gRPC
Demo for enabling gRPC workloads with end to end HTTP/2 support
Why load balancing gRPC is tricky? - A blog post providing an overview of gRPC load balancing options.
I'd second that it works: I've verified it holds a long-duration gRPC bidi stream in an app I'm building for hours, and it's still going strong in my testing today. The key detail missing from this article is that the gRPC server side needs an enforcement policy that matches the client-side keepalive params. In my test setup, the keepalive time is 30 seconds on both server and client, the server side has an enforcement policy with a MinTime of 30 seconds, and both sides have PermitWithoutStream set to true. The project is in Go and is hosted behind an AWS NLB. Without the enforcement policy on the gRPC server side, the connections would drop somewhere between 8 and 15 minutes, which is where the Japanese article was stuck.
The latest ALB support for gRPC is great, but for a probably more complex/niche scenario, such as dynamic end-to-end mTLS, I'm at a loss as to how to achieve that through the new ALB gRPC features, since TLS would terminate there; the NLB approach should support it. I'm not even sure a setup with API Gateway would achieve a dynamic end-to-end mTLS scenario like the one I have with this gRPC client/server setup.
BTW this is a good gist @bojand.