At a high level, we need to understand two points:
- gRPC is built on HTTP/2, and HTTP/2 is designed to multiplex many requests over a single long-lived TCP connection (a sticky, persistent connection);
- To load balance gRPC, we therefore need to shift from connection-level balancing to request-level balancing.
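To make the difference concrete, here is a minimal sketch in plain Python (no gRPC involved) that compares the two strategies. The backend names, client names, and request counts are hypothetical and chosen only for illustration; each client is assumed to hold one long-lived connection and send all of its requests over it.

```python
from collections import Counter
from itertools import cycle

# Hypothetical backends and workload, for illustration only.
BACKENDS = ["backend-a", "backend-b", "backend-c"]
# Number of requests each client sends over its single long-lived connection.
CLIENT_REQUESTS = {"client-1": 900, "client-2": 50, "client-3": 50}


def connection_balancing(client_requests, backends):
    """Choose a backend once per connection; all its requests follow."""
    picker = cycle(backends)
    load = Counter({b: 0 for b in backends})
    for _client, n_requests in client_requests.items():
        backend = next(picker)       # decided only when the connection opens
        load[backend] += n_requests  # every multiplexed request sticks to it
    return dict(load)


def request_balancing(client_requests, backends):
    """Choose a backend for every individual request (round robin)."""
    picker = cycle(backends)
    load = Counter({b: 0 for b in backends})
    for _client, n_requests in client_requests.items():
        for _ in range(n_requests):
            load[next(picker)] += 1
    return dict(load)


if __name__ == "__main__":
    print("connection-level:", connection_balancing(CLIENT_REQUESTS, BACKENDS))
    print("request-level:   ", request_balancing(CLIENT_REQUESTS, BACKENDS))
```

With this skewed workload, connection-level balancing leaves one backend with 900 of the 1000 requests, while request-level balancing spreads them almost evenly (334/333/333). Real balancers typically use more elaborate policies than round robin, but the connection-versus-request distinction is the same.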
- Load Balancing gRPC services
- Official Doc: gRPC Load Balancing
- gRPC Load Balancing on Kubernetes without Tears (nice explanation on tradeoffs)
- Why load balancing gRPC is tricky? (very good article)
- Old-Official Doc: Load Balancing in gRPC
- Challenges of running gRPC services in production
- Microsoft Doc: Load balancing gRPC
- Load balancing gRPC service with Nginx
- gRPC load balancing — Service Meshes
- gRPC Load Balancing inside Kubernetes
- Proxyless gRPC load balancing in Kubernetes (using xDS API)
- gRPC Loadbalancing in GKE using Nginx Ingress Controller
- On gRPC Load Balancing
A little bit about HTTP/2
https://www.cncf.io/blog/2018/07/03/http-2-smarter-at-scale/
https://grpc.io/blog/grpc-on-http2/
https://www.lucidchart.com/techblog/2019/04/10/why-turning-on-http2-was-a-mistake/ - a bit superficial, without much detail