At a high level, we need to understand two points:
- gRPC is built on HTTP/2, and HTTP/2 is designed to multiplex many requests over a single long-lived TCP connection (a sticky, persistent connection);
- To load-balance gRPC, we therefore need to shift from balancing connections to balancing individual requests.
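To make the difference between the two points concrete, here is a minimal simulation sketch. It does not use gRPC itself; the backend names, the traffic numbers, and both function names are hypothetical, and simple round-robin is assumed as the picking policy in both cases. Each client is modelled as one long-lived connection that carries a given number of requests.

```python
from collections import Counter
from itertools import cycle

# Hypothetical backend names, used only for illustration.
BACKENDS = ["backend-a", "backend-b", "backend-c"]


def connection_balanced(requests_per_client):
    """Pick a backend once per connection (round-robin).

    Every request sent on that connection lands on the same backend,
    which is what a connection-level (L4) balancer does.
    """
    picker = cycle(BACKENDS)
    load = Counter({backend: 0 for backend in BACKENDS})
    for n_requests in requests_per_client:
        backend = next(picker)  # chosen once, at connect time
        load[backend] += n_requests
    return dict(load)


def request_balanced(requests_per_client):
    """Pick a backend for every individual request (round-robin)."""
    picker = cycle(BACKENDS)
    load = Counter({backend: 0 for backend in BACKENDS})
    for n_requests in requests_per_client:
        for _ in range(n_requests):
            load[next(picker)] += 1  # chosen per request
    return dict(load)


if __name__ == "__main__":
    # One chatty client and two quiet ones, each holding a single connection.
    traffic = [1000, 10, 10]
    print(connection_balanced(traffic))
    # {'backend-a': 1000, 'backend-b': 10, 'backend-c': 10}
    print(request_balanced(traffic))
    # {'backend-a': 340, 'backend-b': 340, 'backend-c': 340}
```

With connection balancing, the chatty client's sticky connection pins all of its requests to one backend, so the load is badly skewed even though connections are spread evenly. With request balancing, the same traffic is spread evenly across all backends.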