@drewr
Last active April 9, 2026 18:08
Datum Network Services Operator: Accessible vs Inaccessible Envoy Capabilities

Datum Network Services Operator: Envoy Capabilities

Accessible Capabilities

L7 HTTP/HTTPS Routing

  • Listeners on ports 80 (HTTP) and 443 (HTTPS) with TLS termination
  • Request matching — method, path (prefix/exact), headers, query params, scheme (up to 128 matches across 16 rules)
  • Request/response header modification — add, set, remove headers
  • URL rewriting — hostname and path rewriting (prefix/exact)
  • Redirects — scheme, hostname, port, path, and status code
  • Direct responses — e.g., 503 for offline connectors

Backend Configuration

  • HTTP/HTTPS backends via FQDN or IP address (IPv4/IPv6)
  • Backend TLS validation with hostname-based certificate verification
  • Automatic hostname rewriting for upstream endpoints
  • EndpointSlice-based backend discovery

Connector Tunneling (Envoy CONNECT)

  • HTTP CONNECT method tunneling to connector backends
  • Path-aware extended CONNECT routes (prefix/exact matching)
  • Tunnel metadata passthrough (address, endpoint ID)
  • Internal listener patching via EnvoyPatchPolicy JSONPatch

Domain & TLS Management

  • Automatic domain ownership verification
  • cert-manager integration for ACME certificate provisioning
  • DNS zone integration for Datum-managed DNS
  • Per-hostname status tracking (verified, DNS programmed, cert ready)

Conformance-tested Features

  • H2C and WebSocket backend protocols
  • Request timeout (via Gateway API, not HTTPProxy directly)

Streaming Support

HTTP Streaming (SSE, Chunked Responses)

Works implicitly through Envoy's default behavior — nothing in the operator blocks chunked transfer encoding or Server-Sent Events. Envoy streams response bodies by default. The practical constraint is timeout configuration, tunable via BackendTrafficPolicy:

| Timeout | Default | Max |
|---|---|---|
| Request timeout | | 1h |
| Connection idle timeout | | 1h |
| Connection duration | | 1h |
| TCP connect timeout | | 10s |

HTTP streaming works up to those limits. Users can tune timeouts within these caps but cannot exceed them. For truly long-lived streams, the 1h ceiling is a hard constraint.
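To make the effect of a ceiling concrete, here is a minimal Go sketch of a cap check. It is an illustration under stated assumptions, not the operator's validation code: the helper name `checkTimeout` and the constant `maxRequestTimeout` are hypothetical, and the 1h value is taken from the request timeout ceiling described above.

```go
package main

import (
	"fmt"
	"time"
)

// maxRequestTimeout is the 1h request timeout ceiling described in the text
// (hypothetical constant name, used only for this sketch).
const maxRequestTimeout = time.Hour

// checkTimeout is a hypothetical helper. It parses a user-supplied duration
// string and rejects values that are malformed, non-positive, or above the ceiling.
func checkTimeout(raw string, ceiling time.Duration) (time.Duration, error) {
	d, err := time.ParseDuration(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid duration %q: %w", raw, err)
	}
	if d <= 0 {
		return 0, fmt.Errorf("timeout must be positive, got %s", d)
	}
	if d > ceiling {
		return 0, fmt.Errorf("timeout %s exceeds maximum %s", d, ceiling)
	}
	return d, nil
}

func main() {
	for _, raw := range []string{"30s", "1h", "2h"} {
		if d, err := checkTimeout(raw, maxRequestTimeout); err != nil {
			fmt.Println("rejected:", err)
		} else {
			fmt.Println("accepted:", d)
		}
	}
}
```

A stream that needs to outlive the ceiling would be rejected at this step, which is why the limit matters for long-lived connections.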

WebSocket

Declared as supported in conformance features (SupportHTTPRouteBackendProtocolWebSocket), but the conformance test HTTPRouteBackendProtocolWebSocket is skipped with the note: "Need to find a way to intercept websocket dialing." This is a test infrastructure gap, not necessarily a functional one — Envoy Gateway handles WebSocket upgrades natively.

The BackendUpgrade conformance test is also skipped ("Retries not permitted"), since retry is forbidden in BackendTrafficPolicy. This suggests upgrade-based protocols may have untested edge cases around error handling.

H2C (HTTP/2 Cleartext to Backends)

Supported and conformance-tested (SupportHTTPRouteBackendProtocolH2C). This is relevant for gRPC, which typically uses H2C for internal service-to-service communication.

gRPC Streaming

Not supported at the API level. Since there's no GRPCRoute, there's no way to configure gRPC-aware streaming behavior. However, gRPC unary and streaming calls could transit through HTTPProxy with https:// backends if you accept the limitations:

  • No gRPC-specific matching (service/method)
  • No gRPC status code awareness (Envoy treats it as plain HTTP/2)
  • Subject to the 1h request timeout cap, which would kill long-lived gRPC streams
  • No gRPC health checking

No Explicit Streaming Configuration

The operator exposes zero streaming-specific knobs:

  • No stream_idle_timeout (Envoy's per-stream idle timeout)
  • No max stream duration separate from request timeout
  • No server-side streaming timeout distinct from unary timeout
  • No client-side streaming flow control
  • No per-route timeout overrides from the HTTPProxy CRD itself

All of these fall to Envoy defaults or the coarse BackendTrafficPolicy timeout settings. There is no way to say "this route is a streaming endpoint, give it different timeout behavior" without creating a separate BackendTrafficPolicy and targeting it.


Custom Envoy Filters

Custom Envoy filters are supported through a layered system of Envoy Gateway extension APIs — not arbitrary Envoy filter config. There are three mechanisms, each with specific guardrails. All follow the same architectural pattern:

  1. User creates the resource in their project control plane namespace
  2. Webhook validates it against Datum's guardrails
  3. The gateway resource replicator copies it to the downstream Envoy Gateway cluster
  4. Envoy Gateway applies it to the actual Envoy proxy

The customization surface is "Envoy Gateway's extension APIs, minus the dangerous bits" — not raw Envoy filter config.

1. HTTPRouteFilter (gateway.envoyproxy.io/v1alpha1)

Users can create HTTPRouteFilter resources and reference them from HTTPProxy rules via ExtensionRef filters. The operator validates them via webhook (inline body size capped at 1024 bytes for direct responses) and replicates them to downstream clusters.

The operator itself uses this mechanism internally to generate "connector offline" direct response filters (503 pages).
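The size cap on inline direct-response bodies can be pictured with a short Go sketch. The function and constant names are hypothetical; only the 1024-byte figure comes from the text above. Note that the check counts bytes, not characters, so multi-byte UTF-8 text reaches the cap sooner.

```go
package main

import (
	"fmt"
	"strings"
)

// maxInlineBodyBytes mirrors the 1024-byte cap mentioned above
// (hypothetical constant name).
const maxInlineBodyBytes = 1024

// validateInlineBody is a hypothetical check: len() on a Go string
// returns its length in bytes, which is what the cap is expressed in.
func validateInlineBody(body string) error {
	if n := len(body); n > maxInlineBodyBytes {
		return fmt.Errorf("inline body is %d bytes; maximum is %d", n, maxInlineBodyBytes)
	}
	return nil
}

func main() {
	fmt.Println(validateInlineBody("connector offline"))
	fmt.Println(validateInlineBody(strings.Repeat("x", 1025)))
}
```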

2. SecurityPolicy (Envoy Gateway)

Supported authentication/security filters, validated and replicated downstream:

| Filter | Status |
|---|---|
| API Key Auth | Allowed (with limits on extract sources, credential refs) |
| CORS | Allowed (with max field lengths) |
| Basic Auth | Allowed |
| JWT validation | Allowed (remote JWKS supported, claim-to-header mapping) |
| OIDC | Allowed (with scope/resource limits, min refresh TTL) |
| Authorization rules | Allowed (with max rules, client CIDR limits) |
| ExtAuth | Forbidden: "extAuth is not permitted" |

3. BackendTrafficPolicy (Envoy Gateway)

Supported traffic shaping filters, validated and replicated downstream:

| Filter | Status |
|---|---|
| Load balancer | Allowed (endpoint overrides forbidden) |
| TCP keepalive | Allowed (min probes=9, min idle=5m) |
| Passive health check | Allowed (active health checks forbidden) |
| Timeouts (TCP connect, HTTP idle, request) | Allowed (with configurable upper bounds) |
| Connection buffer limits | Allowed (with max limit) |
| DNS settings | Allowed (must respect DNS TTL, min refresh rate) |
| HTTP/2 settings | Allowed (max stream/connection window, max concurrent streams=1024) |
| Local rate limiting | Allowed (global rate limits forbidden) |
| Response overrides | Allowed (response redirects forbidden) |
| Retry | Forbidden: "retry settings are not permitted" |
| Fault injection | Forbidden: "fault injection is not permitted" |
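The forbidden entries behave like a deny-list applied at admission time. The Go sketch below shows the idea for the two entries whose rejection messages are quoted above. The `trafficPolicy` struct and `validateTrafficPolicy` function are simplified, hypothetical stand-ins, not the real Envoy Gateway types or the operator's webhook.

```go
package main

import (
	"errors"
	"fmt"
)

// trafficPolicy is a hypothetical, reduced view of a BackendTrafficPolicy:
// each flag records whether the corresponding section is present in the spec.
type trafficPolicy struct {
	HasRetry          bool
	HasFaultInjection bool
}

// validateTrafficPolicy collects one error per forbidden section, using the
// messages quoted in the table above. It returns nil when nothing is forbidden.
func validateTrafficPolicy(p trafficPolicy) error {
	var errs []error
	if p.HasRetry {
		errs = append(errs, errors.New("retry settings are not permitted"))
	}
	if p.HasFaultInjection {
		errs = append(errs, errors.New("fault injection is not permitted"))
	}
	return errors.Join(errs...)
}

func main() {
	fmt.Println(validateTrafficPolicy(trafficPolicy{}))
	fmt.Println(validateTrafficPolicy(trafficPolicy{HasRetry: true, HasFaultInjection: true}))
}
```

Reporting every violation at once, rather than stopping at the first, lets a user fix a rejected policy in a single pass.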

What's NOT Available for Custom Filters

  • EnvoyPatchPolicy — users cannot apply raw xDS patches (conformance tests explicitly skip it)
  • ExtAuth — external authorization service references blocked
  • Active health checks — blocked
  • Global rate limiting — blocked (local only)
  • Retry policies — blocked
  • Fault injection — blocked
  • Arbitrary Envoy HTTP filters — no mechanism to inject custom Lua, Wasm, or ext_proc filters

gRPC Support

Current Status: Not Supported

gRPC is explicitly unsupported. The conformance tests skip GRPCRouteBackendFQDNTest and GRPCRouteBackendIPTest with the comment "No GRPC support yet". The grpcs:// scheme is treated as an invalid endpoint in validation tests.

Why Existing HTTP Support Doesn't "Just Work"

gRPC runs over HTTP/2, and Envoy can proxy gRPC frames through a standard HTTP connection manager. At the Envoy data plane level, gRPC would work today. The gap is not in the data plane — it's in the routing API and operator logic:

  1. No gRPC-aware matching: Gateway API's GRPCRoute lets you match on service + method (e.g., mypackage.MyService/GetFoo). With HTTPRoute, you'd have to manually construct path matches like /mypackage.MyService/GetFoo, which works but is brittle and loses semantic meaning. There's no GRPCRouteMatch equivalent in HTTPProxy.

  2. No gRPC-specific filters: GRPCRoute supports RequestHeaderModifier, ResponseHeaderModifier, and RequestMirror filters scoped to gRPC semantics. Envoy's gRPC-specific filters (gRPC-JSON transcoding, gRPC-Web, gRPC stats) have no exposure path.

  3. Scheme validation blocks grpc:// and grpcs:// endpoints: Endpoint validation in httpproxy_validation.go explicitly rejects anything that isn't http:// or https://. Users must use https:// and rely on HTTP/2 ALPN negotiation, which does work at the transport level.

  4. No appProtocol: grpc signal: The EndpointSlice appProtocol is set to either http or https based on scheme. There's no path to set kubernetes.io/h2c or a gRPC-specific protocol hint that would tell Envoy Gateway to configure the upstream cluster for gRPC specifically (e.g., gRPC health checking, gRPC status code handling).
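Items 3 and 4 can be pictured together: the endpoint's scheme is both the validation gate and the source of the appProtocol value. The Go sketch below is an assumption-laden illustration of that behavior, not the code in httpproxy_validation.go. The helper name `appProtocolForEndpoint` is hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
)

// appProtocolForEndpoint is a hypothetical helper. It accepts only http://
// and https:// endpoints and returns the scheme as the appProtocol value,
// so there is no outcome in which a gRPC-specific hint is produced.
func appProtocolForEndpoint(endpoint string) (string, error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return "", fmt.Errorf("invalid endpoint %q: %w", endpoint, err)
	}
	switch u.Scheme {
	case "http", "https":
		return u.Scheme, nil
	default:
		return "", fmt.Errorf("unsupported scheme %q: endpoint must use http:// or https://", u.Scheme)
	}
}

func main() {
	for _, ep := range []string{
		"https://grpc-server.example.com:443",
		"grpcs://grpc-server.example.com:443",
	} {
		proto, err := appProtocolForEndpoint(ep)
		fmt.Println(ep, proto, err)
	}
}
```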

What Accidentally Works Today

A user could proxy gRPC traffic through HTTPProxy by:

  • Setting the backend to https://grpc-server.example.com:443
  • Using path matches like PathPrefix: /mypackage.MyService/
  • Relying on Envoy's HTTP/2 ALPN negotiation

This would forward frames correctly but without gRPC-aware error handling (gRPC status codes vs HTTP status codes), gRPC health checking, or transcoding.
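The path matches in this workaround follow directly from how gRPC maps calls onto HTTP/2: the request path is `/<service>/<method>`. A small Go sketch of building those paths by hand follows. The function names are hypothetical, and `matchesService` uses a plain string prefix as an approximation of a PathPrefix match.

```go
package main

import (
	"fmt"
	"strings"
)

// grpcServicePrefix returns the path prefix shared by every method of a
// fully qualified gRPC service, e.g. "/mypackage.MyService/".
func grpcServicePrefix(service string) string {
	return "/" + service + "/"
}

// grpcMethodPath returns the exact HTTP/2 path for one gRPC method.
func grpcMethodPath(service, method string) string {
	return grpcServicePrefix(service) + method
}

// matchesService approximates a PathPrefix match with a string prefix check.
func matchesService(path, service string) bool {
	return strings.HasPrefix(path, grpcServicePrefix(service))
}

func main() {
	path := grpcMethodPath("mypackage.MyService", "GetFoo")
	fmt.Println(path)
	fmt.Println(matchesService(path, "mypackage.MyService"))
	fmt.Println(matchesService(path, "mypackage.Other"))
}
```

Because the service name is embedded as a string, renaming a package or service in the .proto file silently breaks the route, which is the brittleness noted earlier.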

What Full gRPC Support Would Require

| Work Item | Effort | Notes |
|---|---|---|
| Accept GRPCRoute in the resource replicator | Low | Add the GVK to gatewayEnvoyGVKs list, same as SecurityPolicy/BackendTrafficPolicy |
| Create a GRPCProxy CRD (or extend HTTPProxy) | Medium | Mirror what HTTPProxy does but emit GRPCRoute instead of HTTPRoute. Needs gRPC-specific match types (service/method) and filter types |
| Webhook validation for GRPCRoute | Low | Follow the existing pattern |
| Controller logic to compile GRPCProxy -> GRPCRoute + EndpointSlices | Medium | Largely mirrors httpproxy_controller.go but with gRPC semantics |
| Connector tunnel support for gRPC | Medium-High | The CONNECT tunneling logic in connector_routing_compiler.go is HTTP-oriented; gRPC over tunnels would need testing/adaptation |
| Conformance tests | Low | Un-skip GRPCRouteBackendFQDNTest and GRPCRouteBackendIPTest |
| gRPC-specific Envoy filters (transcoding, gRPC-Web) | High | Would need new extension API surface or EnvoyPatchPolicy exposure |

Core routing (the first four work items) closely mirrors the existing HTTPProxy pipeline. Envoy Gateway already does the heaviest lifting because it supports GRPCRoute natively; the operator just needs to generate and replicate it.
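The first work item amounts to extending an allow-list of resource kinds. The Go sketch below shows the shape of that change using a local `gvk` struct as a stand-in for a Kubernetes GroupVersionKind. The list `replicatedGVKs` is a hypothetical analogue of gatewayEnvoyGVKs, and the group/version strings are assumptions based on the upstream Envoy Gateway and Gateway API projects, not values read from the operator's source.

```go
package main

import "fmt"

// gvk is a local stand-in for a Kubernetes GroupVersionKind.
type gvk struct {
	Group, Version, Kind string
}

// replicatedGVKs is a hypothetical analogue of the gatewayEnvoyGVKs list:
// the kinds the replicator copies to the downstream cluster.
var replicatedGVKs = []gvk{
	{Group: "gateway.envoyproxy.io", Version: "v1alpha1", Kind: "HTTPRouteFilter"},
	{Group: "gateway.envoyproxy.io", Version: "v1alpha1", Kind: "SecurityPolicy"},
	{Group: "gateway.envoyproxy.io", Version: "v1alpha1", Kind: "BackendTrafficPolicy"},
}

// contains reports whether want is present in list.
func contains(list []gvk, want gvk) bool {
	for _, k := range list {
		if k == want {
			return true
		}
	}
	return false
}

func main() {
	grpcRoute := gvk{Group: "gateway.networking.k8s.io", Version: "v1", Kind: "GRPCRoute"}
	fmt.Println(contains(replicatedGVKs, grpcRoute))

	// Accepting GRPCRoute is a one-entry addition to the list.
	extended := append(append([]gvk{}, replicatedGVKs...), grpcRoute)
	fmt.Println(contains(extended, grpcRoute))
}
```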

The harder question is product surface: whether gRPC gets its own CRD (GRPCProxy) or becomes a mode of HTTPProxy, and how deep gRPC-specific features (transcoding, reflection, gRPC-Web bridging) should go.


Inaccessible Capabilities

Advanced Routing

  • No regex path matching (only prefix/exact)
  • No wildcard hostnames (explicitly unsupported)
  • No traffic mirroring/shadowing
  • No canary deployments via weighted backends

Security & Auth (beyond SecurityPolicy)

  • No mTLS (client certificate validation) — only server-side TLS termination

Observability

  • No access log format/destination/sampling configuration
  • No distributed tracing setup (OpenTelemetry, Jaeger, Zipkin)
  • No custom metrics or stats configuration
  • No per-route request/response logging

Protocol & Transport

  • No HTTP/3 (QUIC) support
  • No raw TCP proxy mode (L7 HTTP only)
  • No custom Envoy network filters
  • No listener filter chain configuration
  • No content compression (gzip/brotli)
  • No custom error pages

Hardcoded Constraints

| Setting | Value |
|---|---|
| Valid ports | 80, 443 only |
| TLS mode | Always Terminate |
| Backends per rule | 1 |
| Max rules | 16 |
| Max hostnames | 16 |
| Max matches | 128 total |
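The count-based constraints above combine as in the following Go sketch. The types `proxySpec` and `rule` are hypothetical simplifications of an HTTPProxy spec, and the sketch assumes "Backends per rule 1" means at most one backend; it is not the operator's validation code.

```go
package main

import (
	"errors"
	"fmt"
)

// Limits taken from the table above (constant names are hypothetical).
const (
	maxHostnames       = 16
	maxRules           = 16
	maxMatchesTotal    = 128
	maxBackendsPerRule = 1 // assumption: treated as an upper bound
)

// rule is a reduced view of one routing rule: only the counts matter here.
type rule struct {
	Matches  int
	Backends int
}

// proxySpec is a reduced, hypothetical view of an HTTPProxy spec.
type proxySpec struct {
	Hostnames []string
	Rules     []rule
}

// validateLimits reports every limit the spec exceeds, or nil if none.
func validateLimits(p proxySpec) error {
	var errs []error
	if n := len(p.Hostnames); n > maxHostnames {
		errs = append(errs, fmt.Errorf("%d hostnames exceeds maximum of %d", n, maxHostnames))
	}
	if n := len(p.Rules); n > maxRules {
		errs = append(errs, fmt.Errorf("%d rules exceeds maximum of %d", n, maxRules))
	}
	total := 0
	for i, r := range p.Rules {
		total += r.Matches // the match limit applies across all rules
		if r.Backends > maxBackendsPerRule {
			errs = append(errs, fmt.Errorf("rule %d has %d backends; maximum is %d", i, r.Backends, maxBackendsPerRule))
		}
	}
	if total > maxMatchesTotal {
		errs = append(errs, fmt.Errorf("%d matches in total exceeds maximum of %d", total, maxMatchesTotal))
	}
	return errors.Join(errs...)
}

func main() {
	spec := proxySpec{
		Hostnames: []string{"app.example.com"},
		Rules:     []rule{{Matches: 8, Backends: 1}, {Matches: 130, Backends: 2}},
	}
	fmt.Println(validateLimits(spec))
}
```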

Summary

The operator provides a streamlined L7 HTTP/HTTPS routing abstraction through its HTTPProxy CRD, covering the common web service use case: hostname-based routing, header manipulation, URL rewriting, redirects, TLS termination, and connector tunneling.

Streaming works implicitly for HTTP (SSE, chunked responses) and WebSocket through Envoy's default behavior, bounded by configurable timeouts (max 1h). There are no streaming-specific knobs — no per-stream idle timeout, no stream duration controls, no way to differentiate streaming routes from request/response routes at the timeout level without a separate BackendTrafficPolicy.

Custom Envoy behavior is available through Envoy Gateway extension APIs (HTTPRouteFilter, SecurityPolicy, BackendTrafficPolicy), which are webhook-validated against Datum's guardrails and replicated to downstream clusters. This provides access to JWT auth, CORS, API key auth, OIDC, local rate limiting, load balancing, timeouts, and more — while blocking dangerous features like external auth, fault injection, retries, active health checks, and raw xDS patches.

gRPC is not yet supported at the API level, though the underlying Envoy data plane could handle it. Adding core gRPC routing would mirror the existing HTTPProxy pipeline and leverage Envoy Gateway's native GRPCRoute support — the main decisions are around CRD design and how deep gRPC-specific features should go.

Advanced Envoy features not covered by the extension APIs (arbitrary filters, Lua/Wasm, tracing, access logging, HTTP/3) are not exposed at all. The design trades Envoy's full configurability for a simpler, opinionated, and safer API surface.
