To address the OOMKill issue you're experiencing with Envoy when handling large responses, you'll need to adjust Envoy's buffering settings and potentially leverage HTTP/2 flow control to implement a form of backpressure. Here's a step-by-step guide to configuring Envoy for minimal buffering and backpressure:
Envoy lets you cap how much data it buffers per connection with per_connection_buffer_limit_bytes, which can be set on both the listener (downstream connections) and the cluster (upstream connections). The limit is soft and defaults to 1 MiB; lower it to reduce memory use per connection. Per-stream limits for HTTP/2 come from the stream window size, covered below.
In your Envoy configuration file, set the buffer limits as follows:
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 80
    # Soft limit on the read/write buffers of each downstream connection
    per_connection_buffer_limit_bytes: 1048576 # 1 MiB (the default); lower to buffer less
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: service_backend
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  clusters:
  - name: service_backend
    connect_timeout: 0.25s
    type: STATIC
    lb_policy: ROUND_ROBIN
    # Soft limit on the read/write buffers of each upstream connection
    per_connection_buffer_limit_bytes: 1048576 # 1 MiB (the default); lower to buffer less
    load_assignment:
      cluster_name: service_backend
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 8080
    http2_protocol_options:
      max_concurrent_streams: 100
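HTTP/2 flow control is what provides the backpressure. Each receiver advertises a window, and the sender must pause once that many unacknowledged bytes are in flight; the receiver reopens the window by sending WINDOW_UPDATE frames as it consumes data. On the upstream side Envoy is the receiver of response data, so the window sizes set on the cluster bound how much the backend can send before Envoy has drained it. This only applies when the connection to the backend uses HTTP/2, so make sure HTTP/2 is enabled in your backend cluster configuration and that the backend supports it.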
In the clusters section of your Envoy configuration, add http2_protocol_options:
clusters:
- name: service_backend
  connect_timeout: 0.25s
  type: STATIC
  lb_policy: ROUND_ROBIN
  load_assignment:
    cluster_name: service_backend
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: 127.0.0.1
              port_value: 8080
  http2_protocol_options:
    max_concurrent_streams: 100
    initial_stream_window_size: 65536       # 64 KiB
    initial_connection_window_size: 1048576 # 1 MiB
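On recent Envoy releases the cluster-level http2_protocol_options field is deprecated in favor of typed_extension_protocol_options. The fragment below is a sketch of the same window settings in that form. It assumes the backend speaks HTTP/2 only (hence explicit_http_config) and leaves out the other cluster fields, which stay as they are:

```yaml
clusters:
- name: service_backend
  # connect_timeout, type, lb_policy and load_assignment are unchanged and omitted here
  typed_extension_protocol_options:
    envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
      "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
      # Assumption: every upstream connection is HTTP/2
      explicit_http_config:
        http2_protocol_options:
          max_concurrent_streams: 100
          initial_stream_window_size: 65536       # 64 KiB
          initial_connection_window_size: 1048576 # 1 MiB
```

Use one form or the other for a given cluster, not both.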
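Configure the downstream HTTP/2 settings as well, so that the client and Envoy negotiate sensible flow control windows. On the listener side, initial_stream_window_size also acts as a soft limit on how many bytes Envoy buffers per stream in its HTTP/2 codec, so a smaller value caps the memory that a slow client can tie up for each large response.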
In the http_connection_manager filter configuration:
filters:
- name: envoy.filters.network.http_connection_manager
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
    stat_prefix: ingress_http
    codec_type: AUTO
    route_config:
      name: local_route
      virtual_hosts:
      - name: local_service
        domains:
        - "*"
        routes:
        - match:
            prefix: "/"
          route:
            cluster: service_backend
    http_filters:
    - name: envoy.filters.http.router
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
    # Downstream HTTP/2 settings belong on the connection manager itself
    http2_protocol_options:
      max_concurrent_streams: 100
      initial_stream_window_size: 65536       # 64 KiB
      initial_connection_window_size: 1048576 # 1 MiB
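After making these changes, monitor Envoy's memory usage and throughput along with your backends. Flow control counters such as downstream_flow_control_paused_reading_total and upstream_flow_control_paused_reading_total show how often Envoy is actually applying backpressure. You may need to adjust the buffer limits and HTTP/2 window sizes for your workload: smaller values lower memory use but can reduce throughput on high-latency links.
Make sure your clients implement HTTP/2 flow control correctly. They should send WINDOW_UPDATE frames only as they consume response data, and they should respect the windows Envoy advertises when uploading request bodies.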
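Since the symptom is an OOMKill, Envoy's overload manager can serve as a safety net while you tune these values: it watches heap usage and sheds load before the container limit is reached. The sketch below is illustrative only. The 1 GiB heap budget and both thresholds are assumptions rather than values from your setup, and the budget should sit somewhat below your container's memory limit:

```yaml
# Top-level bootstrap field, a sibling of static_resources
overload_manager:
  refresh_interval: 0.25s
  resource_monitors:
  - name: "envoy.resource_monitors.fixed_heap"
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.resource_monitors.fixed_heap.v3.FixedHeapConfig
      max_heap_size_bytes: 1073741824 # 1 GiB (assumed budget)
  actions:
  # Return free memory to the OS once the heap is 95% of the budget
  - name: "envoy.overload_actions.shrink_heap"
    triggers:
    - name: "envoy.resource_monitors.fixed_heap"
      threshold:
        value: 0.95
  # Reject new requests once the heap is 98% of the budget
  - name: "envoy.overload_actions.stop_accepting_requests"
    triggers:
    - name: "envoy.resource_monitors.fixed_heap"
      threshold:
        value: 0.98
```

This does not replace the buffer and window limits above. It only bounds the damage if they turn out to be too generous for a given traffic pattern.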
By configuring these settings, you should be able to minimize buffering in Envoy and leverage HTTP/2 flow control to create a backpressure mechanism that helps prevent OOMKills when dealing with large responses.