A Step-by-Step Guide with Practical Examples
Web servers like Nginx handle thousands of concurrent connections. To avoid bottlenecks, you must understand:
- File Descriptors (FDs): How the kernel tracks open files/sockets.
- Resource Limits (ulimit/rlimit): How the kernel enforces per-process limits.
- Nginx Configuration: How to tune settings like worker_rlimit_nofile and worker_connections.
This guide ties these concepts together, using a reverse proxy to a Node.js app as an example.
A file descriptor (FD) is a unique integer the kernel assigns to a process when it opens a resource (file, socket, etc.).
- Client Connection:
  - A user connects to Nginx on port 80.
  - Nginx creates a socket FD (e.g., FD 4) for this connection.
- Upstream Connection:
  - Nginx proxies the request to a Node.js app on port 3000.
  - It opens another socket FD (e.g., FD 5) to communicate with Node.js.
- Static File:
  - If Nginx serves a file (e.g., index.html), it opens FD 6 for the file.
Total FDs per Request: 2 (client + upstream sockets) + files.
10,000 Concurrent Users: Nginx needs ~20,000 FDs (without keep-alive).
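To see this in practice, you can list the FDs a running worker currently holds (a quick sketch; the pgrep pattern assumes worker processes are titled "nginx: worker"):

PID=$(pgrep -of "nginx: worker")   # oldest worker process
ls -l /proc/$PID/fd                # sockets appear as socket:[inode]
ls /proc/$PID/fd | wc -l           # total open FDs for this worker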
The kernel enforces per-process resource limits via rlimit. Key limits:
- RLIMIT_NOFILE: Max open FDs.
- RLIMIT_NPROC: Max processes.
When a process calls:
- open("file.txt") → the kernel checks that current FDs < RLIMIT_NOFILE.
- socket() → same check.
- fork() → checks RLIMIT_NPROC.
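You can inspect (and, with prlimit from util-linux, change) these limits for a running process. A small sketch, assuming an Nginx master process is running:

ulimit -n                          # RLIMIT_NOFILE (soft) for the current shell
ulimit -u                          # RLIMIT_NPROC (soft) for the current shell

PID=$(pgrep -o nginx)              # oldest Nginx process (the master)
prlimit --pid $PID --nofile        # show soft/hard RLIMIT_NOFILE
sudo prlimit --pid $PID --nofile=65535:65535   # raise it on the fly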
If Nginx workers hit RLIMIT_NOFILE:
- New connections fail, and the error log fills with "Too many open files" messages.
- Users see 502 Bad Gateway errors or connection timeouts.
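A quick check for FD exhaustion in the error log (the path /var/log/nginx/error.log is the common default; adjust for your distro):

grep -i "too many open files" /var/log/nginx/error.log | tail -5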
worker_rlimit_nofile:
Sets the FD limit for Nginx workers (overrides the system ulimit).

worker_processes auto;
worker_rlimit_nofile 65535;  # Critical for high traffic

worker_connections:
Max concurrent connections per worker.

events {
    worker_connections 10240;  # Must be ≤ worker_rlimit_nofile
}
- Each connection uses 1 FD (client socket).
- Additional FDs are needed for upstream sockets, files, etc.
- Example: worker_connections = 10,000 → 10k FDs for client sockets alone.
- If worker_rlimit_nofile = 10,000, Nginx has no FDs left for upstream connections or files.
Solution: Set worker_rlimit_nofile higher (e.g., 65,535).
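After changing these directives, a sketch of how you might verify that the workers actually picked up the new limit:

sudo nginx -t && sudo nginx -s reload   # validate, then apply the config

# Every worker should now report the new RLIMIT_NOFILE
for pid in $(pgrep -f "nginx: worker"); do
    grep "Max open files" /proc/$pid/limits
done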
Set for the user running Nginx (e.g., nginx) in /etc/security/limits.conf:
nginx soft nofile 65535
nginx hard nofile 65535
Note: these PAM limits apply to login sessions; if Nginx runs as a systemd service, use the systemd override below instead.
Add to /etc/sysctl.conf:
# Max system-wide FDs
fs.file-max = 2097152
# Ephemeral ports for upstream connections (discussed below)
net.ipv4.ip_local_port_range = 1024 65535
Apply changes:
sudo sysctl -p
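Confirm the kernel picked up the new values:

sysctl fs.file-max net.ipv4.ip_local_port_range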
Override nginx.service (e.g., via sudo systemctl edit nginx):
[Service]
LimitNOFILE=65535
Reload:
sudo systemctl daemon-reload
sudo systemctl restart nginx
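Confirm the override is in effect:

systemctl show nginx -p LimitNOFILE   # should print LimitNOFILE=65535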
- Max Connections: worker_processes * worker_connections.
  Example: 4 workers × 10k connections = 40k concurrent connections.
- FD Usage:
  - Client sockets: worker_connections.
  - Upstream sockets: worker_connections (if proxying).
  - Files: depends on static assets.
For 10k connections/worker with proxying:
worker_rlimit_nofile 25000;  # 10k × 2 (client + upstream) + buffer for files/logs
events {
    worker_connections 10000;
}
- Check Process Limits:
  cat /proc/$(pgrep nginx | head -1)/limits | grep "Max open files"
- Count Open FDs:
  lsof -p <PID> | wc -l
- Socket Statistics:
  ss -s                  # Total TCP connections
  ss -tnp | grep nginx   # Nginx sockets
- Nginx Status:
  Enable stub_status in nginx.conf. Note: ngx_http_stub_status is not built by default; enable it with the --with-http_stub_status_module configuration parameter.
  server {
      location /nginx_status {
          stub_status;
      }
  }
  Output:
  Active connections: 243
  server accepts handled requests
   12345 12345 56789
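Querying the endpoint (assuming the server block above is reachable on port 80 locally):

curl http://127.0.0.1/nginx_status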
- What: Reuse a single TCP connection for multiple requests.
- Why: Reduces FD churn and latency.
- Nginx Settings:
http {
    keepalive_timeout 60s;     # Time to keep an idle connection open
    keepalive_requests 1000;   # Max requests per connection

    upstream nodejs {
        server 127.0.0.1:3000;
        keepalive 100;         # Reuse upstream connections
    }
}
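A rough client-side check: ask curl to fetch two URLs in one invocation; with keep-alive working, it reuses the first connection (message wording varies by curl version):

curl -sv http://127.0.0.1/ http://127.0.0.1/ 2>&1 | grep -i "re-using"
# Typical output: * Re-using existing connection! (#0) with host 127.0.0.1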
- Problem: Outbound connections to upstream servers use temporary ports.
- Default Range: 32768-60999 (28,232 ports).
- Fix: Expand the range to avoid exhaustion:
  net.ipv4.ip_local_port_range = 15000 65535  # In /etc/sysctl.conf
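To see how much headroom you have, check the configured range and how many sockets are sitting in TIME_WAIT (each briefly pins an ephemeral port):

sysctl net.ipv4.ip_local_port_range
ss -tan state time-wait | wc -l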
A complete example nginx.conf tying these settings together:

user nginx;
worker_processes auto;
worker_rlimit_nofile 65535;

events {
    worker_connections 30000;
    multi_accept on;
}

http {
    keepalive_timeout 60s;
    keepalive_requests 1000;

    upstream nodejs {
        server 127.0.0.1:3000;
        keepalive 100;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://nodejs;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
        }
    }
}
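For a quick sanity check under load, a sketch assuming the wrk benchmarking tool is installed and the Node.js upstream is listening on port 3000:

wrk -t4 -c1000 -d30s http://127.0.0.1/   # 4 threads, 1,000 connections, 30 seconds

# In another terminal, watch connection usage while the test runs
watch -n1 "ss -tn | grep -c ESTAB"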
Summary:
- Set worker_rlimit_nofile in nginx.conf.
- Adjust worker_connections in the events block.
- Increase the system-wide FD limit (fs.file-max).
- Expand the ephemeral port range (net.ipv4.ip_local_port_range).
- Enable keep-alive for clients and upstreams.
- Monitor with ss, lsof, and Nginx logs.
NOTES:
- NGINX performance improves with more CPUs, but gains diminish beyond 16 cores.
- HTTPS benefits more from extra CPUs than HTTP due to encryption overhead.
- Modern CPUs (e.g., Intel Xeon E5) support AES-NI, which accelerates encryption by 5-10x.
- NGINX handles ~2.1M RPS for HTTPS with 1 KB files vs. ~1.3M RPS for HTTP.
- Use ssl_async (NGINX Plus) for parallel SSL processing.
- Throughput scales well with request size but plateaus after 8 CPUs.
- Upgrading to OpenSSL 1.0.2 and ECC certificates can yield 2-3x performance gains for SSL transactions.
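To measure the AES-NI effect on your own hardware, OpenSSL's built-in benchmark works well (the OPENSSL_ia32cap mask shown is the commonly cited value for disabling AES-NI; treat the exact value as an assumption and verify it against your OpenSSL version):

openssl speed -evp aes-128-gcm   # with AES-NI (if the CPU supports it)

# Same benchmark with AES-NI masked off, for comparison
OPENSSL_ia32cap="~0x200000200000000" openssl speed -evp aes-128-gcm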
For more details, check this article on NGINX optimization.