Nginx Performance Tuning: Understanding ulimit, File Descriptors, and Kernel Limits

A Step-by-Step Guide with Practical Examples


1. Introduction

Web servers like Nginx handle thousands of concurrent connections. To avoid bottlenecks, you must understand:

  • File Descriptors (FDs): How the kernel tracks open files/sockets.
  • Resource Limits (ulimit/rlimit): How the kernel enforces per-process limits.
  • Nginx Configuration: How to optimize settings like worker_rlimit_nofile and worker_connections.

This guide ties these concepts together, using a reverse proxy to a Node.js app as an example.


2. What Are File Descriptors (FDs)?

A file descriptor (FD) is a unique integer the kernel assigns to a process when it opens a resource (file, socket, etc.).

Example: Reverse Proxy Workflow

  1. Client Connection:
    • A user connects to Nginx on port 80.
    • Nginx creates a socket FD (e.g., FD 4) for this connection.
  2. Upstream Connection:
    • Nginx proxies the request to a Node.js app on port 3000.
    • It opens another socket FD (e.g., FD 5) to communicate with Node.js.
  3. Static File:
    • If Nginx serves a file (e.g., index.html), it opens FD 6 for the file.

Total FDs per Request: 2 (client + upstream sockets) + files.
10,000 Concurrent Users: Nginx needs ~20,000 FDs (without keep-alive).
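
You can watch these FDs directly: the kernel exposes every descriptor a process holds under /proc/<PID>/fd. A quick look at a running worker (the PID lookup is just one convenient way to grab a worker; the socket inode numbers below are illustrative):

ls -l /proc/$(pgrep -f "nginx: worker" | head -1)/fd | head
# lrwx------ ... 4 -> socket:[123456]    # a client connection
# lrwx------ ... 5 -> socket:[123457]    # an upstream connection
# lr-x------ ... 6 -> /usr/share/nginx/html/index.html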


3. Kernel Resource Limits (ulimit/rlimit)

The kernel enforces per-process resource limits via rlimit. Key limits:

  • RLIMIT_NOFILE: Max open FDs.
  • RLIMIT_NPROC: Max processes.

How Limits Are Enforced

When a process calls:

  • open("file.txt") → Kernel checks if current FDs < RLIMIT_NOFILE.
  • socket() → Same check.
  • fork() → Checks RLIMIT_NPROC.
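
You can see this enforcement in action by lowering the soft limit in a throwaway shell and watching open() fail; a minimal sketch (/etc/hostname is used only because it is a small, world-readable file):

bash -c '
  ulimit -n 8                              # lower soft RLIMIT_NOFILE to 8
  exec 3</etc/hostname 4</etc/hostname 5</etc/hostname \
       6</etc/hostname 7</etc/hostname     # fds 0-2 + these five = 8 open FDs
  exec 8</etc/hostname                     # fails with EMFILE: Too many open files
'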

Example: The "Too many open files" Error

If Nginx workers hit RLIMIT_NOFILE:

  • New connections fail.
  • Users see 502 Bad Gateway or connection timeouts.
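
The same condition is visible in the error log (errno 24 is EMFILE); assuming the default log location:

grep "Too many open files" /var/log/nginx/error.log | tail -5
# e.g. accept4() failed (24: Too many open files)
# e.g. socket() failed (24: Too many open files) while connecting to upstream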

4. Nginx Configuration

Key Settings

  1. worker_rlimit_nofile:
    Sets the FD limit for Nginx workers (overrides system ulimit).
    worker_processes auto;
    worker_rlimit_nofile 65535;  # Critical for high traffic
  2. worker_connections:
    Max concurrent connections per worker.
    events {
      worker_connections 10240;  # Must be ≤ worker_rlimit_nofile
    }

Why Must worker_connections Be Less Than worker_rlimit_nofile?

  • Each connection uses 1 FD (client socket).
  • Additional FDs are needed for upstream sockets, files, etc.
  • Example:
    • worker_connections = 10,000 → 10k FDs for client sockets.
    • If worker_rlimit_nofile = 10,000, Nginx has no FDs left for upstream connections or files.

Solution: Set worker_rlimit_nofile higher (e.g., 65,535).
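
After reloading Nginx, confirm the workers actually received the new limit (the PID lookup is illustrative; any worker PID works):

grep "Max open files" /proc/$(pgrep -f "nginx: worker" | head -1)/limits
# Max open files            65535                65535                files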


5. System-Level Tuning

1. User Limits (/etc/security/limits.conf)

Set for the user running Nginx (e.g., nginx):

nginx soft nofile 65535
nginx hard nofile 65535

2. Kernel-Wide Limits (/etc/sysctl.conf)

# Max system-wide FDs
fs.file-max = 2097152

# Ephemeral ports for upstream connections (discussed below)
net.ipv4.ip_local_port_range = 1024 65535

Apply changes:

sudo sysctl -p
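
You can compare current system-wide FD usage against fs.file-max at any time:

cat /proc/sys/fs/file-nr
# 9312  0  2097152    <- allocated, unused, maximum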

3. systemd Service Limits

Create a drop-in override for nginx.service (e.g., via sudo systemctl edit nginx):

[Service]
LimitNOFILE=65535

Reload:

sudo systemctl daemon-reload
sudo systemctl restart nginx
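
To confirm which limit won, compare what systemd applies with what the running master actually got (this assumes the standard /run/nginx.pid location):

systemctl show nginx --property=LimitNOFILE
grep "Max open files" /proc/$(cat /run/nginx.pid)/limits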

6. Estimating Nginx Performance

Key Metrics

  • Max Connections: worker_processes * worker_connections.
    Example: 4 workers × 10k connections = 40k concurrent connections.
  • FD Usage:
    • Client sockets: worker_connections.
    • Upstream sockets: worker_connections (if proxying).
    • Files: Depends on static assets.

Rough Calculation

For 10k connections/worker with proxying:

worker_rlimit_nofile 20000;  # 10k × 2 (client + upstream) + buffer
events {
  worker_connections 10000;
}
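
Putting the numbers together: with 4 workers at these settings, the instance can hold up to 4 × 10,000 = 40,000 client connections, with a worst-case demand of roughly 4 × 20,000 = 80,000 FDs, comfortably below the fs.file-max of 2,097,152 set in Section 5.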

7. Investigating Performance Issues

Toolbelt

  1. Check Process Limits:

    cat /proc/$(pgrep nginx | head -1)/limits | grep "Max open files"
  2. Count Open FDs:

    lsof -p <PID> | wc -l
  3. Socket Statistics:

    ss -s  # Total TCP connections
    ss -tnp | grep nginx  # Nginx sockets
  4. Nginx Status:
    Enable stub_status in nginx.conf:

    Note: the ngx_http_stub_status module is not built by default; it must be enabled with the --with-http_stub_status_module configuration parameter.

    server {
      location /nginx_status {
        stub_status;
      }
    }

    Output:

    Active connections: 243
    server accepts handled requests
     12345 12345 56789
    Reading: 2 Writing: 5 Waiting: 236
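
For a live view, poll the endpoint (this assumes /nginx_status is reachable from localhost; in production, restrict access with allow/deny):

watch -n 2 'curl -s http://127.0.0.1/nginx_status'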

8. Optimizing with Keep-Alive and Ephemeral Ports

1. Keep-Alive Connections

  • What: Reuse a single TCP connection for multiple requests.
  • Why: Reduces FD churn and latency.
  • Nginx Settings:
    http {
      keepalive_timeout 60s;    # Time to keep connection open
      keepalive_requests 1000;  # Max requests per connection
      upstream nodejs {
        server 127.0.0.1:3000;
        keepalive 100;          # Reuse upstream connections
      }
    }
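
Note: the upstream keepalive pool is only used when proxied requests are HTTP/1.1 with the Connection header cleared, so the location doing proxy_pass also needs:

proxy_http_version 1.1;
proxy_set_header Connection "";

(The full example in Section 9 includes both directives.)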

2. Ephemeral Ports

  • Problem: Each outbound connection to an upstream server consumes an ephemeral (temporary) source port.
  • Default Range: 32768-60999 (28,232 ports).
  • Fix: Expand the range to avoid exhaustion (see the check below):
    net.ipv4.ip_local_port_range = 1024 65535  # In /etc/sysctl.conf, as in Section 5
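
To check the current range and how many sockets you hold toward a single upstream (port 3000, as in this guide's example):

sysctl net.ipv4.ip_local_port_range
ss -tan 'dport = :3000' | tail -n +2 | wc -l   # sockets to the upstream, all states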

9. Final Configuration Example

user nginx;
worker_processes auto;
worker_rlimit_nofile 65535;

events {
  worker_connections 30000;
  multi_accept on;
}

http {
  keepalive_timeout 60s;
  keepalive_requests 1000;

  upstream nodejs {
    server 127.0.0.1:3000;
    keepalive 100;
  }

  server {
    listen 80;
    location / {
      proxy_pass http://nodejs;
      proxy_http_version 1.1;
      proxy_set_header Connection "";
    }
  }
}
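
Before applying it, validate the configuration and reload without dropping connections:

sudo nginx -t && sudo systemctl reload nginx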


10. Summary Checklist

  1. Set worker_rlimit_nofile in nginx.conf.
  2. Adjust worker_connections in events block.
  3. Increase system-wide FD limit (fs.file-max).
  4. Expand ephemeral port range (net.ipv4.ip_local_port_range).
  5. Enable keep-alive for clients and upstreams.
  6. Monitor with ss, lsof, and Nginx logs.

NOTES:

  • NGINX performance improves with more CPUs, but gains diminish beyond ~16 cores.
  • HTTPS benefits more from extra CPUs than HTTP because of encryption overhead.
  • Modern CPUs (e.g., Intel Xeon E5) support AES-NI, which accelerates encryption by roughly 5-10x.
  • Handles ~2.1M RPS for HTTPS with 1 KB files vs. ~1.3M RPS for HTTP.
  • Use ssl_async (NGINX Plus) for parallel SSL processing.
  • Throughput scales well with request size but plateaus after ~8 CPUs.
  • Upgrading to OpenSSL 1.0.2 and ECC certificates can yield 2-3x performance gains for SSL transactions.

For more details on Nginx optimization, see NGINX's official performance tuning and sizing guides.
