The requirement is for NGINX Plus to back off and stop sending new connections to an upstream node when the network utilization of that node exceeds a given threshold.
Create a simple HTTP-accessible script that runs on each upstream node. The script returns 200 OK (HTTP status code) if the node is not overloaded, and 503 Too Busy if the node is overloaded.
Use the script as the target for an NGINX Plus health check.
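As a rough illustration of the health check side, the following sketch writes a candidate NGINX Plus configuration fragment to a local file. The upstream name backend, the application port 80, the second node dev1, and the output file name are assumptions made for the example; only dev0 and port 8099 come from this document. Treat it as a starting point rather than a finished configuration.

```shell
#!/usr/bin/env bash
# Sketch: write a candidate NGINX Plus config fragment to the current directory.
# "backend", port 80, "dev1" and the file name are assumptions for illustration;
# "dev0" and port 8099 are taken from the text.
conf="${CONF:-./loadtest-healthcheck.conf}"

cat > "$conf" <<'EOF'
upstream backend {
    zone backend 64k;          # shared memory zone, required for health checks
    server dev0:80;
    server dev1:80;            # hypothetical second node
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
        # Probe the load script on port 8099 instead of the application port.
        # By default any 2xx/3xx response passes, so a 503 marks the node unhealthy.
        health_check port=8099 interval=5s fails=1 passes=2;
    }
}
EOF

echo "Wrote $conf"
```

After reviewing the file, you would typically copy it into the directory your NGINX Plus installation includes in its http context (often /etc/nginx/conf.d/, which is also an assumption here), then run nginx -t followed by nginx -s reload. Requiring two passes before a node is marked healthy again is a design choice that avoids flapping when the bandwidth hovers around the threshold.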
NGINX cannot run scripts by itself, because it does not provide CGI or a similar application platform. To avoid the complexity of installing PHP, Python, or any other application platform on the upstream servers, we use a simple HTTP responder (loadtest.sh) written in bash and launched by systemd. You can of course port loadtest.sh to PHP, Python, etc., depending on what the upstream server can run.
The script is served on port 8099 (for example) and returns a status that reflects the current load. "Success" output, when the current bandwidth is within the limit:
curl -D - http://dev0:8099/
HTTP/1.0 200 OK
Content-Type: text/plain
Connection: close
HTTP Status: 200 OK
Transfer counter 13088764457 to 13088764457 bytes
Timer 1610462202357 to 1610462204365 ms
Bytes transferred = 0 bytes over time 2008 milliseconds
Bandwidth = 0 Mbits
"Failure" output, when the current bandwidth exceeds the limit:
curl -D - http://dev0:8099/
HTTP/1.0 503 Too Busy
Content-Type: text/plain
Connection: close
HTTP Status: 503 Too Busy
Transfer counter 13558896787 to 13814984187 bytes
Timer 1610462222625 to 1610462224639 ms
Bytes transferred = 256087400 bytes over time 2014 milliseconds
Bandwidth = 970 Mbits
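The loadtest.sh script itself is not reproduced here, so the following is only a minimal sketch of how such a responder could work, reconstructed from the output shown above. It assumes a Linux node with sysfs byte counters and GNU date; the variable names IF, LIMIT_MBITS and SAMPLE_SECS, the use of the tx_bytes counter, and all function names are assumptions. The conversion factor of 1048576 bits per Mbit is inferred from the sample output (256087400 bytes over 2014 ms reported as 970 Mbits).

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a loadtest.sh-style responder; not the original script.
# IF, LIMIT_MBITS and SAMPLE_SECS are assumed names.
IF="${IF:-eth0}"                    # network interface to monitor
LIMIT_MBITS="${LIMIT_MBITS:-800}"   # answer 503 above this bandwidth
SAMPLE_SECS="${SAMPLE_SECS:-2}"     # pause between the two counter samples

# Bytes transmitted on $IF so far, from the Linux kernel's sysfs counters.
read_counter() { cat "/sys/class/net/$IF/statistics/tx_bytes" 2>/dev/null; }

# Current time in milliseconds (needs GNU date for %N).
now_ms() { date +%s%3N; }

# bandwidth_mbits BYTES MILLISECONDS -> whole Mbit/s, with 1 Mbit = 1048576 bits
bandwidth_mbits() { echo $(( $1 * 8 * 1000 / $2 / 1048576 )); }

# status_for MBITS -> status text for the HTTP status line
status_for() {
  if (( $1 > LIMIT_MBITS )); then echo "503 Too Busy"; else echo "200 OK"; fi
}

# send_response STATUS LINE... -> minimal HTTP/1.0 response on stdout
send_response() {
  local status=$1; shift
  printf 'HTTP/1.0 %s\r\nContent-Type: text/plain\r\nConnection: close\r\n\r\n' "$status"
  printf 'HTTP Status: %s\n' "$status"
  printf '%s\n' "$@"
}

main() {
  local request b1 b2 t1 t2 bytes ms mbits
  read -r -t 1 request || true      # consume the request line, if one arrives
  if ! b1=$(read_counter); then
    send_response "500 Internal Server Error" "Cannot read counter for $IF"
    return 0
  fi
  t1=$(now_ms)
  sleep "$SAMPLE_SECS"
  b2=$(read_counter); t2=$(now_ms)
  bytes=$(( b2 - b1 )); ms=$(( t2 - t1 ))
  (( ms > 0 )) || ms=1              # avoid division by zero
  mbits=$(bandwidth_mbits "$bytes" "$ms")
  send_response "$(status_for "$mbits")" \
    "Transfer counter $b1 to $b2 bytes" \
    "Timer $t1 to $t2 ms" \
    "Bytes transferred = $bytes bytes over time $ms milliseconds" \
    "Bandwidth = $mbits Mbits"
}

# Run only when executed directly, so the functions can also be sourced.
if [[ "${BASH_SOURCE[0]}" == "$0" ]]; then main; fi
```

Because the script only reads STDIN and writes STDOUT, it can be run by hand or attached to a socket by a service manager. The arithmetic lives in small functions so that the threshold logic can be checked without generating real traffic.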
Put loadtest.sh somewhere appropriate, such as /usr/local/bin, and make it executable.
Edit loadtest.sh to define the correct network interface to monitor and the bandwidth threshold.
Test loadtest.sh by writing an HTTP request to STDIN:
printf "GET /\r\n\r\n" | /usr/local/bin/loadtest.sh
HTTP/1.0 200 OK
Content-Type: text/plain
Connection: close
HTTP Status: 200 OK
Transfer counter 14170650531 to 14170650531 bytes
Timer 1610462694117 to 1610462696125 ms
Bytes transferred = 0 bytes over time 2008 milliseconds
Bandwidth = 0 Mbits
Configure systemd to run this script in response to a connection to port 8099.
File /etc/systemd/system/loadtest.socket:
[Unit]
Description=HTTP service for load testing health check
[Socket]
ListenStream=8099
Accept=yes
[Install]
WantedBy=sockets.target
File /etc/systemd/system/loadtest@.service:
[Unit]
Description=Load Test HTTP health check script
[Service]
ExecStart=-/usr/local/bin/loadtest.sh
StandardInput=socket
User=nginx
Group=nginx
Reload systemd so that it picks up the new unit files, start the new service, and test with a web client:
systemctl daemon-reload
systemctl start loadtest.socket
systemctl status loadtest.socket
● loadtest.socket - HTTP service for load testing health check
Loaded: loaded (/etc/systemd/system/loadtest.socket; enabled; vendor preset: enabled)
Active: active (listening) since Tue 2021-01-12 14:50:42 UTC; 4s ago
Listen: [::]:8099 (Stream)
Accepted: 1; Connected: 0;
Tasks: 0 (limit: 4620)
Memory: 52.0K
CGroup: /system.slice/loadtest.socket
Jan 12 14:50:42 dev0 systemd[1]: Listening on HTTP service for load testing health check.
curl -D - localhost:8099
Check /var/log/syslog for errors. For example, make sure that the nginx user and group (nginx:nginx) can access and run the script.
If you need to simulate high traffic, one approach is to scp a large file to /dev/null:
dd if=/dev/zero of=/tmp/1G bs=1M count=1024
scp /tmp/1G user@localhost:/dev/null
In this case, make sure that the script monitors the loopback interface by setting IF=lo in loadtest.sh.
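To watch the health endpoint change state while the transfer runs, a small polling helper can be useful. The sketch below assumes curl is installed and that the endpoint is reachable at http://localhost:8099/; the function names health_code and watch_health are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: poll the health endpoint and print the HTTP status code of each probe.
# The default URL matches the local test in the text; function names are hypothetical.

# health_code [URL] -> three-digit status code ("000" if the connection fails)
health_code() {
  curl -s -o /dev/null -w '%{http_code}' "${1:-http://localhost:8099/}"
}

# watch_health COUNT [URL]: probe COUNT times, pausing 1s between probes
watch_health() {
  local i
  for (( i = 1; i <= $1; i++ )); do
    printf '%s %s\n' "$(date +%T)" "$(health_code "${2:-}")"
    sleep 1
  done
}
```

Running watch_health 10 in one terminal while the scp transfer runs in another should show the code switch from 200 to 503 once the measured bandwidth exceeds the configured threshold, and back to 200 after the transfer finishes.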
Use the NGINX Plus dashboard to view the real-time status of the health checks you've configured.