On Tue Oct 27, 2015, history.state.gov began buckling under load, intermittently issuing 500 errors. Nginx's error log was sprinkled with the following errors:
    2015/10/27 21:48:36 [crit] 2475#0: accept4() failed (24: Too many open files)
    2015/10/27 21:48:36 [alert] 2475#0: *7163915 socket() failed (24: Too many open files) while connecting to upstream...
An article at http://www.cyberciti.biz/faq/linux-unix-nginx-too-many-open-files/ provided directions that mostly worked. Below are the steps we followed. The steps that diverged from the article's directions are marked with (*).
1. (*) Instead of using `su` to run `ulimit` on the nginx account, use `ps aux | grep nginx` to locate nginx's process IDs. Then query each process's file handle limits using `cat /proc/pid/limits` (where `pid` is a process ID retrieved from `ps`). Note: `sudo` may be necessary for the `cat` command, depending on your system. (See the first sketch after this list.)
2. Added `fs.file-max = 70000` to /etc/sysctl.conf.
3. Added `nginx soft nofile 10000` and `nginx hard nofile 30000` to /etc/security/limits.conf.
4. Ran `sysctl -p`.
5. Added `worker_rlimit_nofile 30000;` to /etc/nginx/nginx.conf. (Steps 2-5 are collected in the second sketch after this list.)
6. (*) While the directions suggested that `nginx -s reload` was enough to get nginx to recognize the new settings, not all of nginx's processes received them. Upon closer inspection of /proc/pid/limits (see #1 above), the first worker process still had the original S1024/H4096 limit on file handles. Even `nginx -s quit` didn't shut nginx down. The solution was to kill nginx with `kill pid`. After restarting nginx, all of the nginx-user-owned processes had the new limit of S10000/H30000 file handles. (The restart and verification are in the last sketch after this list.)
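For step 1, the check can be scripted. A minimal sketch, assuming a bash shell and that `pgrep nginx` finds the same processes as `ps aux | grep nginx`:

```bash
#!/usr/bin/env bash
# Inspect the open-file limits and current usage of every nginx process.
for pid in $(pgrep nginx); do
    echo "== nginx process $pid =="
    # Soft and hard limits on open files for this process.
    grep 'Max open files' "/proc/$pid/limits"
    # Number of file descriptors currently open (sudo may be needed
    # to read /proc entries owned by another user).
    echo "open file descriptors: $(sudo ls "/proc/$pid/fd" | wc -l)"
done
```

Comparing the open-descriptor count against the soft limit shows how close each worker is to triggering "Too many open files".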
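Steps 2-5 boil down to three file edits and one command. A sketch of the same changes expressed as shell commands, assuming the stock file locations named above and that none of these entries already exist (appending blindly can leave duplicate entries):

```bash
# Step 2: raise the system-wide limit on open file handles.
echo 'fs.file-max = 70000' | sudo tee -a /etc/sysctl.conf

# Step 3: raise the per-user soft/hard limits for the nginx account.
printf 'nginx soft nofile 10000\nnginx hard nofile 30000\n' | sudo tee -a /etc/security/limits.conf

# Step 4: apply the sysctl change without a reboot.
sudo sysctl -p

# Step 5: let nginx raise its workers' limit. The directive must sit in the
# main (top-level) context of nginx.conf; appending to the end of the file
# keeps it outside any block, but edit by hand if the directive already exists.
echo 'worker_rlimit_nofile 30000;' | sudo tee -a /etc/nginx/nginx.conf
```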
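And the restart from step 6. The sequence below is illustrative rather than prescriptive: the PID file path and the `service` invocation are assumptions about the host, and on our server only killing the master process actually cleared the old limits:

```bash
# nginx -s reload (and even -s quit) left workers running with the old
# limits, so find the master PID and kill it outright.
master_pid=$(cat /var/run/nginx.pid)   # common default; the path may differ
sudo kill "$master_pid"

# Start nginx again (via your init system if you prefer,
# e.g. `sudo service nginx start`).
sudo nginx

# Verify that every nginx-owned process now reports the new limits
# (expect soft 10000 / hard 30000 on each line).
for pid in $(pgrep nginx); do
    grep 'Max open files' "/proc/$pid/limits"
done
```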