This only happens when the script is running inside a shell spawned by docker exec. When running under bash or under docker run, there is no defunct process.
This bug was first discovered when I ran nvm in Docker (creationix/nvm#650). It was then discovered that Docker's nsenter did not properly handle the SIGCHLD signal raised from zsh (docker/libcontainer#369). Even though it has now been fixed by Docker, I am not sure why zsh would cause this bug in the first place.
Steps to reproduce:
# terminal 1
docker run -it --rm --name test-zsh ubuntu:latest /bin/bash
apt-get update && apt-get install zsh curl
curl https://gist.githubusercontent.com/soareschen/240e49116c7f2632d179/raw/0be67acefd8d18fd62bb181998996f9a5772dc64/docker-zsh-test.sh > docker-zsh-test.sh
chmod +x docker-zsh-test.sh
./docker-zsh-test.sh
ps auxf # no defunct process
# terminal 2
docker exec -it test-zsh /bin/zsh
./docker-zsh-test.sh
ps auxf # sed and zsh shown as defunct processes
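To make the defunct entries easier to spot than in the full ps auxf tree, here is a small sketch of a filter. The function name filter_defunct is made up for this example, and it assumes the input comes from a ps invocation whose third column is STAT, where a leading "Z" marks a defunct process.

```shell
# Hypothetical helper: keep the header row plus every row whose STAT
# column (third field) starts with "Z", the state ps reports for
# defunct (zombie) processes.
filter_defunct() {
    awk 'NR == 1 || $3 ~ /^Z/'
}

# Example usage (assumes a procps-style ps):
# ps -eo pid,ppid,stat,comm | filter_defunct
```

The PPID column in the filtered output shows which process is expected to reap each defunct entry.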
I have talked to some people at Docker and zsh and found out what actually went wrong.
From my understanding, Linux has a feature called the subreaper, which lets a process take over the role of init and become the parent of orphaned descendant processes. In this case, on top of having different PID namespaces, docker exec spawns a process called nsenter to act as the subreaper. It is responsible for reaping defunct processes by handling SIGCHLD. Because it is the subreaper, all orphaned/defunct processes are captured by nsenter instead of propagating to the actual init process inside the container. Docker (before the fix) did not handle SIGCHLD properly, which caused all orphaned processes to become defunct.
On the other hand, when zsh is processing a script, it optimizes process handling and performs an implicit exec of the final command in the script. That way zsh need not hang around waiting for that process to exit, on the assumption that its own parent will reap it (zsh mla). In other words, zsh orphans all its child processes once it runs the last command. Because Docker's subreaper was not handling orphaned processes properly, this bug became particularly obvious when running zsh scripts.
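To illustrate what an exec of the final command means, here is a minimal sketch that uses an explicit exec in plain sh instead of relying on zsh's implicit one. The function name exec_keeps_pid is made up for this example. It only shows that the exec'd command replaces the shell and keeps its PID; it does not reproduce the Docker bug itself.

```shell
# Hypothetical demo: print the PID of a shell, then exec a second shell
# and print its PID. exec replaces the first shell in place, so no shell
# is left waiting and both lines show the same PID.
exec_keeps_pid() {
    sh -c 'echo "$$"; exec sh -c "echo \$\$"'
}
```

Calling exec_keeps_pid prints the same number twice. Any child that the first shell started earlier would now belong to a process that knows nothing about it, which is why reaping falls to whoever ends up as the parent.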