When a process spawns a child with fork() (usually followed by one of the
exec() family of functions), the child process inherits all of the parent's
open file descriptors, except that descriptors marked close-on-exec are closed
when the child calls exec(). This means that if a parent has two file
descriptors that are entangled (i.e., a read end and a write end), the parent
process can simply write to the write end, and the child process can read from
the read end.
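The inheritance described above can be sketched with a pipe and fork(). This is only an illustration: the helper name bytes_read_by_child is hypothetical, and reporting the byte count through the child's exit status is a shortcut chosen to keep the sketch small.

```cpp
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <string>

// Hypothetical helper (not part of this collection): forks a child that
// inherits both ends of a pipe, and returns how many bytes the child read.
// The count travels back in the exit status, so messages must be < 256 bytes.
int bytes_read_by_child(const std::string& message) {
  assert(message.size() < 256);
  int fds[2];  // fds[0] is the read end, fds[1] is the write end
  if (pipe(fds) == -1) return -1;

  pid_t pid = fork();
  if (pid == -1) {
    close(fds[0]);
    close(fds[1]);
    return -1;
  }

  if (pid == 0) {
    // Child: fds[0] and fds[1] are valid here because they were inherited.
    close(fds[1]);  // the child only reads
    char buf[64];
    ssize_t n;
    int total = 0;
    while ((n = read(fds[0], buf, sizeof(buf))) > 0) {
      total += static_cast<int>(n);
    }
    _exit(total);
  }

  // Parent: write to the write end, then close it so the child sees EOF.
  close(fds[0]);
  ssize_t written = write(fds[1], message.data(), message.size());
  close(fds[1]);

  int status = 0;
  if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status)) return -1;
  if (written != static_cast<ssize_t>(message.size())) return -1;
  return WEXITSTATUS(status);
}
```

Calling bytes_read_by_child("hello") should report that the child read all five bytes, even though the child never opened the pipe itself.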
                   ┌────────────────┐
                   │   Process A    │
                   │                │
                   │  fd1=1         │
                   │  fd2=2         │
                   └───┬────────────┘
                       │
┌────────────────┐     │
│   Process B    │◄────┘
│                │
│  fd1=1         │
│  fd2=2         │
└────────────────┘
In this example, A can write to fd1, and B can read from fd2, establishing
basic interprocess communication (IPC). But in large, long-running production
software, processes are often spun up dynamically and need to communicate with
other processes that they didn't inherit any I/O primitives from. They need to
learn about, e.g., file descriptors connected to peer processes on the fly.
                   ┌────────────────┐
                   │   Process A    │
                   │                │
                   │  fd1=1  fd3=3  │
                   │  fd2=2  fd4=4  │
                   └───┬────────┬───┘
                       │        │
┌────────────────┐     │        │     ┌────────────────┐
│   Process B    │◄────┘        └────►│   Process C    │
│                │                    │                │
│  fd1=1         │                    │  fd1=1         │
│  fd2=2         │                    │  fd2=2         │
└────────────────┘                    └────────────────┘
In this example, C needs to communicate with B, but has no I/O primitives
connected to B to facilitate this. However, A can teach C about B by sending it
a file descriptor that's connected to B. Specifically, A might create a brand
new pair of entangled file descriptors and send one end to B and the other to
C, establishing a direct line of communication between them.
                   ┌────────────────┐
                   │   Process A    │
                   │                │
                   │  fd1=1  fd3=3  │
                   │  fd2=2  fd4=4  │
                   │                │
                   │     fd5=5──────┼─────────┐
        ┌──────────┼────fd6=6       │         │
        │          └───┬────────┬───┘         │
        │              │        │             │
┌───────▼────────┐     │        │     ┌───────▼────────┐
│   Process B    │◄────┘        └────►│   Process C    │
│                │                    │                │
│  fd1=1         │                    │  fd1=1         │
│  fd2=2         │                    │  fd2=2         │
└────────────────┘                    └────────────────┘
...and finally
                   ┌────────────────┐
                   │   Process A    │
                   │                │
                   │  fd1=1  fd3=3  │
                   │  fd2=2  fd4=4  │
                   └───┬────────┬───┘
┌────────────────┐     │        │     ┌────────────────┐
│   Process B    │◄────┘        └────►│   Process C    │
│                │                    │                │
│  fd1=1  fd3=3  │                    │  fd1=1  fd3=3  │
│  fd2=2     ▲   │                    │  fd2=2     │   │
└────────────┼───┘                    └────────────┼───┘
             └─────────────────────────────────────┘
But as you know, file descriptors are literally just ints, so it's not enough
for A to send C the bare number 5 (the value of the A:fd5 descriptor) in an
ordinary message. That value means nothing to the receiving process: a
descriptor is only an index into the table the kernel keeps for each process,
so it represents an underlying kernel resource and carries no power of its own.
We need to somehow tell the kernel to send this resource, the underlying socket
identified by its descriptor, to another process.
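To make this concrete, here is a small sketch in which a child opens a descriptor after the fork and hands only its integer value back to the parent. The helper name is hypothetical, and the exit status stands in for "a random message"; the point is only that the parent finds nothing behind that number in its own descriptor table.

```cpp
#include <fcntl.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>

// Hypothetical helper: the child opens a pipe *after* the fork and reports
// the integer value of one end. The parent then checks whether that integer
// refers to anything in its own descriptor table. Assumes the descriptor
// number is below 255 so it fits in an exit status.
bool child_fd_number_is_invalid_in_parent() {
  pid_t pid = fork();
  if (pid == -1) return false;

  if (pid == 0) {
    int fds[2];
    if (pipe(fds) == -1 || fds[0] >= 255) _exit(255);
    _exit(fds[0]);  // "send" the bare integer, not the kernel resource
  }

  int status = 0;
  if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status)) return false;
  int fd_number = WEXITSTATUS(status);
  if (fd_number == 255) return false;  // the child could not set things up

  // The parent never opened this descriptor, so the kernel rejects it.
  errno = 0;
  return fcntl(fd_number, F_GETFD) == -1 && errno == EBADF;
}
```

The number arrives intact, but the kernel resource it named in the child did not come with it.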
We do this by sending a "control" message from, e.g., A to C. Instead of just
sending normal bytes to C, a "control" message contains information in a format
that the kernel understands, directing it to send powerful capabilities in a
message and recover them on the other end (in C). Control messages must be sent
over Unix domain sockets via sendmsg and recovered via recvmsg.
All message data is encoded in the msghdr struct, including:
- The raw bytes of the message data (e.g., some text), in the msghdr.msg_iov member
- The control data in msghdr.msg_control
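As a rough sketch of the ordinary-data half of msghdr, the following sends a short string with sendmsg and reads it back with recvmsg, leaving the control fields empty. The helper name roundtrip_text is hypothetical; the iovec array is attached through the msg_iov and msg_iovlen members, which are the field names POSIX defines. It assumes short messages (under 256 bytes) on a local socket pair.

```cpp
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>
#include <cassert>
#include <string>

// Hypothetical helper: sends `text` across a socket pair with sendmsg() and
// returns whatever recvmsg() delivers on the other end (empty on error).
std::string roundtrip_text(const std::string& text) {
  if (text.empty()) return "";  // nothing to send; recvmsg would block
  int sv[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1) return "";

  // Sender side: one iovec pointing at the raw message bytes.
  iovec out_iov{};
  out_iov.iov_base = const_cast<char*>(text.data());
  out_iov.iov_len = text.size();
  msghdr out_msg{};
  out_msg.msg_iov = &out_iov;
  out_msg.msg_iovlen = 1;
  // msg_control stays null: this message carries no ancillary data.

  std::string received;
  if (sendmsg(sv[0], &out_msg, 0) == static_cast<ssize_t>(text.size())) {
    // Receiver side: one iovec pointing at a buffer to fill.
    char buf[256];
    iovec in_iov{};
    in_iov.iov_base = buf;
    in_iov.iov_len = sizeof(buf);
    msghdr in_msg{};
    in_msg.msg_iov = &in_iov;
    in_msg.msg_iovlen = 1;
    ssize_t n = recvmsg(sv[1], &in_msg, 0);
    if (n > 0) received.assign(buf, static_cast<size_t>(n));
  }
  close(sv[0]);
  close(sv[1]);
  return received;
}
```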
The control data is called "ancillary data", and shouldn't be accessed directly
via the msghdr struct. It should only be accessed through the
CMSG_FIRSTHDR(struct msghdr*) and CMSG_NXTHDR(...) macros, which both return
pointers to cmsghdr structs "inside" the outer msghdr. The control cmsghdr
structs are where control data is written to and read from.
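A minimal sketch of walking ancillary data with these macros follows. Both function names are hypothetical; the second one builds a control buffer locally (nothing is sent) purely so the walker has something to iterate over. The buffer is zero-initialized because some CMSG_NXTHDR implementations inspect the memory after the current header.

```cpp
#include <sys/socket.h>
#include <cassert>

// Hypothetical helper: counts the cmsghdr entries carried by `msg`,
// touching the control buffer only through the CMSG_* macros.
int count_control_messages(msghdr* msg) {
  int count = 0;
  for (cmsghdr* cmsg = CMSG_FIRSTHDR(msg); cmsg != nullptr;
       cmsg = CMSG_NXTHDR(msg, cmsg)) {
    ++count;
  }
  return count;
}

// Builds a msghdr whose control buffer has room for exactly one entry
// carrying an int, and returns what count_control_messages() reports.
int count_for_single_entry_message() {
  // Zero-initialized and aligned for cmsghdr.
  alignas(cmsghdr) char control[CMSG_SPACE(sizeof(int))] = {};
  msghdr msg{};
  msg.msg_control = control;
  msg.msg_controllen = sizeof(control);

  cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
  assert(cmsg != nullptr);
  cmsg->cmsg_level = SOL_SOCKET;
  cmsg->cmsg_type = SCM_RIGHTS;
  cmsg->cmsg_len = CMSG_LEN(sizeof(int));

  return count_control_messages(&msg);
}
```

A receiver would typically run the same loop over the msghdr filled in by recvmsg and inspect cmsg_level and cmsg_type on each entry.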
The cmsghdr struct definition can be found here. To send a file descriptor to
another process you need to do three things:
- Set cmsg_level = SOL_SOCKET
- Set cmsg_type = SCM_RIGHTS
- Set the actual data to the file descriptor value, via *reinterpret_cast<int*>(CMSG_DATA(cmsg)) = fd_to_send
The CMSG_DATA() macro is how you get a pointer to the actual data component of
the cmsghdr, which is where you write your file descriptor values into the
message. The level and type describe to the kernel exactly what data you're
sending. A cmsg_type of SCM_RIGHTS allows you to send "a set of open file
descriptors"; you can see the other message types here [1].
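The three steps can be sketched as a single sending function. The names send_fd and demo_send_fd are hypothetical and this is not the send_fd.cc from this collection. One deliberate swap: the descriptor is copied into CMSG_DATA() with memcpy instead of the reinterpret_cast assignment shown above, which avoids relying on the data area being aligned for an int. One byte of ordinary data is sent along with the control message.

```cpp
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>
#include <cassert>
#include <cstring>

// Hypothetical helper: sends `fd_to_send` over the Unix domain socket
// `socket_fd`, together with one byte of ordinary data.
bool send_fd(int socket_fd, int fd_to_send) {
  char byte = 'x';
  iovec iov{};
  iov.iov_base = &byte;
  iov.iov_len = 1;

  alignas(cmsghdr) char control[CMSG_SPACE(sizeof(int))] = {};
  msghdr msg{};
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;
  msg.msg_control = control;
  msg.msg_controllen = sizeof(control);

  cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
  cmsg->cmsg_level = SOL_SOCKET;  // step 1
  cmsg->cmsg_type = SCM_RIGHTS;   // step 2
  cmsg->cmsg_len = CMSG_LEN(sizeof(int));
  std::memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));  // step 3

  return sendmsg(socket_fd, &msg, 0) == 1;
}

// Exercises send_fd() by passing the read end of a pipe across a socket pair.
bool demo_send_fd() {
  int sv[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1) return false;
  int pipe_fds[2];
  if (pipe(pipe_fds) == -1) {
    close(sv[0]);
    close(sv[1]);
    return false;
  }
  bool ok = send_fd(sv[0], pipe_fds[0]);
  close(pipe_fds[0]);
  close(pipe_fds[1]);
  close(sv[0]);
  close(sv[1]);
  return ok;
}
```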
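Both msghdr.msg_controllen and cmsghdr->cmsg_len must be set accordingly: for a
single descriptor, msg_controllen is typically CMSG_SPACE(sizeof(int)), the
size of the whole control buffer including padding, and cmsg_len is
CMSG_LEN(sizeof(int)), the length of the header plus its data. See
documentation in the code examples for more info.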
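As a small illustration of how the two lengths relate, the sketch below wraps the standard CMSG_SPACE and CMSG_LEN macros in two hypothetical helpers that take the number of descriptors being sent in one SCM_RIGHTS message.

```cpp
#include <sys/socket.h>
#include <cassert>
#include <cstddef>

// Hypothetical helper: bytes to reserve for the control buffer, i.e. the
// value for msghdr.msg_controllen. Includes header and trailing padding.
std::size_t control_buffer_size(std::size_t fd_count) {
  return CMSG_SPACE(fd_count * sizeof(int));
}

// Hypothetical helper: the value for cmsghdr->cmsg_len. Covers the header
// and the descriptors themselves, without trailing padding.
std::size_t control_message_length(std::size_t fd_count) {
  return CMSG_LEN(fd_count * sizeof(int));
}
```

The exact numbers depend on the platform's alignment rules, which is why the macros should be used instead of hand-computed sizes.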
The send_fd.cc and receive_fd.cc files in this collection represent two
binaries. The sender is the parent process, which runs the child process. In
this example, the flow of events is:
- Parent creates a UNIX domain socket pair
- Parent runs child binary
- Child binary runs with the inherited socket pair, acting as a primordial IPC connection
- Parent creates another socket pair that the child clearly won't inherit
- Parent sends a message over the original socket with:
  - Some friendly text in iovec
  - Control ancillary data in msg_control, sending one end of the recently-created socket pair to the child
- Child reads the message text from the parent, and recovers the new socket descriptor
- Child reads another simple message from the new socket that it just recovered
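The flow above can be condensed into one file for experimentation. This is a sketch under stated assumptions, not the send_fd.cc and receive_fd.cc from this collection: it uses fork() alone in place of running a separate child binary, all function names are hypothetical, and the child echoes what it read back over the original socket so the caller can check the result.

```cpp
#include <sys/socket.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <cstring>
#include <string>

// Fills `msg` so it carries `iov` as data and `control` as ancillary space.
static void init_msg(msghdr* msg, iovec* iov, char* control, size_t len) {
  std::memset(msg, 0, sizeof(*msg));
  msg->msg_iov = iov;
  msg->msg_iovlen = 1;
  msg->msg_control = control;
  msg->msg_controllen = len;
}

// Parent side: sends `text` plus `fd_to_send` in a single message.
static bool send_text_and_fd(int sock, const std::string& text, int fd_to_send) {
  iovec iov{const_cast<char*>(text.data()), text.size()};
  alignas(cmsghdr) char control[CMSG_SPACE(sizeof(int))] = {};
  msghdr msg;
  init_msg(&msg, &iov, control, sizeof(control));
  cmsghdr* c = CMSG_FIRSTHDR(&msg);
  c->cmsg_level = SOL_SOCKET;
  c->cmsg_type = SCM_RIGHTS;
  c->cmsg_len = CMSG_LEN(sizeof(int));
  std::memcpy(CMSG_DATA(c), &fd_to_send, sizeof(int));
  return sendmsg(sock, &msg, 0) == static_cast<ssize_t>(text.size());
}

// Child side: receives text plus one descriptor; returns -1 if none arrived.
static int receive_text_and_fd(int sock, std::string* text) {
  char buf[128];
  iovec iov{buf, sizeof(buf)};
  alignas(cmsghdr) char control[CMSG_SPACE(sizeof(int))] = {};
  msghdr msg;
  init_msg(&msg, &iov, control, sizeof(control));
  ssize_t n = recvmsg(sock, &msg, 0);
  if (n <= 0) return -1;
  text->assign(buf, static_cast<size_t>(n));
  for (cmsghdr* c = CMSG_FIRSTHDR(&msg); c; c = CMSG_NXTHDR(&msg, c)) {
    if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_RIGHTS) {
      int fd = -1;
      std::memcpy(&fd, CMSG_DATA(c), sizeof(fd));
      return fd;
    }
  }
  return -1;
}

// Runs the whole flow; returns what the child read from the new socket.
std::string run_fd_passing_demo() {
  int original[2];  // inherited by the child
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, original) == -1) return "";
  pid_t pid = fork();
  if (pid == -1) return "";
  if (pid == 0) {
    close(original[0]);
    std::string greeting;
    int new_sock = receive_text_and_fd(original[1], &greeting);
    char buf[128];
    ssize_t n = new_sock == -1 ? -1 : read(new_sock, buf, sizeof(buf));
    // Echo back over the original socket so the parent can verify.
    bool ok = n > 0 && write(original[1], buf, static_cast<size_t>(n)) == n;
    _exit(ok ? 0 : 1);
  }
  close(original[1]);
  std::string result;
  int fresh[2];  // created after the fork, so the child cannot inherit it
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fresh) == 0) {
    const std::string second = "sent over the new socket";
    if (send_text_and_fd(original[0], "hello from parent", fresh[1]) &&
        write(fresh[0], second.data(), second.size()) ==
            static_cast<ssize_t>(second.size())) {
      char buf[128];
      ssize_t n = read(original[0], buf, sizeof(buf));
      if (n > 0) result.assign(buf, static_cast<size_t>(n));
    }
    close(fresh[0]);
    close(fresh[1]);
  }
  close(original[0]);
  waitpid(pid, nullptr, 0);
  return result;
}
```

The child never sees the fresh socket pair being created, yet it reads the second message through the descriptor it recovered from the control message.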
To run the example, do:
$ g++ send_fd.cc -std=c++0x -o send_fd
$ g++ receive_fd.cc -std=c++0x -o receive_fd
$ ./send_fd
This is a minimal toy example that simply demonstrates sending a file descriptor to another process. As a toy demo, it isn't that useful since the process receiving the descriptor already has a direct connection to the one sending it. A more interesting example would be a concrete implementation of the diagrams drawn above. Try and implement it yourself!