@andrewthad
Created May 31, 2019 13:38
Notes to self about pipes, stdout, and a faster logger
I'd like to write a library kind of like sockets, except that it would be used for dealing with named and unnamed Unix pipes instead. The interface would be really similar. One big question is "how does the whole blocking/nonblocking thing work?" Readiness notification doesn't work at all for regular files on Linux, but it works great for sockets. If a file descriptor (like stdin or stdout) is hooked up to a pipe, does it always report as ready, or does readiness work correctly? Pipes provide some kind of buffering, so maybe it works, but maybe it doesn't. I don't know.
On Linux, libuv always uses blocking IO when dealing with stdout. This is documented at http://nodejs.org/dist/v0.10.26/docs/api/process.html#process_process_stdout, but I do not understand why it was done. The IO manager that GHC ships with base requires some weirdness to get file descriptors to work nicely. It sometimes uses blocking IO and sometimes nonblocking IO, depending on whether the threaded or nonthreaded runtime is in use and on whether or not the file descriptor was created by the IO manager.
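One way to probe the readiness question is a small experiment. The sketch below is Linux-only and uses Python purely as a thin wrapper over the same syscalls (`pipe`, `epoll_ctl`, `epoll_wait`); the function names are hypothetical and it is not part of any planned library. It checks whether epoll tracks the state of a pipe's buffer on both ends, and what happens when a regular file is registered instead.

```python
import errno
import os
import select
import tempfile


def read_end_readiness():
    """Return (ready_when_empty, ready_after_write) for a pipe's read end."""
    r, w = os.pipe()
    ep = select.epoll()
    try:
        ep.register(r, select.EPOLLIN)
        when_empty = bool(ep.poll(0))   # nothing buffered yet
        os.write(w, b"hello")
        after_write = bool(ep.poll(0))  # data is now buffered
        return when_empty, after_write
    finally:
        ep.close()
        os.close(r)
        os.close(w)


def write_end_readiness():
    """Return (ready_when_empty, ready_when_full) for a pipe's write end."""
    r, w = os.pipe()
    os.set_blocking(w, False)
    ep = select.epoll()
    try:
        ep.register(w, select.EPOLLOUT)
        when_empty = bool(ep.poll(0))   # the kernel buffer has room
        try:
            while True:
                os.write(w, b"x" * 4096)  # fill the kernel buffer
        except BlockingIOError:
            pass                          # EAGAIN: the buffer is full
        when_full = bool(ep.poll(0))
        return when_empty, when_full
    finally:
        ep.close()
        os.close(r)
        os.close(w)


def regular_file_errno():
    """Register a regular file with epoll; return the errno, or None on success."""
    with tempfile.TemporaryFile() as f:
        ep = select.epoll()
        try:
            ep.register(f.fileno(), select.EPOLLIN)
            return None
        except OSError as e:
            return e.errno
        finally:
            ep.close()
```

If the experiment behaves as expected, the read end only reports ready once data is buffered, the write end stops reporting ready once the buffer is full, and registering a regular file is rejected outright.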
It would be nice to have a logger that uses writev to feed chunks directly to the file descriptor. This could be done by building up a cons list in which every element is either an (Addr,Int) pair or a ByteArray, then reversing the list and blasting the chunks out all at once with writev. It would be nice to be able to use the unsafe FFI for this, which is only reasonable if the call never blocks. If the file descriptor is a pipe or a socket, I think this is possible, since epoll works and the descriptor can be put in nonblocking mode. If it's a regular file, too bad.
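A minimal sketch of the cons-list idea, again in Python as a stand-in for the eventual Haskell. The names `cons`, `to_chunks`, and `flush` are hypothetical, `bytes` objects stand in for the (Addr,Int) and ByteArray chunks, and the descriptor is assumed to be blocking. A nonblocking version would additionally have to catch EAGAIN and wait for writability before retrying.

```python
import os

# Upper bound on the number of buffers accepted by a single writev call.
IOV_MAX = os.sysconf("SC_IOV_MAX")


def cons(chunk, rest):
    """Prepend a chunk to the log. The empty log is None."""
    return (chunk, rest)


def to_chunks(log):
    """Walk the cons list (newest first) and return the chunks oldest first."""
    out = []
    while log is not None:
        chunk, log = log
        out.append(chunk)
    out.reverse()
    return out


def flush(fd, log):
    """Write every chunk with writev, retrying after partial writes.

    Returns the total number of bytes written.
    """
    chunks = to_chunks(log)
    written = 0
    while chunks:
        n = os.writev(fd, chunks[:IOV_MAX])
        written += n
        # Drop the chunks that were written completely.
        while chunks and n >= len(chunks[0]):
            n -= len(chunks[0])
            chunks.pop(0)
        # Trim a chunk that was only partially written.
        if n:
            chunks[0] = memoryview(chunks[0])[n:]
    return written
```

Appending to the log is a constant-time cons, and the single reversal happens only at flush time, so the cost of ordering is paid once per batch rather than once per message.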
GHC's Handle type (and Handle__ as well) is rather complicated. It deals with a lot of concerns: character encodings, buffering, newline conversions, and tracking whether the handle is still open. If I want to build an efficient and reusable abstraction that does not carry as many of these concerns (e.g. something that is just for dealing with arbitrary binary data on a pipe), I need to think about the ergonomics of piling the other things on top later in other libraries.