@cshaa
Last active September 2, 2024 20:22

Note

This document is an RFC (Request for Comments), and should be treated as a suggestion to the community, and not as an actual standard. If you have any feedback, ideas or implementations you'd like to share regarding Structured Pipes, please feel free to drop these in the comment section below.

Structured Unix Pipes

An inter-process protocol that allows processes to communicate using structured data, much like Nushell's internal commands can. For a structured pipe a | b to work, the protocol must be supported by command a, command b, and the shell. The protocol is designed so that if either process doesn't support structured pipes, communication reverts to ordinary (unstructured) Unix pipes.

Structured pipes pass data in a machine-readable format. Before data starts flowing between the two programs, the shell process asks them which formats they understand and selects the most preferred format supported by both.
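As a rough illustration of the selection step, the sketch below picks a format for a pipe a | b from the list that a provides and the list that b accepts. This document does not define how preference is expressed, so the sketch assumes that list order encodes preference and that the producer's order wins. The function name select_format and the choice to return None when nothing matches are likewise hypothetical.

```python
from typing import List, Optional


def select_format(provided: List[str], accepted: List[str]) -> Optional[str]:
    """Pick the first MIME type the producer offers that the consumer accepts.

    Assumption (not stated in the RFC): earlier entries are more preferred,
    and the producer's ordering takes priority over the consumer's.
    """
    accepted_set = set(accepted)
    for mime in provided:
        if mime in accepted_set:
            return mime
    return None  # no common format; the RFC does not say what happens here


if __name__ == "__main__":
    a_provides = ["application/json", "text/plain"]
    b_accepts = ["text/csv", "text/plain", "application/json"]
    print(select_format(a_provides, b_accepts))
```

A process that fell back to plain pipes would simply appear here with ["text/plain"] as both of its lists.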

Specification

When setting up a structured pipeline, the individual processes go through several stages, starting with the SETUP stage.

1. SETUP

For each command in the pipeline, five file descriptors are to be created: stdin, stdout, stderr, ctlin, ctlout. The shell must keep a duplicate of each file descriptor for monitoring. The message \uffefStructuredPipe/0.1\n\n shall be written into ctlin of each command. For the first command in the pipeline, the stdin is filled with the input data, be it user input or a file.

Once the ctlin file descriptor of a command is ready to be read in a non-blocking manner, the process is to be dispatched: the shell process shall fork, then the child process shall duplicate the file descriptors stdin, stdout, stderr, ctlin, ctlout onto descriptors 0, 1, 2, 3, 4 respectively, and finally it shall execute the command. With this, the process enters the HANDSHAKE stage.
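The dispatch step could look roughly like the following sketch, which assumes a POSIX system and a shell written in Python. The function name dispatch, the use of F_DUPFD to move descriptors out of the 0..4 range before remapping, and the closerange upper bound of 256 are illustrative choices, not part of this proposal. The greeting string is copied verbatim from the text above. The sketch keeps only the shell's ends of the control pipes, not duplicates of every descriptor.

```python
import fcntl
import os

# Greeting copied verbatim from the SETUP description above.
GREETING = "\uffefStructuredPipe/0.1\n\n".encode("utf-8")


def dispatch(argv, stdin_fd, stdout_fd, stderr_fd):
    """Fork and exec argv with ctlin on fd 3 and ctlout on fd 4.

    Returns (pid, ctlin_write_fd, ctlout_read_fd) for the shell side.
    """
    ctlin_r, ctlin_w = os.pipe()    # shell writes, child reads on fd 3
    ctlout_r, ctlout_w = os.pipe()  # child writes on fd 4, shell reads
    os.write(ctlin_w, GREETING)     # queued before the child is started

    pid = os.fork()
    if pid == 0:
        try:
            sources = [stdin_fd, stdout_fd, stderr_fd, ctlin_r, ctlout_w]
            # Copy every source above the target range first, so that no
            # dup2 call overwrites a descriptor that is still needed.
            high = [fcntl.fcntl(fd, fcntl.F_DUPFD, 10) for fd in sources]
            for target, fd in enumerate(high):
                os.dup2(fd, target)  # targets 0..4 are inheritable
            os.closerange(5, 256)    # simplification: drop everything else
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)            # only reached if exec failed

    # Parent: close the child's ends of the control pipes.
    os.close(ctlin_r)
    os.close(ctlout_w)
    return pid, ctlin_w, ctlout_r
```

A child started this way can read the greeting from descriptor 3 and reply on descriptor 4 without any further setup.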

2. HANDSHAKE

In this stage, the shell polls stdin, stdout, ctlin and ctlout. If the child process writes into stdout or reads from stdin during this stage, or if either ctlin or ctlout closes, the process enters into the FALLBACK stage.

If the process fails to write into ctlout before a timeout elapses (the precise duration should be configurable in the shell, but a sane default is 250 ms), it shall enter the FALLBACK stage.
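The timeout rule can be sketched with select. The helper below is a hypothetical name and covers only the ctlout side of the polling: it does not watch stdin or stdout for activity, and it assumes the whole handshake message arrives in a single read of at most 4096 bytes.

```python
import os
import select


def wait_for_handshake(ctlout_fd, timeout=0.25):
    """Return the bytes read from ctlout, or None if FALLBACK applies.

    None is returned when nothing was written within the timeout, or
    when ctlout was closed (an empty read).
    """
    readable, _, _ = select.select([ctlout_fd], [], [], timeout)
    if not readable:
        return None                 # timed out: enter FALLBACK
    data = os.read(ctlout_fd, 4096)
    return data or None             # empty read means ctlout was closed
```

A fuller shell would loop over all four descriptors named above and accumulate reads until the terminating blank line is seen.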

The ctlout is expected to provide output in the form \uffefStructuredPipe/0.1\nAccept: [IN]\nProvide: [OUT]\n\n, where [IN] is a comma-separated list of MIME types that the process can accept on its stdin and [OUT] is a comma-separated list of MIME types that the process can write to its stdout. If the shell fails to parse the output of ctlout, the whole pipeline shall fail. If the shell succeeds in parsing the output of ctlout, it saves the supported MIME types and the process enters the READY stage.
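A minimal parser for the message format described above might look like this. The function names are hypothetical, and a few details are assumptions the text does not settle: the header order is treated as fixed (Accept, then Provide), whitespace around list items is stripped, and empty lists are allowed. A ValueError stands in for "the whole pipeline shall fail".

```python
from typing import List, Tuple

# First line of the message, copied verbatim from the format above.
MAGIC = "\uffefStructuredPipe/0.1"


def _parse_list(line: str, name: str) -> List[str]:
    """Parse 'Name: a, b, c' into ['a', 'b', 'c']."""
    prefix = name + ": "
    if not line.startswith(prefix):
        raise ValueError(f"expected {name!r} header, got {line!r}")
    items = line[len(prefix):].split(",")
    return [item.strip() for item in items if item.strip()]


def parse_handshake(message: str) -> Tuple[List[str], List[str]]:
    """Return (accepted, provided) MIME type lists from a ctlout message."""
    if not message.endswith("\n\n"):
        raise ValueError("handshake must end with a blank line")
    lines = message[:-2].split("\n")
    if len(lines) != 3 or lines[0] != MAGIC:
        raise ValueError("malformed handshake")
    accepted = _parse_list(lines[1], "Accept")
    provided = _parse_list(lines[2], "Provide")
    return accepted, provided


if __name__ == "__main__":
    reply = MAGIC + "\nAccept: text/plain, application/json\nProvide: application/json\n\n"
    print(parse_handshake(reply))
```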

3. FALLBACK

The shell marks this process as only accepting text/plain and only providing text/plain, then closes its ctlin and ctlout file descriptors. Subsequently, the process enters the READY stage.

4. READY

5. ACTIVE
