# fork
zmqc -w PUSH ADDRESS...
# merge
zmqc -r PULL ADDRESS...
zmqn COMMAND...
Start COMMAND in n subprocesses, each running:
merge A ipc | COMMAND | fork B ipc
Start a sink like:
merge B ipc
Start a source like:
fork A ipc
- Signals must be propagated to all subprocesses.
- This only handles stdout. A separate sink may be needed for stderr, ideally one that also indicates which subprocess caused the issue.
- If a worker fails, records in flight can be lost. There must be a mechanism to ensure all records are processed (e.g. all workers must exit 0).
- Sources can split on arbitrary record separators (both the source for the workers and the source for the sink).
- The sink can print results in order. This is probably stupid: the sink would need a polling order dictated by the source (i.e. source-sink communication), and processing would block on the slowest record.
- Allow workers to be created on other machines. This could be a separate command that just starts the source and sink to push and pull from specific places.
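
The signal and exit-status points above could be sketched roughly as follows. This is only an illustration of what zmqn might do, not its implementation: `run_workers` is a hypothetical helper name, and the worker command is passed in as arguments so the sketch does not depend on zmqc being installed.

```sh
# Hypothetical sketch: run N copies of a command, forward INT/TERM
# to them, and succeed only if every copy exits 0.
run_workers() {
    local n="$1" pids="" pid status=0 i=0
    shift

    # Start the n workers in the background and remember their PIDs.
    while [ "$i" -lt "$n" ]; do
        "$@" &
        pids="$pids $!"
        i=$((i + 1))
    done

    # Forward INT and TERM to every worker (delivered as TERM).
    trap 'kill -TERM $pids 2>/dev/null' INT TERM

    # Any non-zero exit (or an interrupted wait) marks the run as failed.
    for pid in $pids; do
        wait "$pid" || status=1
    done

    trap - INT TERM
    return "$status"
}
```

It might be used as `run_workers 4 sh -c 'merge A ipc | COMMAND | fork B ipc'`. Two caveats with that form: the signal only reaches the wrapper shell, not necessarily every stage of the pipeline, and the pipeline's exit status is that of its last stage, so a failing COMMAND would go unnoticed unless something like bash's `pipefail` option is used.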
Perhaps stderr can be put to good use here: the source and sink could report progress, for example.
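
One way progress reporting might look, as a minimal sketch: a pass-through filter that copies newline-separated records from stdin to stdout and writes a running count to stderr. The name `progress`, its arguments, and the message format are all made up for illustration.

```sh
# Hypothetical sketch: pass records through unchanged, report counts on stderr.
# $1 is a label for the messages, $2 is how often to report (default 1000).
progress() {
    local label="$1" every="${2:-1000}" count=0 line

    # The "|| [ -n ... ]" keeps a final record that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "$line"
        count=$((count + 1))
        if [ $((count % every)) -eq 0 ]; then
            printf '%s: %d records\n' "$label" "$count" >&2
        fi
    done

    printf '%s: %d records (done)\n' "$label" "$count" >&2
}
```

A source could then be started as `progress source < input | fork A ipc` and a sink as `merge B ipc | progress sink`, leaving stdout untouched for the records themselves.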