@creationix
Last active December 30, 2020 03:58
Simple RPC design

I've designed a lot of RPC protocols in my career. One pattern that's worked well basically goes as follows:

// Client calls: print("Hello World!\n")
-> [1, "print", "Hello World!\n"]
// Server sends return value (or lack of return value)
<- [-1]

// Client calls: add(1, 2)
-> [2, "add", 1, 2]
// Server responds with result
<- [-2, 3]

// Client calls a function that throws an error
-> [3, "crasher"]
// Server sends error using lua/go style multiple return values.
<- [-3, null, "There was a problem"]

// Client passes in a callback to be called later
// The function is serialized into something special that both sides agree is a function.
-> [4, "setInterval", {$:1}, 1000]
// server returns with function to cancel the interval
<- [-4, {$:2}]
// Then later the callback fires every second from Server
<- [0, 1] // zero request id means we don't need a response
// Server...
<- [0, 1] // server is calling anonymous onInterval callback again
// Eventually client wants to cancel interval
-> [0, 2] // client calls anonymous clearInterval function

Or defined generally:

// Request
[request_id, target, ...arguments]
// Response
[-request_id, ...return_values]
// Event
[0, target, ...arguments]

Where request_id is a positive integer, target is a string or positive integer, and arguments and return_values are any serializable values.
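
As a rough illustration of the three shapes above, here is a minimal TypeScript sketch that sorts an incoming array into request, response, or event. The names `classify`, `Classified`, and `isTarget` are hypothetical and not part of the original design; only the array layout comes from the text.

```typescript
// Hypothetical helper types; the protocol itself only fixes the array layout.
type Target = string | number;

type Classified =
  | { kind: "request"; id: number; target: Target; args: unknown[] }
  | { kind: "response"; id: number; values: unknown[] }
  | { kind: "event"; target: Target; args: unknown[] };

// A target is a string (named endpoint) or a positive integer (anonymous function).
function isTarget(value: unknown): value is Target {
  return (
    typeof value === "string" ||
    (typeof value === "number" && Number.isInteger(value) && value > 0)
  );
}

function classify(message: unknown): Classified {
  if (!Array.isArray(message) || !Number.isInteger(message[0])) {
    throw new TypeError("message must be an array starting with an integer");
  }
  const [id, ...rest] = message as [number, ...unknown[]];
  if (id < 0) {
    // Negative id: a response to the request with the matching positive id.
    return { kind: "response", id: -id, values: rest };
  }
  const [target, ...args] = rest;
  if (!isTarget(target)) {
    throw new TypeError("target must be a string or positive integer");
  }
  // Zero id: an event that expects no response. Positive id: a request.
  return id === 0
    ? { kind: "event", target, args }
    : { kind: "request", id, target, args };
}
```

For example, `classify([2, "add", 1, 2])` yields a request with id 2, and `classify([-2, 3])` yields the response to it.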

So a few things to note:

  • Messages are always arrays. These serialize well and are compact in most formats. I tend to use either JSON or CBOR.
  • The first value is a request id (positive integer), a response id (negative integer), or 0 for a message that doesn't need a response.
  • For a request or event, the second value is the function to call. It can be a named endpoint or the integer id of an anonymous function.
  • The rest of the values are arguments for a request/event and return values for a response.
  • The convention for errors is two return values: null, message. This matches Lua-style error handling, but can be mapped to many other languages.
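
One way the `null, message` error convention might map onto a language with exceptions is sketched below in TypeScript. `wrapCall` and `unwrapReturnValues` are hypothetical names, and treating "first value null, second value a string" as an error is an assumption of this sketch: a function that legitimately returns `null` followed by a string would be indistinguishable from a failure.

```typescript
// Server side: run a local function and turn its outcome into return values.
function wrapCall(fn: () => unknown): unknown[] {
  try {
    const result = fn();
    // No return value becomes an empty list, so the response is just [-id].
    return result === undefined ? [] : [result];
  } catch (err) {
    // Errors travel as two return values: null, message.
    return [null, err instanceof Error ? err.message : String(err)];
  }
}

// Client side: turn return values back into a result or a thrown Error.
function unwrapReturnValues(values: unknown[]): unknown {
  const [first, second] = values;
  if (values.length >= 2 && first === null && typeof second === "string") {
    throw new Error(second);
  }
  return first;
}
```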

I tend to use CBOR (I used to use msgpack a lot) because it's compact and allows binary values to be passed through. CBOR also allows registering custom types, so serializing functions doesn't need the {$:id} format used in JSON.

This isn't perfect, but it works well for lots of use cases, is very fast and efficient, and is very easy to implement.

The namespace for request/response IDs is defined by the caller. Request 1 by one peer is not the same as request 1 by the other peer. This is why negative is used for responses.
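
A minimal TypeScript sketch of this per-caller namespace follows. Each peer keeps its own counter and its own table of pending requests, so two peers can both have a request 1 in flight without conflict. The `Caller` class and its method names are hypothetical, and callbacks are used instead of promises only to keep the sketch short.

```typescript
type Send = (message: unknown[]) => void;
type OnReturn = (values: unknown[]) => void;

class Caller {
  private nextId = 1; // ids are local to this peer
  private pending = new Map<number, OnReturn>();
  private send: Send;

  constructor(send: Send) {
    this.send = send;
  }

  // Send [request_id, target, ...arguments] and remember who wants the reply.
  call(target: string | number, args: unknown[], onReturn: OnReturn): number {
    const id = this.nextId++;
    this.pending.set(id, onReturn);
    this.send([id, target, ...args]);
    return id;
  }

  // Feed incoming messages here; returns true if one settled a pending request.
  handleResponse(message: unknown[]): boolean {
    const [first, ...values] = message;
    if (typeof first !== "number" || first >= 0) return false;
    const onReturn = this.pending.get(-first);
    if (onReturn === undefined) return false;
    this.pending.delete(-first);
    onReturn(values);
    return true;
  }
}
```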

In the same manner, anonymous callbacks are serialized using integers defined by the peer that owns and sends the function. One interesting side effect is that it's not possible to send a function back to its owner, because it would be interpreted as a new function owned by the sender.
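
The ownership rule can be sketched in TypeScript as follows, using the JSON-style `{$:id}` form. `CallbackTable` and `proxyFor` are hypothetical names. The owner hands out integer ids for its own functions; the receiving peer turns any reference it sees into a proxy that sends an event back, which is why a reference is always read as belonging to whoever sent it.

```typescript
type FunctionRef = { $: number };
type Callback = (...args: unknown[]) => void;
type Send = (message: unknown[]) => void;

// Owner side: functions this peer has sent out, keyed by ids it allocated.
class CallbackTable {
  private nextId = 1;
  private owned = new Map<number, Callback>();

  // Outgoing: replace a local function with a reference the peer can call.
  toRef(fn: Callback): FunctionRef {
    const id = this.nextId++;
    this.owned.set(id, fn);
    return { $: id };
  }

  // Incoming call addressed by integer: run the locally owned function.
  invoke(id: number, args: unknown[]): void {
    const fn = this.owned.get(id);
    if (fn === undefined) throw new Error(`unknown callback ${id}`);
    fn(...args);
  }
}

// Receiver side: a reference from the peer becomes a proxy that sends an
// event ([0, target, ...arguments]) back to the owner.
function proxyFor(ref: FunctionRef, send: Send): Callback {
  return (...args: unknown[]) => send([0, ref.$, ...args]);
}
```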

@billiegoose

@creationix What about lifetime management of callback functions? Ideally after the "setInterval" is cancelled you would be able to unregister the {$:1} function so it can be garbage collected. Has that ever been an issue?

@creationix
Author

creationix commented Sep 12, 2019

@wmhilton. Yes, this is often a problem with serializing functions that can be called zero, one, or more times. Sometimes I manually manage the lifetimes in my APIs since I know when to expect more calls. Sometimes I don't worry about it since it's arena allocated to the TCP session and my connections might be fairly short lived. Sometimes I'll have explicit mini sessions inside the socket for finer grained arena allocation and cleanup.
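
A rough TypeScript sketch of the arena idea, assuming one table of callbacks per session: individual entries can be released manually when the API knows no more calls are coming, and everything left over is dropped when the session closes. `CallbackArena` and its methods are hypothetical names.

```typescript
type Callback = (...args: unknown[]) => void;

class CallbackArena {
  private nextId = 1;
  private callbacks = new Map<number, Callback>();

  // Register a callback for the lifetime of this session (or until released).
  register(fn: Callback): number {
    const id = this.nextId++;
    this.callbacks.set(id, fn);
    return id;
  }

  get(id: number): Callback | undefined {
    return this.callbacks.get(id);
  }

  // Manual lifetime management: drop one callback, e.g. after clearInterval.
  release(id: number): boolean {
    return this.callbacks.delete(id);
  }

  // Arena cleanup: drop everything when the session or socket ends.
  // Returns how many callbacks were still registered.
  close(): number {
    const count = this.callbacks.size;
    this.callbacks.clear();
    return count;
  }
}
```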

@creationix
Author

Also if the language has gc hooks and weak refs (such as Lua), it's possible to send a tiny message across when the proxy callback is GCed so the host knows to release its reference.

@billiegoose

I'm curious (because I ran into this today) how you'd approach dealing with really large blobs. Say one of the arguments or return values is a gigabyte Uint8Array and because of frame size limits in your transport, you need to break the object into chunks or transport it via a second channel. Any wisdom on how to do that nicely?

@creationix
Author

creationix commented Oct 21, 2019

@wmhilton, my assumption here is that the underlying APIs being exposed over the network never send giant chunks. So it would be the responsibility of the API designer to offer a streaming response or send it via a second channel.

For streaming, the receiver can pass in a callback function and the sender can repeatedly call that callback function with chunks, all before returning.

// Receiver requests a giant file, passing in a function to receive chunks
-> [1, "downloadBigThing", {$:2}]
// Sender calls that function in event mode, once per chunk
<- [0, 2, firstChunk]
<- [0, 2, secondChunk]
<- ...
// Sender finally returns
<- [-1]
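
The sender's side of such a streaming call might look like the TypeScript sketch below. It assumes the chunks are delivered by calling the receiver's callback in event mode, using the `[0, target, ...arguments]` form from the general definition, followed by the final empty response. `chunkMessages` and `concat` are hypothetical helpers, and the chunk size is whatever the transport's frame limit allows.

```typescript
// Build the messages the sender would emit for one streamed download:
// one event per chunk addressed to the receiver's callback, then the return.
function chunkMessages(
  data: Uint8Array,
  chunkSize: number,
  callbackId: number,
  requestId: number,
): unknown[][] {
  if (chunkSize <= 0) throw new RangeError("chunkSize must be positive");
  const messages: unknown[][] = [];
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    // subarray creates a view, so no bytes are copied here.
    messages.push([0, callbackId, data.subarray(offset, offset + chunkSize)]);
  }
  messages.push([-requestId]); // sender finally returns
  return messages;
}

// Receiver side: join the chunks collected by the callback.
function concat(chunks: Uint8Array[]): Uint8Array {
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}
```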

Though when working with such large files, I would hope it's broken up into content-hashed chunks and can be swarmed BitTorrent-style over some P2P network (which could also use this protocol for swarming).

In that case, you could simply respond with the list of chunk hashes, root hash, seed peers, etc. Basically, send whatever helps the client fetch over the swarm.

@billiegoose

Thanks! Breaking it into chunks and "streaming" the result via callbacks makes sense.

@ggoodman

ggoodman commented Nov 25, 2019

I've taken the (awesome) ideas from here and attempted to encode them in a re-usable lib: https://github.com/ggoodman/rpc.

A notable difference is that I've introduced the idea of Codecs that encode certain types for passing across the Transport. Codecs are named interfaces that can encode into the wire format and decode from the wire format. Instead of {$: 1} representing anonymous callback #1, Codecs will encode this as {$: 'Function', id: 1} in the attached lib. The same mechanism is used for propagating errors.

Another notable design divergence I took is to say that anonymous functions will propagate their return value (or completion) back to callers. To accomplish this, I tweaked the protocol such that:

// Client passes in a callback to be called later
// The function is serialized into something special that both sides agree is a function.
-> [4, "setInterval", {$:1}, 1000]

// Then later the callback fires every second from Server
<- [0, 1, 1] // this is request 1 of the anonymous function 1
// The interval handler will be called locally and upon completion, a completion message will be sent
-> [0, -1] // The protocol diverges here by using `[0, -N]` as an anonymous function receipt

// The callback fires again
<- [0, 1, 2] // this is request 2 of the anonymous function 1
-> [0, -2]

Love how simple the protocol is that you've introduced @creationix. What I'm now wondering is whether I should introduce the concept of Peer identity for situations where you might want to multiplex Peers over a transport. 🤔

@creationix
Author

creationix commented Dec 4, 2019

@ggoodman, as I mentioned above, I tend to use CBOR as the serialization in this RPC. That format already has the built-in ability to register codecs at the serialization level, so you don't need to encode it specially here. https://cbor.io/

Also CBOR is very compact and supports binary values unlike JSON.

Also, I'm not sure I made it clear, but there is nothing special about anonymous functions compared to named functions when you call them. You can call both in either event mode (you don't care about their result) or function mode (you want to get the result/completion). The only difference is using a number or a string to address the function.
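
To make that concrete, here is a minimal TypeScript sketch of a dispatcher in which the only difference between named and anonymous targets is which table is consulted, and the only difference between the two modes is whether a response is sent. `dispatch` and `Handler` are hypothetical names.

```typescript
type Handler = (...args: unknown[]) => unknown;

function dispatch(
  message: [number, string | number, ...unknown[]],
  named: Map<string, Handler>,
  anonymous: Map<number, Handler>,
  send: (message: unknown[]) => void,
): void {
  const [id, target, ...args] = message;
  // A string addresses a named function, a number an anonymous one.
  const handler =
    typeof target === "string" ? named.get(target) : anonymous.get(target);
  let values: unknown[];
  try {
    if (handler === undefined) {
      throw new Error(`no such target: ${String(target)}`);
    }
    const result = handler(...args);
    values = result === undefined ? [] : [result];
  } catch (err) {
    values = [null, err instanceof Error ? err.message : String(err)];
  }
  // Function mode (id > 0) replies with [-id, ...values];
  // event mode (id 0) discards the result.
  if (id > 0) send([-id, ...values]);
}
```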

See my first comment above for an example https://gist.github.com/creationix/939d46128c69f33e505f54097723092f#gistcomment-3022785

@ggoodman

ggoodman commented Dec 4, 2019

@creationix, I missed the nuances about event-mode and function-mode and can now see how it fits easily into the simpler model. I think I'll make those changes in my lib :D

As for CBOR, TIL (and starred). Looks like a great encoding format for binary transports.

One area I struggle with, conceptually, is deeply nested objects, especially in JavaScript; do you traverse properties? What about prototypes and prototypical values and methods? What about function binding semantics for methods?

Another area of puzzlement is taking advantage of transport-specific features. For example, in browser workers and node's require('worker_threads') you have the option to pass 'transferable objects'. The semantics of this functionality are specific to the transport. I have a tough time coming up with a clean transport / codec separation that will permit taking advantage of transferable objects.

My current thinking is to use the concept of a Codec as an escape-hatch. Instead of traversing objects and trying to best-effort encode them, I'm testing only against top-level arguments. If a consumer of the lib wants to transfer an AbortSignal (or any other object having application-specific semantics), they would need to first register a codec for such objects. In the lib, I've shipped with built-in codecs for Error objects and Functions. The function codec is what marshals between the {$: 'function', id: N} encoding and registering anonymous functions.
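
The escape-hatch idea could be sketched in TypeScript roughly as below. This is not the API of the lib mentioned above: `Codec`, `CodecRegistry`, and the `{ $: name, data }` wire shape are assumptions made for illustration. Only top-level arguments are tested against registered codecs, and anything unmatched passes through untouched.

```typescript
// Hypothetical codec interface: recognize a value, encode it, decode it.
interface Codec<T> {
  name: string;
  canEncode(value: unknown): value is T;
  encode(value: T): unknown;
  decode(wire: unknown): T;
}

// Assumed wire shape for encoded values; the real lib may differ.
type Tagged = { $: string; data: unknown };

class CodecRegistry {
  private codecs: Codec<any>[] = [];

  register<T>(codec: Codec<T>): void {
    this.codecs.push(codec);
  }

  // Test a single top-level argument; no traversal of nested properties.
  encodeArg(value: unknown): unknown {
    for (const codec of this.codecs) {
      if (codec.canEncode(value)) {
        return { $: codec.name, data: codec.encode(value) };
      }
    }
    return value;
  }

  decodeArg(wire: unknown): unknown {
    if (typeof wire === "object" && wire !== null && "$" in wire) {
      const tagged = wire as Tagged;
      const codec = this.codecs.find((c) => c.name === tagged.$);
      if (codec !== undefined) return codec.decode(tagged.data);
    }
    return wire;
  }
}

// Example codec: carry only the message of an Error across the wire.
const errorCodec: Codec<Error> = {
  name: "Error",
  canEncode: (value: unknown): value is Error => value instanceof Error,
  encode: (error: Error) => error.message,
  decode: (wire: unknown) => new Error(String(wire)),
};
```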
