- Naming
  - `L1`: an L1 machine
  - `l1`: an individual l1 worker process
- The overview graph below lists all involved components. In later graphs, irrelevant components are omitted for readability; this doesn't mean they have been removed.
- L2->l1 routing
- l1->L2 routing
```mermaid
flowchart TB
  subgraph L1
    master
    subgraph workers
      l1
      l1'
    end
  end
  subgraph L2s
    L2
    L2'
  end
  L2s --> |register|workers
  workers --> |request CID|L2s
  L2s --> |send CAR|workers
  client --> |request CID|workers
  workers --> |send CAR|client
```
```mermaid
sequenceDiagram
  participant Client
  participant l1s
  participant L2s
  L2s->>l1s: Register
  Client->>l1s: Request CID
  l1s->>L2s: Request CID
  L2s->>l1s: Send CAR
  l1s->>Client: Send CAR
```
The l1 worker that receives the client request needs to be the same one that receives the CAR file from the L2. But if an L2 makes a new HTTP request to deliver its response, the Node.js cluster module assigns that connection to a random l1 worker process.

We run into this problem because L2s, likely living in home networks, aren't directly reachable from the outside and can't act as servers. Therefore the L2 is the only side that can open new connections to the L1. This means the L1 can't act as an HTTP client, for example requesting data from the L2. The L1 can request data, but unless we figure out how the L2 can reply directly, the reply will hit the L1's load balancer and get assigned to a random l1, which didn't receive the original HTTP request from the client.
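A minimal sketch of the cluster behavior (port 3000 is a placeholder): every new connection to the shared port is distributed round-robin, so two requests from the same peer usually land on different workers.

```js
import cluster from 'node:cluster'
import http from 'node:http'
import os from 'node:os'

if (cluster.isPrimary) {
  // One worker per CPU core, all sharing port 3000.
  for (let i = 0; i < os.cpus().length; i++) cluster.fork()
} else {
  http.createServer((req, res) => {
    // Repeated requests print different worker ids: no affinity.
    res.end(`handled by worker ${cluster.worker.id}\n`)
  }).listen(3000)
}
```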
```mermaid
flowchart TB
  subgraph L1
    l1
    l1'
  end
  subgraph L2s
    L2
  end
  L2 --> |register|l1
  l1 --> |request CID|L2
  L2 --> |send CAR|l1'
  client --> |request CID|l1
  l1 --> |send CAR|client
  linkStyle 2 stroke:red
  linkStyle 4 stroke:red
```
Workers get their own port, so that the L2 can reach the right worker directly, not through the load balancer (sketch after the diagram).
```mermaid
flowchart TB
  subgraph L1
    l1
    l1'
  end
  subgraph L2s
    L2
  end
  L2 --> |register|l1
  l1 --> |request CID with port|L2
  L2 --> |send CAR to worker port|l1
  client --> |request CID|l1
  l1 --> |send CAR|client
  linkStyle 2 stroke:yellow
  linkStyle 1 stroke:yellow
  linkStyle 4 stroke:green
```
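A sketch of the per-worker port idea; the `BASE_PORT + worker id` scheme and the ports are assumptions for illustration:

```js
import cluster from 'node:cluster'
import http from 'node:http'

const BASE_PORT = 4000 // hypothetical

if (cluster.isPrimary) {
  for (let i = 0; i < 4; i++) cluster.fork()
} else {
  // Shared, load-balanced port for client requests.
  http.createServer((req, res) => {
    res.end(`client handled by worker ${cluster.worker.id}\n`)
  }).listen(3000)
  // Private per-worker port: the l1 would include it in its CID request,
  // so the L2 can POST the CAR straight back, bypassing load balancing.
  http.createServer((req, res) => {
    res.end(`CAR received by worker ${cluster.worker.id}\n`)
  }).listen(BASE_PORT + cluster.worker.id)
}
```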
Make nginx use sticky routing
```mermaid
flowchart TB
  subgraph L1
    l1
    l1'
  end
  subgraph L2s
    L2
  end
  L2 -->|register|l1
  l1 --> |request CID|L2
  L2 --> |send CAR|l1
  client --> |request CID|l1
  l1 --> |send CAR|client
```
If nginx uses the `hash` routing strategy, then we can match l1s and L2s by CID URL path segment. If there is a single TCP connection between l1 and L2, then there's no load balancing between the two, and the l1 that asks the L2 is also the one that gets the response.
If we replace the round-robin routing in the Node.js cluster module with our own logic, we can ensure that an L2 response always arrives at the right l1 by inspecting the request URL. A rough sketch follows.
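This is the classic sticky-routing pattern: the primary process peeks at the first bytes of each connection, extracts the CID from the request line, and hands the socket to the worker chosen by a hash of that CID. The CID extraction and the plain modulo hash are simplifying assumptions; real code would use consistent hashing and a proper parser.

```js
import cluster from 'node:cluster'
import net from 'node:net'
import http from 'node:http'
import { createHash } from 'node:crypto'

if (cluster.isPrimary) {
  const workers = [cluster.fork(), cluster.fork()]
  // Stand-in for consistent hashing.
  const workerForCid = cid =>
    workers[createHash('sha256').update(cid).digest()[0] % workers.length]

  net.createServer({ pauseOnConnect: true }, socket => {
    socket.once('data', head => {
      socket.pause()
      // e.g. "GET /ipfs/CID HTTP/1.1" or "POST /data/CID HTTP/1.1"
      const cid = head.toString().split(' ')[1]?.split('/')[2] ?? ''
      // Hand over the socket plus the bytes we already consumed.
      workerForCid(cid).send({ head: head.toString('latin1') }, socket)
    })
    socket.resume()
  }).listen(3000)
} else {
  const server = http.createServer((req, res) => {
    res.end(`handled by worker ${cluster.worker.id}\n`)
  })
  process.on('message', ({ head }, socket) => {
    // Re-inject the consumed bytes, then let the HTTP parser take over.
    socket.unshift(Buffer.from(head, 'latin1'))
    server.emit('connection', socket)
    socket.resume()
  })
}
```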
Use binary WebSockets, which allow for a persistent connection between L2 and l1, bypassing any load balancing.
The message framing:

- Registration (L2 -> l1): `0{L2ID}`, i.e. `0` plus an arbitrary-length L2 ID (UTF-8)
- CID request (l1 -> L2): `{requestId[36]}{cid}`, i.e. a 36-byte request ID (UTF-8) plus the CID (UTF-8)
- CAR chunk (L2 -> l1): `1{requestId[36]}{chunk?}`, i.e. `1` plus a 36-byte request ID (UTF-8) plus an optional chunk

If the chunk is empty, that means the CAR file has been served completely.
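A sketch of encoding and decoding these frames, assuming the type markers are the ASCII characters `0` and `1` and request IDs are 36-character UUID strings:

```js
const encodeRegister = l2Id =>
  Buffer.concat([Buffer.from('0'), Buffer.from(l2Id, 'utf-8')])

const encodeRequest = (requestId, cid) =>
  Buffer.concat([Buffer.from(requestId, 'utf-8'), Buffer.from(cid, 'utf-8')])

const encodeChunk = (requestId, chunk = Buffer.alloc(0)) =>
  Buffer.concat([Buffer.from('1'), Buffer.from(requestId, 'utf-8'), chunk])

const decodeChunk = buf => ({
  requestId: buf.subarray(1, 37).toString('utf-8'),
  chunk: buf.subarray(37), // empty: CAR served completely
})

// Example round trip:
const frame = encodeChunk('00000000-0000-0000-0000-000000000000')
console.log(decodeChunk(frame).chunk.length) // 0 -> transfer complete
```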
Unfortunately WebSockets are significantly slower than raw HTTP:
```
$ time node bench.mjs ws
node bench.mjs ws  5.96s user 0.66s system 104% cpu 6.329 total
$ time node bench.mjs http
node bench.mjs http  0.33s user 0.47s system 64% cpu 1.251 total
```
```js
// bench.mjs: stream ~1 GiB in 1 MiB chunks over a WebSocket or a plain
// HTTP response, depending on the first CLI argument ('ws' or 'http').
import WebSocket, { WebSocketServer } from 'ws'
import http from 'node:http'

const chunk1MB = Buffer.alloc(1024 * 1024).fill(1)

if (process.argv[2] === 'ws') {
  const wss = new WebSocketServer({ port: 3000 })
  wss.on('connection', ws => {
    ws.on('message', d => {
      // An empty message marks the end of the transfer.
      if (d.length === 0) {
        process.exit()
      }
    })
  })
  const ws = new WebSocket('ws://127.0.0.1:3000')
  ws.on('open', () => {
    let todo = 1024
    const send = () => {
      if (--todo > 0) {
        ws.send(chunk1MB)
        setTimeout(send, 0)
      } else {
        ws.send(Buffer.alloc(0))
      }
    }
    send()
  })
} else {
  const server = http.createServer((req, res) => {
    let todo = 1024
    const send = () => {
      if (--todo > 0) {
        res.write(chunk1MB)
        setTimeout(send, 0)
      } else {
        res.end()
      }
    }
    send()
  })
  server.listen(3000)
  http.get('http://127.0.0.1:3000', res => {
    res.on('data', () => {})
    res.on('end', () => {
      process.exit()
    })
  })
}
```
For reference, my attempt at a reverse HTTP implementation (ptth):
```js
// http.mjs: both roles in one process. The server plays the l1; once the
// L2's request arrives, it tries to reuse the same TCP socket to act as
// an HTTP client towards the L2.
import http from 'node:http'
import stream from 'node:stream'

const server = http.createServer((req, res) => {
  console.log('l1 received request')
  console.log('l1 tries turning into the client now')
  console.log('l1 request cid')
  req.socket.on('data', d => console.log('l1 receive', d.toString()))
  const cidReq = http.request({
    // Reuse the already-open socket instead of dialing a new connection.
    createConnection: () => req.socket,
    // createConnection: () => stream.Duplex.from({ writable: res, readable: req }),
    path: '/cid/CID'
  }, cidRes => {
    console.log('l1 received cid response')
  })
  cidReq.end()
})
server.listen(3000, () => {
  console.log('l1 make request')
  const req = http.request('http://localhost:3000', {
    method: 'POST'
  }, res => {
    console.log('l2 received response')
  })
  req.on('connect', () => {
    console.log('l2 established connection')
    console.log('l2 tries turning into the server now')
  })
  req.end()
})
```
```
$ node http.mjs
l1 make request
l1 received request
l1 tries turning into the client now
l1 request cid
node:events:505
      throw er; // Unhandled 'error' event
      ^

Error: Parse Error: Expected HTTP/
    at Socket.socketOnData (node:_http_client:494:22)
    at Socket.emit (node:events:527:28)
    at addChunk (node:internal/streams/readable:315:12)
    at readableAddChunk (node:internal/streams/readable:289:9)
    at Socket.Readable.push (node:internal/streams/readable:228:10)
    at TCP.onStreamRead (node:internal/stream_base_commons:190:23)
Emitted 'error' event on ClientRequest instance at:
    at Socket.socketOnData (node:_http_client:503:9)
    at Socket.emit (node:events:527:28)
    [... lines matching original stack trace ...]
    at TCP.onStreamRead (node:internal/stream_base_commons:190:23) {
  bytesParsed: 0,
  code: 'HPE_INVALID_CONSTANT',
  reason: 'Expected HTTP/',
  rawPacket: Buffer(64) [Uint8Array] [
     71,  69,  84,  32,  47,  99, 105, 100,  47,  67,  73,  68,
     32,  72,  84,  84,  80,  47,  49,  46,  49,  13,  10,  72,
    111, 115, 116,  58,  32, 108, 111,  99,  97, 108, 104, 111,
    115, 116,  58,  56,  48,  13,  10,  67, 111, 110, 110, 101,
     99, 116, 105, 111, 110,  58,  32,  99, 108, 111, 115, 101,
     13,  10,  13,  10
  ]
}
```
Suggestions by Miro:

Regarding the reverse HTTP code snippet - I think the client (L2) needs to start with an HTTP/1.1 Upgrade request. The server (L1) responds with HTTP status 101 to confirm the upgrade. Only after this initial exchange are you in control of the TCP connection and can start setting up reverse HTTP (see the sketch after the links below).

- https://developer.mozilla.org/en-US/docs/Web/HTTP/Protocol_upgrade_mechanism
- Protocol upgrade spec: https://datatracker.ietf.org/doc/html/rfc7230#section-6.7
- Registry of protocol names: https://datatracker.ietf.org/doc/html/rfc7230#section-8.6 and https://www.iana.org/assignments/http-upgrade-tokens/http-upgrade-tokens.xhtml
- WebSocket's use of protocol upgrade - for inspiration: https://datatracker.ietf.org/doc/html/rfc6455#section-1.2

The list of registered upgrade protocols is currently very short (basically HTTP, HTTPS, WebSocket), so it's a question whether nginx supports custom protocols.
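A sketch of this upgrade dance in Node.js, assuming a made-up `ptth` upgrade token: the L2 (client) asks for the upgrade, the l1 (server) confirms with 101, and from then on the raw socket can carry HTTP in the reverse direction.

```js
import http from 'node:http'

const server = http.createServer()
server.on('upgrade', (req, socket) => {
  // l1 confirms the upgrade; the socket is now ours.
  socket.write(
    'HTTP/1.1 101 Switching Protocols\r\n' +
    'Upgrade: ptth\r\n' +
    'Connection: Upgrade\r\n\r\n'
  )
  // The l1 now acts as the HTTP client on this connection.
  socket.write('GET /cid/CID HTTP/1.1\r\nHost: l2\r\n\r\n')
})
server.listen(3000, () => {
  const req = http.request('http://127.0.0.1:3000', {
    headers: { Connection: 'Upgrade', Upgrade: 'ptth' },
  })
  req.on('upgrade', (res, socket) => {
    // The L2 now acts as the server on this connection.
    socket.on('data', d => console.log('l2 received:\n' + d.toString()))
  })
  req.end()
})
```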
Now that we have solved the load balancing aspect of L2->l1 routing, another problem arises: each L2 is connected to exactly one l1. This means that an l1 doesn't always have a working connection to the L2 that the consistent hashing algorithm would select, and has no choice but to request the CID from the wrong L2, which will then also hydrate itself.
```mermaid
flowchart TB
  subgraph L1
    l1
    l1'
  end
  subgraph L2s
    L2
    L2'
  end
  L2 --> |register|l1
  L2' --> |register|l1'
  l1 --> |request CID|L2
  client --> |request CID|l1
  l1 --> |send CAR|client
  L2 --> |send CAR|l1
  linkStyle 2 stroke:red
  linkStyle 4 stroke:red
  linkStyle 5 stroke:red
```
Add a single proxy process that sits between all l1s and their connected L2s. This proxy can forward requests, selecting the right L2s to contact, and also forward responses to the right l1 (rough sketch after the diagram).
```mermaid
flowchart TB
  subgraph L1
    l1
    l1'
    proxy
  end
  subgraph L2s
    L2
    L2'
  end
  L2 --> |register|proxy
  L2' --> |register|proxy
  l1 --> |request CID|proxy
  proxy --> |request CID|L2
  client --> |request CID|l1
  l1 --> |send CAR|client
  L2 --> |send CAR|proxy
  proxy --> |send CAR|l1
```
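A rough sketch of such a proxy; all routes, the port, and the modulo hash are assumptions for illustration:

```js
import http from 'node:http'
import { createHash } from 'node:crypto'

const l2Streams = new Map()  // L2_ID -> long-lived registration response
const waitingL1s = new Map() // CID -> l1 response awaiting the CAR

// Stand-in for consistent hashing over the registered L2 IDs.
const pickL2 = cid => {
  const ids = [...l2Streams.keys()].sort()
  return ids[createHash('sha256').update(cid).digest()[0] % ids.length]
}

http.createServer((req, res) => {
  const [, route, id] = req.url.split('/')
  if (route === 'register') {
    // L2 -> proxy: keep the response open, push CID requests down it.
    l2Streams.set(id, res)
    res.on('close', () => l2Streams.delete(id))
  } else if (route === 'request') {
    // l1 -> proxy: remember who asked, forward to the right L2.
    waitingL1s.set(id, res)
    l2Streams.get(pickL2(id))?.write(JSON.stringify({ cid: id }) + '\n')
  } else if (route === 'data') {
    // L2 -> proxy: pipe the CAR upload back to the waiting l1.
    const l1 = waitingL1s.get(id)
    if (l1) req.pipe(l1)
    req.on('end', () => waitingL1s.delete(id))
  }
}).listen(4000)
```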
Route L1/L2 requests by CID path segment (`GET /ipfs/CID` & `POST /data/CID`), this way connecting the right l1s with L2s. This requires us to ditch the Node.js `cluster` approach and let nginx load balance between the l1s.

Downside: l1s need to synchronize with each other, so that all of them know of all connected L2s, and the l1 with the right connection can be asked to request a CID from the right L2.
```mermaid
flowchart TB
  subgraph L1
    l1
    l1'
  end
  subgraph L2s
    L2
    L2'
  end
  L2 --> |register|l1
  L2' --> |register|l1
  l1 --> |request CID|L2
  client --> |request CID|l1
  l1 --> |send CAR|client
  L2 --> |send CAR|l1
```
For reference, an nginx config that consistently hashes on a URL path segment:

```nginx
map $uri $myvar {
    default $uri;
    # pattern "Method1" + "/" + GUID + "/" + target parameter + "/" + HASH;
    "~*/Method1/(.*)/(.*)/(.*)$" $2;
    # pattern "Method2" + "/" + GUID + "/" + target parameter;
    "~*/Method2/(.*)/(.*)$" $2;
}

upstream backend {
    hash $myvar consistent;
    server s1:80;
    server s2:80;
}
```
If we switch from one Node.js process per CPU core to a single Node.js HTTP process, then we don't need a routing layer for l1s at all. Incoming HTTP requests don't need to be routed to the correct l1, and L2 responses don't need to be routed to the correct l1 either. This also removes the need for persistent connections between l1s and L2s, so we can go back to the NDJSON + HTTP POST protocol.

Downside: if we need to distribute workload across multiple CPU cores, we need to look into strategies like worker pools with shared array buffers, on top of which streaming data verification (and more work) is implemented. For now, this shouldn't be a concern though (sketch below).
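A minimal worker-pool sketch with a `SharedArrayBuffer`; the inline worker and the byte-sum "verification" are placeholders for real streaming CAR verification:

```js
import { Worker } from 'node:worker_threads'

// Inline worker (eval: true) to keep the sketch self-contained.
const worker = new Worker(`
  const { parentPort } = require('node:worker_threads')
  parentPort.on('message', ({ shared, length }) => {
    const view = new Uint8Array(shared, 0, length)
    let sum = 0 // placeholder "verification": sum the bytes
    for (const b of view) sum += b
    parentPort.postMessage({ sum })
  })
`, { eval: true })

// The HTTP process writes chunks into shared memory; no copies are posted.
const shared = new SharedArrayBuffer(1024 * 1024)
new Uint8Array(shared).fill(1)
worker.postMessage({ shared, length: 1024 })
worker.on('message', ({ sum }) => {
  console.log('worker checked chunk, sum =', sum) // 1024
  worker.terminate()
})
```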
The registration stream then looks like this; the L1 pushes one JSON line per CID request, plus empty-line heartbeats:

```
GET https://L1_URL/register/L2_ID

{"requestId":"REQUEST_ID","cid":"CID"}\n
{"requestId":"REQUEST_ID","cid":"CID"}\n
{"requestId":"REQUEST_ID","cid":"CID"}\n
\n <- heartbeat
...
```

The `requestId` is created for the client connection to the L1 and passed down to the L2. It is attached by L1 and L2 to the log lines created as part of the request/response cycle, for ease of debugging.
The L2 then delivers the CAR with:

```
POST https://L1_URL/data/CID
```
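A sketch of the L2 side of this protocol; `L1_URL`, `L2_ID`, and `carStream` are placeholders:

```js
import http from 'node:http'
import readline from 'node:readline'
import { Readable } from 'node:stream'

const L1_URL = 'http://127.0.0.1:3000' // placeholder
const L2_ID = 'my-l2-id' // placeholder

// Stand-in for however the L2 actually produces a CAR stream.
const carStream = cid => Readable.from([Buffer.from(`CAR for ${cid}`)])

http.get(`${L1_URL}/register/${L2_ID}`, res => {
  const lines = readline.createInterface({ input: res })
  lines.on('line', line => {
    if (line === '') return // heartbeat, ignore
    const { requestId, cid } = JSON.parse(line)
    console.log(`[${requestId}] uploading ${cid}`)
    const post = http.request(`${L1_URL}/data/${cid}`, { method: 'POST' })
    carStream(cid).pipe(post)
  })
})
```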
```mermaid
flowchart TB
  subgraph L1
    l1
  end
  subgraph L2s
    L2
    L2'
  end
  L2 --> |register|l1
  L2' --> |register|l1
  l1 --> |request CID|L2
  client --> |request CID|l1
  l1 --> |send CAR|client
  L2 --> |send CAR|l1
```
Registration can also be a POST that streams in both directions: the L2 sends health heartbeats up in the request body (`>`), while the L1 sends CID requests down in the response body (`<`):

```
POST https://L1_URL/register/L2_ID

> {"ok":true,"cpu":0.8,"memory":0.2}\n
< {"requestId":"REQUEST_ID","cid":"CID"}\n
< {"requestId":"REQUEST_ID","cid":"CID"}\n
< {"requestId":"REQUEST_ID","cid":"CID"}\n
```
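A sketch of the L2 side of this variant; the URL, interval, and load numbers are placeholders:

```js
import http from 'node:http'
import readline from 'node:readline'

const req = http.request('http://127.0.0.1:3000/register/my-l2-id', {
  method: 'POST',
  headers: { 'content-type': 'application/x-ndjson' },
}, res => {
  // CID requests stream down in the response body.
  const lines = readline.createInterface({ input: res })
  lines.on('line', line => {
    const { requestId, cid } = JSON.parse(line)
    console.log(`[${requestId}] L1 wants ${cid}`)
  })
})
// Health heartbeats stream up in the request body.
setInterval(() => {
  req.write(JSON.stringify({ ok: true, cpu: 0.8, memory: 0.2 }) + '\n')
}, 5000)
```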