This is where the data is actually retrieved: https://github.com/ipfs/js-ipfs-unixfs-engine/blob/7992ad9860360da3f7a9fb0639e4a7a67746057f/src/exporter/file.js#L117
Where the Wantlist (the block that IPLD requested) is sent off: https://github.com/ipfs/js-ipfs-bitswap/blob/51f5ce08bad4876c9b709eba27faac533e9c00d4/src/want-manager/index.js#L125
Next step: find where the wantlist is received.
The Decision Engine gets the Wantlist and then sends off the corresponding blocks. Here it gets the blocks it should send from the blockstore: https://github.com/ipfs/js-ipfs-bitswap/blob/51f5ce08bad4876c9b709eba27faac533e9c00d4/src/decision-engine/index.js#L99
The real bottleneck is the message size: https://github.com/ipfs/js-ipfs-bitswap/blob/51f5ce08bad4876c9b709eba27faac533e9c00d4/src/decision-engine/index.js#L21 The bigger the messages, the faster things get. So it is kind of the back and forth, but GraphSync won't help here.
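To make the message-size point concrete, here is a minimal sketch of how a size limit turns a list of blocks into a number of messages. The helper `batchBlocks` and the limits used are made up for illustration and are not taken from js-ipfs-bitswap.

```javascript
// Hypothetical helper: pack blocks into messages without exceeding
// maxMessageSize bytes per message. A block that is larger than the
// limit still gets a message of its own.
function batchBlocks (blocks, maxMessageSize) {
  const messages = []
  let current = []
  let currentSize = 0
  for (const block of blocks) {
    if (current.length > 0 && currentSize + block.length > maxMessageSize) {
      messages.push(current)
      current = []
      currentSize = 0
    }
    current.push(block)
    currentSize += block.length
  }
  if (current.length > 0) messages.push(current)
  return messages
}

// Ten blocks of 256 bytes each (illustrative sizes)
const sampleBlocks = Array.from({ length: 10 }, () => new Uint8Array(256))
console.log(batchBlocks(sampleBlocks, 512).length) // 5 messages
console.log(batchBlocks(sampleBlocks, 2048).length) // 2 messages
```

With the larger limit the same blocks need fewer messages, which is the effect described above.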
Here we get the root of the file. The item at
https://github.com/ipfs/js-ipfs-unixfs-engine/blob/7992ad9860360da3f7a9fb0639e4a7a67746057f/src/exporter/resolve.js#L37
is the root node, which was specified here:
https://github.com/ipfs/js-ipfs-unixfs-engine/blob/7992ad9860360da3f7a9fb0639e4a7a67746057f/src/exporter/index.js#L58
The request needs to block until all data of the subtree is retrieved. Currently Bitswap requests one block and can block until it is retrieved, as there is a notification once that block has arrived. This doesn't work with GraphSync, as we don't know beforehand which blocks might arrive.
Questions:
- How do you know that the last item was retrieved?
- Possible solution: send a "done" message
- How does such a stream of data relate to pubsub?
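The "done" message idea could look roughly like the sketch below. The message shapes (`{ type: 'block', cid, data }` and `{ type: 'done' }`) and the function `createSubtreeRequest` are assumptions made for illustration; nothing like this is defined by GraphSync or Bitswap yet.

```javascript
// Hypothetical collector for one subtree request. Blocks are gathered as
// they arrive; the request only completes when the sender says "done",
// because the receiver cannot know in advance which blocks to expect.
function createSubtreeRequest (onDone) {
  const blocks = new Map()
  let finished = false
  return function onMessage (message) {
    if (finished) return // ignore anything that arrives after "done"
    if (message.type === 'block') {
      blocks.set(message.cid, message.data)
    } else if (message.type === 'done') {
      finished = true
      onDone(blocks)
    }
  }
}

let received = null
const onMessage = createSubtreeRequest((blocks) => { received = blocks })
onMessage({ type: 'block', cid: 'cid-a', data: 'a' })
onMessage({ type: 'block', cid: 'cid-b', data: 'b' })
console.log(received === null) // true: the request is still waiting
onMessage({ type: 'done' })
console.log(received.size) // 2
```

A callback is used instead of a promise only to keep the sketch synchronous; the same idea works with either.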
At the moment Bitswap first stores the retrieved data in its blockstore before it is processed any further.
Questions:
- Does this make sense for GraphSync as well?
In UnixFS you can retrieve files starting from a certain offset. This can be optimized during graph traversal. For this traversal the size field is checked/accumulated. This can easily be expressed in code, but it's hard to do in a formal language (it will be code in the end).
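As a rough illustration of why this is easy in code, the sketch below walks a tree and skips every subtree that lies entirely before the offset. The node shape (leaves as `{ id, size }`, inner nodes as `{ size, links }`) and the function `leavesFromOffset` are simplifications assumed here, not the actual UnixFS data structures.

```javascript
// Hypothetical node shape: a leaf is { id, size }, an inner node is
// { size, links: [...] }, where size is the number of file bytes below it.
// Returns the leaves that must be fetched and how many bytes to skip in each.
function leavesFromOffset (node, offset, result = []) {
  if (!node.links) {
    result.push({ id: node.id, skip: offset })
    return result
  }
  let remaining = offset
  for (const link of node.links) {
    if (remaining >= link.size) {
      // The whole subtree lies before the offset: skip it without fetching
      remaining -= link.size
      continue
    }
    leavesFromOffset(link, remaining, result)
    remaining = 0 // everything after the first hit is needed in full
  }
  return result
}

const fileRoot = {
  size: 40,
  links: [
    { size: 20, links: [{ id: 'a1', size: 10 }, { id: 'a2', size: 10 }] },
    { size: 20, links: [{ id: 'b1', size: 10 }, { id: 'b2', size: 10 }] }
  ]
}
console.log(leavesFromOffset(fileRoot, 25))
// [ { id: 'b1', skip: 5 }, { id: 'b2', skip: 0 } ]
```

The accumulated `remaining` counter is the part that is awkward to express in a purely declarative selector.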
Question:
- Should there be support for dynamic code-based traversals in GraphSync?
- (@vmx) I lean towards having pluggable "traversal modes" which are selected by transmitting a certain flag. They will be hard-coded in every implementation of GraphSync. Once we have IPLD M2, that can probably be used instead.
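A minimal sketch of what such pluggable traversal modes might look like, assuming the flag is simply a mode name. The mode names (`all`, `firstChildOnly`), the node shape and the function `traverse` are invented for illustration only.

```javascript
// Hypothetical registry of traversal modes. Each mode decides which links
// of a node to follow; the set of modes is hard-coded per implementation.
const traversalModes = {
  all: (node) => node.links || [],
  firstChildOnly: (node) => (node.links || []).slice(0, 1)
}

// Depth-first traversal driven by the mode named in the request flag
function traverse (root, mode) {
  const selectLinks = traversalModes[mode]
  if (!selectLinks) throw new Error(`Unknown traversal mode: ${mode}`)
  const visited = []
  const stack = [root]
  while (stack.length > 0) {
    const node = stack.pop()
    visited.push(node.id)
    const links = selectLinks(node)
    // Push in reverse so that the first link is visited first
    for (let i = links.length - 1; i >= 0; i--) stack.push(links[i])
  }
  return visited
}

const tree = {
  id: 'root',
  links: [{ id: 'a', links: [{ id: 'a1' }, { id: 'a2' }] }, { id: 'b' }]
}
console.log(traverse(tree, 'all')) // [ 'root', 'a', 'a1', 'a2', 'b' ]
console.log(traverse(tree, 'firstChildOnly')) // [ 'root', 'a', 'a1' ]
```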