Q: Eric made the comment that it's a negative that the current ffmpeg integration wastes disk space by writing segments to disk. I wanted to open a conversation about this.
- Is it really a negative? Do we need segments to be temporarily remembered prior to the claim/verify loop in our protocol because we need to provide evidence of them later in the verify() transaction? I suppose we could instead just write metadata to disk, such as the hashes/signature that encompass the transcode receipt https://github.com/livepeer/wiki/blob/a117e9590efe4e7c9d986b2579c5ca02b11c54da/SPEC.md#transcode-receipt
- Also, we have the out of memory issue on the livepeer node because we're currently storing references to the data segments themselves within data structures, and never freeing these references. livepeer/go-livepeer#227. Should we instead be just storing references to the segments in memory, and writing the segments to disk? Or use some sliding buffer which caches a few segments in memory for serving, but writes older ones to disk?
- Keep in mind that even if we don't need the segments on disk for the protocol, we may need them all to be written to disk for saving the video for VOD use cases after the stream is completed.
Response
In-memory is faster, on-disk makes additional tooling simpler.
So whether to write segments to disk is actually the tip of a really meaty issue. This touches a bit on my previous emails with Eric about verifying the content that the broadcaster itself is sending out (eg, in the case of a broadcaster griefing a transcoder or verifier). I may be making a number of assumptions here, so please correct any (mis-)understanding. Been reading the whitepaper and spec, but haven't actually dived into any of the contracts yet to see what the code is doing in practice.
Transcoders and verifiers need to get the segments from somewhere. Is the broadcaster node the best place for them to fetch that information? Or should the broadcaster instead seed a relay node with the original content? Broadcasters/transcoders could optionally act as a relay, which would be logically equivalent to seeding a remote relay.
Some reasons for seeding a relay node:
- There may be dozens of transcoders (different combinations of bitrate, resolution, codec, etc quickly enlarge the output space) and at least as many verifiers, probably a multiple of 2-3 at a minimum [1].
- Physical network capacity is the one resource that is both difficult to control, and critical to maintaining a good streaming experience. Unlike transcoders which can allocate compute capacity deterministically, nodes are much more prone to behaving non-optimally (even if non-maliciously) when trying to push time-sensitive media onto the network.
- Broadcasters and transcoders may have limited upstream bandwidth, or otherwise prefer to dedicate resources towards preserving the quality of the output stream, rather than serving ancillary requests, eg from verifiers or additional transcoders [2].
- Transcoders and verifiers probably should be pulling from the same (independent and replicated) metadata sources rather than trusting the broadcaster alone to supply the metadata. Presumably this is the blockchain.
- VOD review of the original stream even when the original broadcaster is offline. Unlike transcoders or relays, broadcasters aren't expected to be available on the network continuously.
All that being said, my inclination is to send new segments to a relay node immediately and have the Livepeer network pick them up from there. So the segments don't need to be on disk.
However, it doesn't hurt to offer the option of a local copy of the stream, especially for the broadcaster. Lots of non-protocol reasons that would be desirable -- ease of tooling/post-processing, archival, failure recovery, regulatory, etc. This needs to have a number of knobs to tune -- we don't want 24/7 streams to fill up a broadcaster's disk, for example.
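To make that concrete, here's a rough Go sketch of the shape I have in mind -- every name here (Segment, Relay, Broadcaster) is made up for illustration, not the actual go-livepeer or LPMS API. The segmenter hands each new segment to an upload goroutine over a bounded channel and drops its own reference, and an optional size-capped local copy gives the broadcaster an archive without letting a 24/7 stream fill the disk:

```go
package broadcast

import (
	"fmt"
	"os"
	"path/filepath"
)

// Segment is a placeholder for one media segment plus whatever protocol
// metadata travels with it. Field names are illustrative only.
type Segment struct {
	SeqNo int64
	Data  []byte
}

// Relay stands in for whatever client pushes segments onto the Livepeer
// network (or onto a dedicated relay node).
type Relay interface {
	Upload(seg *Segment) error
}

// Broadcaster hands each new segment to the relay without retaining it,
// and optionally keeps a size-capped rotating copy on local disk.
type Broadcaster struct {
	relay     Relay
	out       chan *Segment
	localDir  string // "" disables the local copy
	maxOnDisk int    // rotate after this many local segments
	onDisk    []string
}

func NewBroadcaster(r Relay, localDir string, maxOnDisk int) *Broadcaster {
	b := &Broadcaster{relay: r, out: make(chan *Segment, 4), localDir: localDir, maxOnDisk: maxOnDisk}
	go b.uploadLoop()
	return b
}

// Publish hands a freshly segmented chunk off; the caller drops its reference.
func (b *Broadcaster) Publish(seg *Segment) {
	b.out <- seg
}

func (b *Broadcaster) uploadLoop() {
	for seg := range b.out {
		if err := b.relay.Upload(seg); err != nil {
			fmt.Fprintf(os.Stderr, "upload of segment %d failed: %v\n", seg.SeqNo, err)
		}
		if b.localDir != "" {
			b.writeLocal(seg)
		}
		// Nothing else holds a reference to seg after this iteration,
		// so the media bytes can be garbage collected.
	}
}

// writeLocal keeps a rotating window of segments on disk so that a 24/7
// stream can never fill the broadcaster's disk.
func (b *Broadcaster) writeLocal(seg *Segment) {
	name := filepath.Join(b.localDir, fmt.Sprintf("seg_%d.ts", seg.SeqNo))
	if err := os.WriteFile(name, seg.Data, 0644); err != nil {
		return
	}
	b.onDisk = append(b.onDisk, name)
	for len(b.onDisk) > b.maxOnDisk {
		os.Remove(b.onDisk[0])
		b.onDisk = b.onDisk[1:]
	}
}
```

The "knobs" would mostly live around writeLocal: how many segments to retain, whether to retain any at all, and where to put them.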
To address some specific points:
Do we need segments to be temporarily remembered prior to the claim/verify loop in our protocol because we need to provide evidence of them later in the verify() transaction? I suppose we could instead just write metadata to disk, such as the hashes/signature that encompass the transcode receipt
If the 'disk' is the blockchain, then the sender only needs to retain the metadata for as long as it is unconfirmed. There is the question of how to ensure sufficient replication of the segments themselves, and how to incentivize hosting low-traffic streams. Verifying a peer's possession of a segment using a random oracle could work. Past that point, the sender doesn't absolutely need to be concerned with storing the segments locally, whether in memory or on disk.
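As a strawman for that possession check (purely a sketch of one way to read "random oracle" here, not anything in the spec): before the sender drops its local copy, it challenges a peer with a fresh random nonce, and the peer answers with a hash over the nonce plus the segment bytes, which only someone actually holding the segment can produce.

```go
package possession

import (
	"bytes"
	"crypto/rand"
	"crypto/sha256"
)

// Challenge generates a fresh random nonce the sender gives to a peer
// that claims to hold a segment.
func Challenge() ([]byte, error) {
	nonce := make([]byte, 32)
	_, err := rand.Read(nonce)
	return nonce, err
}

// Respond is what the peer computes: a hash over the nonce plus the
// segment bytes. Without the actual segment data it cannot produce this.
func Respond(nonce, segment []byte) []byte {
	h := sha256.New()
	h.Write(nonce)
	h.Write(segment)
	return h.Sum(nil)
}

// Verify is run by the sender, which still holds the segment locally.
// Once enough peers answer correctly, the sender can drop its copy.
func Verify(nonce, segment, response []byte) bool {
	return bytes.Equal(Respond(nonce, segment), response)
}
```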
Also, we have the out of memory issue on the livepeer node because we're currently storing references to the data segments themselves within data structures, and never freeing these references. livepeer/go-livepeer#227. Should we instead be just storing references to the segments in memory, and writing the segments to disk? Or use some sliding buffer which caches a few segments in memory for serving, but writes older ones to disk?
Once the original (non-transcoded) segments are on the Livepeer network along with the metadata, the sender doesn't need to retain references to them. Again, this assumes that segments are sent to a relay immediately. This should also make keeping the media content in memory affordable. Metadata is also cheap to store, especially since it only needs to be kept until it is confirmed on the blockchain and can be re-computed from saved segments; thereafter it can be looked up on the chain at any time.
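To illustrate the "cheap to store" point, a minimal sketch with invented names -- the Receipt fields are a rough stand-in for the transcode receipt in the spec, not its actual schema. The node keeps only the small per-segment metadata in a map while the claim is unconfirmed, and deletes the entry once the claim lands on chain, so neither media bytes nor unbounded metadata accumulate:

```go
package claims

import "sync"

// Receipt holds only the metadata the claim/verify loop needs -- roughly
// the hashes and signature of the transcode receipt, not the media
// itself. Field names are illustrative.
type Receipt struct {
	SeqNo          int64
	DataHash       []byte
	TranscodedHash []byte
	BroadcasterSig []byte
}

// PendingClaims retains receipts only while they are unconfirmed on chain.
type PendingClaims struct {
	mu      sync.Mutex
	pending map[int64]Receipt
}

func NewPendingClaims() *PendingClaims {
	return &PendingClaims{pending: make(map[int64]Receipt)}
}

// Add records a receipt when the segment is handed to the network.
func (p *PendingClaims) Add(r Receipt) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.pending[r.SeqNo] = r
}

// Confirm drops the receipt once the claim is confirmed; after that the
// metadata can be looked up on chain or re-computed from a saved
// segment, so there is nothing left to leak.
func (p *PendingClaims) Confirm(seqNo int64) {
	p.mu.Lock()
	defer p.mu.Unlock()
	delete(p.pending, seqNo)
}
```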
However, writing to disk may make tooling simpler. It allows more flexibility in the pipeline; eg the uploader can be shared between the broadcaster (segmenter) and the transcoder. Doing this in-memory with a good API would be faster, but possibly more work. Speed may be moot anyway, since segmented streams inherently have latencies of seconds to minutes. Have to think more about how it'd be implemented before making a definite recommendation one way or the other.
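One way the shared uploader could fall out naturally -- again just a sketch under my assumptions, not the current LPMS design: both the segmenter and the transcoder write into a small sink interface, and whether that sink is backed by memory or by disk becomes a configuration choice rather than a structural one.

```go
package pipeline

import (
	"fmt"
	"os"
	"path/filepath"
)

// SegmentSink is the one seam the segmenter and the transcoder both
// write into; the uploader reads from it. Whether segments live in
// memory or on disk is hidden behind this interface.
type SegmentSink interface {
	Put(seqNo int64, data []byte) error
	Get(seqNo int64) ([]byte, error)
}

// MemorySink keeps segments in memory -- fastest, but bounded by RAM.
// (Not concurrency-safe; a real version would need locking or eviction.)
type MemorySink struct {
	segs map[int64][]byte
}

func NewMemorySink() *MemorySink { return &MemorySink{segs: make(map[int64][]byte)} }

func (m *MemorySink) Put(seqNo int64, data []byte) error {
	m.segs[seqNo] = data
	return nil
}

func (m *MemorySink) Get(seqNo int64) ([]byte, error) {
	d, ok := m.segs[seqNo]
	if !ok {
		return nil, fmt.Errorf("segment %d not found", seqNo)
	}
	return d, nil
}

// DiskSink writes segments to a directory -- slower, but external
// tooling (ffmpeg, archival, post-processing) can pick them up directly.
type DiskSink struct {
	Dir string
}

func (d *DiskSink) path(seqNo int64) string {
	return filepath.Join(d.Dir, fmt.Sprintf("seg_%d.ts", seqNo))
}

func (d *DiskSink) Put(seqNo int64, data []byte) error {
	return os.WriteFile(d.path(seqNo), data, 0644)
}

func (d *DiskSink) Get(seqNo int64) ([]byte, error) {
	return os.ReadFile(d.path(seqNo))
}
```

A disk-backed sink is also what makes external tooling trivially composable, since the segments are just files in a directory; the in-memory variant keeps latency down when that matters more.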
I need to get some other work done here right now, but will draft a spec today for the libav/ffmpeg and LPMS integration.
[1] This is another issue of economic incentives related to having dozens of transcoders. A viewer may be better served by a certain transcoding combination that the broadcaster doesn't (currently) offer. In a sense, additional transcoding combinations are a tax on the broadcaster. This is just another economic tradeoff for the broadcaster to make -- how wide of an audience to reach, versus what they are willing to spend. Some ideas: a feedback or voting/request mechanism for a new transcoding combination (to take effect in the next round), which the broadcaster can authorize (or de-authorize) based on its calculation of the marginal cost of each additional viewer that combination would attract. Of course, there's lots of nuance here, so that's an entirely different topic for later.
[2] This brings up an unrelated attack where relay nodes are actually disincentivized to share an honest list of peers. This is even more of a problem if the broadcaster and relay are colluding (or are the same node). Had some thoughts about looking at reputation as an ensemble of the nodes you're peered with.