Here is an implementation proposal for an S3-based object store, intended as an improvement over our current p2p relay-based video delivery approach.
Object Store in the broadcaster
- Implement the object store broadcaster as `S3Broadcaster` in the networking interface (instead of `BasicBroadcaster`).
- When the broadcaster creates a job, it waits until a transcoder is assigned. At that point, the broadcaster asks the transcoder for an "object store location" via an HTTP request.
- The broadcaster gets the write destination (S3 bucket and sub-directory) and credentials from the transcoder.
- Need to research S3 permissioning.
- `S3Broadcaster` writes the following data into the sub-directory (see the sketch after this list):
  - the segment ts file
  - the segment signature as {segment_name}.sig
  - segment metadata (SeqNo and Duration) as {segment_name}.json
  - the playlist as {stream_id}.m3u8. For now, try always updating the same file in S3 and see if the propagation works (not sure how quickly S3 propagates updates).
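For concreteness, here is a rough sketch of what those writes could look like, assuming the AWS SDK for Go; the `ObjectStoreLocation` fields, struct names, and `WriteSegment` signature are placeholders rather than a settled interface:

```go
package broadcaster

import (
	"bytes"
	"encoding/json"
	"fmt"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/s3"
)

// ObjectStoreLocation is a hypothetical shape for what the transcoder
// returns from the "object store location" request; real field names TBD.
type ObjectStoreLocation struct {
	Bucket string `json:"bucket"`
	Prefix string `json:"prefix"` // sub-directory for this stream
	// credentials omitted; depends on the S3 permissioning research
}

// SegmentMeta mirrors the {segment_name}.json metadata described above.
type SegmentMeta struct {
	SeqNo    uint64  `json:"seqNo"`
	Duration float64 `json:"duration"`
}

type S3Broadcaster struct {
	svc *s3.S3
	loc ObjectStoreLocation
}

func (b *S3Broadcaster) put(key string, data []byte) error {
	_, err := b.svc.PutObject(&s3.PutObjectInput{
		Bucket: aws.String(b.loc.Bucket),
		Key:    aws.String(b.loc.Prefix + "/" + key),
		Body:   bytes.NewReader(data),
	})
	return err
}

// WriteSegment uploads the ts data, signature, and metadata, then overwrites
// the single {stream_id}.m3u8 playlist object last so it never lists a
// segment that is not yet in the bucket.
func (b *S3Broadcaster) WriteSegment(streamID, name string, seg, sig []byte, meta SegmentMeta, playlist []byte) error {
	metaJSON, err := json.Marshal(meta)
	if err != nil {
		return err
	}
	writes := []struct {
		key  string
		data []byte
	}{
		{name, seg},                    // segment ts file
		{name + ".sig", sig},           // segment signature
		{name + ".json", metaJSON},     // SeqNo and Duration
		{streamID + ".m3u8", playlist}, // always update the same playlist object
	}
	for _, w := range writes {
		if err := b.put(w.key, w.data); err != nil {
			return fmt.Errorf("s3 write %s: %v", w.key, err)
		}
	}
	return nil
}
```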
Object Store in the transcoder
- Implement the object store subscriber as `S3Subscriber` (instead of `BasicSubscriber`).
- If a sub-directory with the StreamID exists in the S3 bucket and it contains a playlist, assume the stream exists.
- `S3Subscriber` reads the ts file, sig file, and metadata file to recreate `core.SignedSegment` (see the sketch below).
- `S3Subscriber` consumes the video the same way an HLS video is consumed (check the playlist first, then load the segments).
- `S3Subscriber` plugs nicely into VideoCache, just like `BasicSubscriber` (no change in workflow).
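A minimal sketch of the reconstruction step; `getObject` and the local `signedSegment` type are stand-ins for illustration (the actual `core.SignedSegment` fields may differ):

```go
package subscriber

import (
	"encoding/json"
	"fmt"
)

// SegmentMeta mirrors the {segment_name}.json file (SeqNo and Duration).
type SegmentMeta struct {
	SeqNo    uint64  `json:"seqNo"`
	Duration float64 `json:"duration"`
}

// signedSegment stands in for core.SignedSegment here; the real type's
// fields may differ, this is just the data the three S3 objects provide.
type signedSegment struct {
	Name string
	Data []byte
	Sig  []byte
	Meta SegmentMeta
}

// S3Subscriber only needs a way to fetch an object by key for this sketch.
type S3Subscriber struct {
	getObject func(key string) ([]byte, error) // e.g. wraps s3.GetObject
}

// loadSegment rebuilds a signed segment from the ts, .sig, and .json objects
// written by the broadcaster under the stream's sub-directory.
func (s *S3Subscriber) loadSegment(prefix, name string) (*signedSegment, error) {
	data, err := s.getObject(prefix + "/" + name)
	if err != nil {
		return nil, fmt.Errorf("segment: %v", err)
	}
	sig, err := s.getObject(prefix + "/" + name + ".sig")
	if err != nil {
		return nil, fmt.Errorf("sig: %v", err)
	}
	metaRaw, err := s.getObject(prefix + "/" + name + ".json")
	if err != nil {
		return nil, fmt.Errorf("metadata: %v", err)
	}
	var meta SegmentMeta
	if err := json.Unmarshal(metaRaw, &meta); err != nil {
		return nil, fmt.Errorf("metadata parse: %v", err)
	}
	return &signedSegment{Name: name, Data: data, Sig: sig, Meta: meta}, nil
}
```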
- When the transcoder gets a job (a sketch of the handler follows this list):
  - It waits for the broadcaster to send an HTTP request with the StreamID.
  - When it receives the request, it checks its local storage to make sure the StreamID is correct.
  - If the StreamID is correct, it creates a sub-directory in its S3 bucket for the original stream.
  - It creates sub-directories for the transcoded streams.
  - It sends the master playlist and the connection information for the original-stream sub-directory back to the broadcaster.
  - It starts consuming the HLS video from the stream and starts transcoding.
  - It makes sure all of the transcoded segments for a particular original segment are inserted at the same time.
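One possible shape for the transcoder's side of this handshake, written as an HTTP handler; the payload fields, prefix layout, and helper methods are all assumptions pending the real interface:

```go
package transcoder

import (
	"encoding/json"
	"net/http"
)

// locationRequest / locationResponse are hypothetical payloads for the
// broadcaster -> transcoder "object store location" request.
type locationRequest struct {
	StreamID string `json:"streamId"`
}

type locationResponse struct {
	Bucket         string `json:"bucket"`
	OriginalPrefix string `json:"originalPrefix"` // sub-directory for the original stream
	MasterPlaylist string `json:"masterPlaylist"`
	// credentials / scoped tokens TBD, pending the S3 permissioning research
}

// Transcoder is pared down to what this sketch needs; the real type also
// holds the S3 client, job state, transcoding profiles, etc.
type Transcoder struct {
	bucket string
	jobs   map[string]bool // StreamID -> assigned to us
}

func (t *Transcoder) hasJob(streamID string) bool { return t.jobs[streamID] }

// createPrefixes "creates" the sub-directories. S3 has no real directories,
// so in practice this may only need to record the prefixes locally.
func (t *Transcoder) createPrefixes(prefixes []string) error { return nil }

func (t *Transcoder) renditionPrefixes(streamID string) []string {
	// one prefix per transcoded rendition; names are placeholders
	return []string{streamID + "/720p", streamID + "/240p"}
}

func (t *Transcoder) masterPlaylist(streamID string) string {
	return "#EXTM3U\n" // built from the rendition prefixes in the real flow
}

// handleObjectStoreLocation verifies the StreamID against the assigned job,
// creates the sub-directories, and replies with the master playlist plus the
// write location for the original stream.
func (t *Transcoder) handleObjectStoreLocation(w http.ResponseWriter, r *http.Request) {
	var req locationRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}
	if !t.hasJob(req.StreamID) { // check local storage that the StreamID is correct
		http.Error(w, "unknown stream", http.StatusForbidden)
		return
	}
	origPrefix := req.StreamID + "/source"
	if err := t.createPrefixes(append([]string{origPrefix}, t.renditionPrefixes(req.StreamID)...)); err != nil {
		http.Error(w, "object store error", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(locationResponse{
		Bucket:         t.bucket,
		OriginalPrefix: origPrefix,
		MasterPlaylist: t.masterPlaylist(req.StreamID),
	})
	// From here the transcoder starts consuming the HLS stream via
	// S3Subscriber and writes all renditions of a source segment together.
}
```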
Object Store in the network
- When the stream is written into the object store, it is technically available to the public. However, the S3 bucket location is not well-known.
- Any node in the network can ask for the S3 bucket location of a specific StreamID. This can be a message that floods the network via a relay workflow, similar to how requests are relayed right now. The request eventually reaches the broadcaster, which signs the response and sends it back. The recipient of the message checks the signature (see the sketch below).
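A sketch of what the signed location exchange could look like, assuming nodes sign with an ECDSA key (go-ethereum's crypto helpers are used here only for illustration); the message struct and field set are placeholders:

```go
package network

import (
	"crypto/ecdsa"

	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/crypto"
)

// S3LocationResponse is a hypothetical relay message: the broadcaster's
// answer to "where is the object store for this StreamID?".
type S3LocationResponse struct {
	StreamID string
	Bucket   string
	Prefix   string
	Sig      []byte // broadcaster's signature over the fields above
}

func locationDigest(streamID, bucket, prefix string) []byte {
	return crypto.Keccak256([]byte(streamID), []byte(bucket), []byte(prefix))
}

// Sign is run by the broadcaster before relaying the response back.
func (m *S3LocationResponse) Sign(key *ecdsa.PrivateKey) error {
	sig, err := crypto.Sign(locationDigest(m.StreamID, m.Bucket, m.Prefix), key)
	if err != nil {
		return err
	}
	m.Sig = sig
	return nil
}

// Verify is run by the requesting node; broadcaster is the address the
// requester already associates with this StreamID.
func (m *S3LocationResponse) Verify(broadcaster common.Address) bool {
	pub, err := crypto.SigToPub(locationDigest(m.StreamID, m.Bucket, m.Prefix), m.Sig)
	if err != nil {
		return false
	}
	return crypto.PubkeyToAddress(*pub) == broadcaster
}
```

This assumes the requester already knows which address to expect for the StreamID (e.g. from the job), which is what makes the signature check meaningful.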
If I'm reading this correctly, this is an inherently pull-based workflow, where the transcoder is now polling the S3 manifest for new segments indefinitely. In addition to the tradeoff between latency and repeated requests (new segments won't always be available exactly at 4s intervals, but we don't want to wait too long either), the broadcaster has to do another S3 request to update the manifest in addition to the segment. We'd still have our dilemma with the current workflow: what's the best course of action if a broadcaster goes offline, or is otherwise sporadic, delayed, etc?
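For concreteness, the pull loop this implies looks roughly like the following (the helper types are made up); the poll interval is exactly the latency-vs-request-count knob, and the loop has no good answer for a broadcaster that has gone quiet:

```go
package subscriber

import (
	"context"
	"time"
)

// playlistState is just the bit of the fetched playlist this sketch cares about.
type playlistState struct {
	SeqNo    uint64   // media sequence number of the latest playlist
	Segments []string // segment keys currently listed
}

// pollPlaylist is a rough illustration of the pull loop the S3Subscriber
// ends up in: fetch the manifest, see if anything new appeared, repeat.
func pollPlaylist(ctx context.Context, fetch func() (playlistState, error), interval time.Duration) {
	var lastSeq uint64
	ticker := time.NewTicker(interval) // shorter interval = lower latency, more S3 GETs
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			pl, err := fetch() // one more S3 request per tick, whether or not anything changed
			if err != nil || pl.SeqNo <= lastSeq {
				continue // nothing new, or the broadcaster is slow/offline; how long do we wait?
			}
			lastSeq = pl.SeqNo
			// ...fetch the new segments listed in pl.Segments
		}
	}
}
```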
Here is a complementary proposal, which is less about object storage and more about the overall networking architecture between the broadcaster and the transcoder. It retains the push-based nature of the broadcaster-transcoder flow and should alleviate most of these concerns: https://gist.github.com/j0sh/51c8555a2e85f5b4d97f10c13d218395