- Moderately sized dataset (300 GB)
- Too large to be stored entirely on end user hardware
- Seeder is not fast enough to serve all clients
- Every user has a small part of the dataset, but none has all of it
- Users on consumer hardware want to browse with low latency
This scenario is mostly about content discovery, but it is a hard one, and the Hypercore team had issues with it.
I don't think content discovery and content retrieval can be completely separated without losing efficiency. Ideally, the answers to a content discovery query use the same format as the requests for content, so that an answer can be turned into a request directly. Hypercore does this with a compressed bitmap.
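
To make the "same format for answers and requests" idea concrete, here is a minimal Python sketch. It is not Hypercore's actual wire format or API: the `Bitfield` class, the `plan_requests` helper, and the peer names are all hypothetical, and the bitmap is left uncompressed for brevity. Each peer's discovery answer is a bitmap of the chunks it holds; intersecting that answer with the set of missing chunks produces another bitmap that could be sent straight back as the request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bitfield:
    """Hypothetical fixed-length chunk set: bit i is set when chunk i is present."""
    length: int
    bits: int = 0

    @classmethod
    def from_indices(cls, length, indices):
        bits = 0
        for i in indices:
            if not 0 <= i < length:
                raise ValueError(f"chunk index {i} out of range")
            bits |= 1 << i
        return cls(length, bits)

    def indices(self):
        return [i for i in range(self.length) if (self.bits >> i) & 1]

    def __and__(self, other):
        # Chunks present in both bitfields.
        return Bitfield(self.length, self.bits & other.bits)

    def __sub__(self, other):
        # Chunks present in self but not in other.
        return Bitfield(self.length, self.bits & ~other.bits)


def plan_requests(want, have_local, peer_haves):
    """Assign each missing chunk to the first peer that advertises it.

    `peer_haves` maps a peer id to the Bitfield that peer sent as its
    discovery answer. The returned Bitfields use the same format, so each
    one could be sent back to its peer unchanged as the content request.
    """
    missing = want - have_local
    requests = {}
    for peer_id, have in peer_haves.items():
        ask = missing & have
        if ask.bits:
            requests[peer_id] = ask
            missing = missing - ask  # do not ask a second peer for the same chunk
    return requests


# Assumed example: 16 chunks, we want chunks 0-4 and already hold chunk 0.
want = Bitfield.from_indices(16, [0, 1, 2, 3, 4])
local = Bitfield.from_indices(16, [0])
peers = {
    "peer-a": Bitfield.from_indices(16, [1, 2, 9]),
    "peer-b": Bitfield.from_indices(16, [2, 3]),
}
requests = plan_requests(want, local, peers)
```

In this example chunk 4 stays unassigned because no peer advertises it, which is the situation the scenario above describes: every user holds a small part and the client has to keep discovering. A real implementation would also compress the bitmaps and choose peers by latency rather than by iteration order.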