rklaehn/wikipedia.md

Last active October 30, 2022 16:49

Wikipedia scenario

Scenario

Moderate-size dataset (300 GB)
Too large to be stored entirely on end-user hardware
Seeder is not fast enough to serve all clients
Every user has a small part of the dataset, but none has all of it
Users on consumer hardware want to browse with low latency

This scenario is mostly about content discovery. It is a hard one, and the Hypercore team had issues with it.

I don't think content discovery and content retrieval can be completely separated while staying efficient. Ideally, the answer to a content discovery query uses the same format as a content request, so that what a peer says it has can be turned directly into what you ask it for. Hypercore does this with a compressed bitmap.
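
To make the "same format" idea concrete, here is a minimal sketch in Python. It is not Hypercore's actual wire format or API. It assumes the dataset is split into numbered chunks, that each peer advertises a bitmap of the chunks it holds, and that a request is a bitmap as well. The names `to_bitmap`, `to_runs` and `plan_requests` are made up for illustration, and the run-length form stands in for whatever compression a real system would use.

```python
from typing import Dict, Iterable, List, Tuple

Bitmap = int  # assumption: bit i set means "chunk i"


def to_bitmap(chunks: Iterable[int]) -> Bitmap:
    """Build a bitmap from chunk indices."""
    bm = 0
    for i in chunks:
        bm |= 1 << i
    return bm


def to_runs(bm: Bitmap) -> List[Tuple[int, int]]:
    """Compress a bitmap into (start, length) runs of consecutive set bits."""
    runs = []
    i = 0
    while bm:
        if bm & 1:
            start = i
            while bm & 1:
                bm >>= 1
                i += 1
            runs.append((start, i - start))
        else:
            bm >>= 1
            i += 1
    return runs


def plan_requests(
    want: Bitmap, haves: Dict[str, Bitmap]
) -> Tuple[Dict[str, Bitmap], Bitmap]:
    """Split `want` across peers based on their advertised bitmaps.

    Each request is the intersection of what we still want and what the
    peer has, so it is a bitmap in the same format as the peer's answer.
    Returns the per-peer requests and the chunks nobody offered.
    """
    requests: Dict[str, Bitmap] = {}
    remaining = want
    for peer, have in haves.items():
        ask = remaining & have
        if ask:
            requests[peer] = ask
            remaining &= ~ask
    return requests, remaining


# Hypothetical peers, each holding a small part of the dataset.
haves = {
    "alice": to_bitmap(range(0, 4)),
    "bob": to_bitmap([3, 4, 5, 8]),
}
want = to_bitmap(range(2, 8))  # chunks 2..7

requests, missing = plan_requests(want, haves)
for peer, ask in requests.items():
    print(peer, to_runs(ask))
print("unserved", to_runs(missing))
```

Because discovery answers and requests share one representation, planning a download is just a bitwise intersection. The leftover bitmap is exactly what has to be fetched from the slow seeder or discovered from further peers.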
