Comments on this proposal by @[email protected]
This was an interesting proposal, although sometimes a bit tough to understand. The order of the sections and introductions could be improved.
While I found this different approach interesting, I have some issues and questions. Please excuse my misunderstandings; there are likely a bunch of them.
- For many use cases, posts have to disseminate through the fediverse in near real time. Can this system provide that?
- this probably depends on the periodicity of publishing the nextBundle
- very likely still much slower than my DHT-based approach
- how long do new or rarely used hashtags take to disseminate through the system?
- Do instances have to parse all hashtags of a bundle -- and maybe even store and forward them?
- event bundles of an instance contain all hashtags known by an instance + lists of their posts
- If instances store and forward all the hashtags they receive: assuming a size of 1 KiB per post ID (which is rather high), extrapolating the Twitter dump used in my paper yields 8.58 TiB of post IDs to be stored for a full year when storing post IDs for all hashtags. This would have to be done by each instance in our instanceMap!
- If instances just store and forward hashtags they're interested in: This very likely hurts the dissemination of unpopular hashtags and makes it hard for new hashtags (no interest at all in the beginning) to be even forwarded by anyone except the originating instance
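To make the storage estimate above easier to check, here is a small back-of-the-envelope sketch. Only the 1 KiB per post ID and the 8.58 TiB per year come from the comment above; the implied number of post IDs is derived from those two figures (it is not quoted from the paper), and the instance count of 1000 is a purely hypothetical value for illustration.

```python
KIB = 1024
TIB = 1024 ** 4

BYTES_PER_POST_ID = 1 * KIB   # deliberately generous size, as stated above
YEARLY_STORAGE_TIB = 8.58     # yearly figure quoted above


def implied_post_ids(storage_tib: float, bytes_per_id: int) -> float:
    """Number of post IDs that fit into the given storage volume."""
    return storage_tib * TIB / bytes_per_id


def network_wide_storage_tib(per_instance_tib: float, instance_count: int) -> float:
    """Total storage if every instance keeps a full copy."""
    return per_instance_tib * instance_count


post_ids = implied_post_ids(YEARLY_STORAGE_TIB, BYTES_PER_POST_ID)
print(f"implied post IDs per year: {post_ids:.3e}")

# Hypothetical instance count, chosen only to show how the cost multiplies.
HYPOTHETICAL_INSTANCES = 1000
total = network_wide_storage_tib(YEARLY_STORAGE_TIB, HYPOTHETICAL_INSTANCES)
print(f"network-wide storage: {total:.0f} TiB")
```

The point of the second function is that the cost scales linearly with the number of instances that store everything, which is why the "store and forward all hashtags" variant looks expensive.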
"If an instance is unable to store events going far enough back to maintain working federation, other instances will switch to pulling those events from someone else."
How do we find out from whom to pull instead? By trying all other known instances in order of their hmetrics?
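The fallback strategy asked about here could look roughly like the following sketch. It is not taken from the proposal: the `Peer` type, the `has_history` probe, and the assumption that a lower hmetric means a more preferred instance are all illustrative.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Peer:
    host: str
    hmetric: int  # assumption: lower value means more preferred


def pick_fallback(
    peers: Iterable[Peer],
    has_history: Callable[[Peer], bool],
) -> Optional[Peer]:
    """Try known peers in ascending hmetric order and return the first
    one that still holds the required event history, or None."""
    for peer in sorted(peers, key=lambda p: p.hmetric):
        if has_history(peer):  # hypothetical probe, e.g. a request for old events
            return peer
    return None


known = [
    Peer("a.example", 3),
    Peer("b.example", 1),
    Peer("c.example", 2),
]
# Pretend that only a.example and c.example kept enough history.
chosen = pick_fallback(known, lambda p: p.host != "b.example")
print(chosen.host if chosen else "no peer has the history")
```

Note that in such a scheme the hmetric is self-reported, so a peer that announces an artificially low value would always be probed first, and the worst case is one probe per known instance.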
- If an intermediate instance has to drop stored items
- To get a list of posts for a single hashtag, we have to download and combine multiple bundles from various instances, which is inefficient. Also, how do we know when we've combined enough of these bundles to get the most complete view possible?
- this issue appears as we basically have no load balancing except "instances can drop history if it's too much for them"
- Malicious instances may lie about their metric to become the preferred instances to fetch from, thanks to a low hmetric
- You said your ideas were roughly inspired by BGP. Unfortunately, BGP is broken as well:
- BGP being broken is just not that apparent, as fewer parties have access to the respective routing points between ASes, and those have a gentlemen's agreement. But BGP routing attacks are quite common, and are usually noticed when they fail and all of YouTube's traffic is routed to a single poor box at a Pakistani ISP.
- There's no real basis for trusting even correctly signed bundles by known instances, as in an open system instances are discovered through boosts, mentions, or explicit user subscriptions
- only reason for trusting instances: when using an explicit allow list. But all of the instances on that list need to take the same approach as well => nation-stateification of the fediverse, which produces clustering and makes it hard for new instances to join at all
- one advantage of your proposal: it is harder to deny the existence of posts, as they can be gossiped via multiple paths. But it is still not possible to prove the completeness of received posts, and more entities can forge posts.
- interesting approach, might be feasible (although I'm sceptical about dissemination times)
- but from a security perspective no real advantages:
- DHT security might indeed be hard, but you might be overly afraid of it
- only advantage: multiple paths for receiving posts of a hashtag, but something similar might be achievable with the DHT approach through its redundancy scheme and verifiable, mergeable post histories
Thank you for giving this post such deep consideration!
For those who want to review the original document, you can find it here: Fediverse Global Hashtags.
I'll answer your comments as they come:
Let me know if you have any other questions or thoughts. We could also switch to another medium with comment support, such as Google Docs, if you would prefer that.
Thanks,
Caleb