I was given an opportunity to analyze the IPFS open source project, with the caveat that I should spend no more than 4 hours, total, on this work. Of course, going from only a basic understanding of the project to writing cogently on project patterns, strengths, and weaknesses in 4 hours is no small request, and the result will inevitably be missing many pieces. But in the interest of sharing what I've learned and contributing back to the community, I am summarizing my thoughts here on GitHub. May this summary be helpful to someone else on their path to understanding.
- Recommended background reading
- Methodology
- Strengths
- Weaknesses and recommendations
- Opportunity
- Confession
## Recommended background reading

The IPFS team suggested I start with the following background reading to get acquainted, listed from most general to most specific:
- Creating New Networks: https://protocol.ai/blog/protocol-labs-creating-new-networks/
- Introduction to IPFS: https://lightrains.com/blogs/ipfs-introduction
- IPFS Demo (18m video): https://youtu.be/8CMxDNuuAiQ
They also recommended I review the current IPFS Objectives and Key Results and the last 60 days' worth of the Use Cases and Applications of IPFS section of the community discussion board.
## Methodology

I read the conversations on the community discussion board from mid-December 2017 through mid-February 2018 and did a quick, lightweight qualitative analysis of topics by copying user feedback quotes into a spreadsheet, noting topics covered, thread views, and thread likes in separate columns. This helped clarify the scope and popularity of various topics. Of course, in only 4 hours, I would not place high confidence in my findings, but they are a start!
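For what it's worth, the counting step in this kind of tally is simple to script. Here is a minimal sketch, assuming the quotes were exported to a CSV; the file name and the `topic`, `views`, and `likes` column names are hypothetical stand-ins for my spreadsheet columns.

```python
import csv
from collections import defaultdict

# Tally thread counts, views, and likes per topic from a CSV export.
totals = defaultdict(lambda: {"threads": 0, "views": 0, "likes": 0})

with open("use_case_quotes.csv", newline="") as f:
    for row in csv.DictReader(f):
        t = totals[row["topic"]]
        t["threads"] += 1
        t["views"] += int(row["views"])
        t["likes"] += int(row["likes"])

# Sort topics by total views to see where attention concentrates.
for topic, t in sorted(totals.items(), key=lambda kv: -kv[1]["views"]):
    print(f'{topic}: {t["threads"]} threads, {t["views"]} views, {t["likes"]} likes')
```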
In addition, I did a bit of desktop research to "get smart" quickly on the problem space. This included the background links suggested by the IPFS team, as well as the IPFS white paper; the IPFS PM repo; the project roadmap; and articles on some of the core technologies that, combined in intentional, systematic ways, are creating IPFS.
## Strengths

The community is clearly very excited about the future of IPFS and related projects. There is a strong following, and a seemingly helpful one at that. At least in the last 60 days' worth of discussion in the 'use cases' part of the forum, there were very few harsh tones, and the Code of Conduct was enforced when needed. People seem genuinely excited to see what this new project can create, and people believe there's a there there.
It is clear that the IPFS team cares about clear communication and intentionally building a system that makes sense. Time is spent making documentation and explanations in issues and discussion boards as clear and as straightforward as possible. For a project as potentially esoteric as basic internet protocols (and in a world where buzzy blogs about blockchain are a dime a dozen) I'm sure this has been key to its adoption so far.
And of course, the thing actually works. It looks as though the IPFS team has been methodically building, testing (in the real world), and iterating on pieces of the entire system since the beginning, which is the only way to truly understand whether it's heading in the right direction. The team's commitment to this approach is admirable.
## Weaknesses and recommendations

Topic-wise, two areas seem to be causing the most confusion, and generating the most interest, in the use case discussion area. The team likely needs more, or different, communication in the near term in these areas:
- How and when data is available, and how (or if) that relates to data size. How might IPFS and related projects be used to ensure data availability? How is this impacted when data sizes are very large? There seems to be some core confusion around concepts such as Filecoin versus IPFS nodes/pinning: how data persists (or doesn't), how storage interacts with existing data storage systems (or doesn't), etc. (See the sketch after this list.)
- How private swarms, public data, and private data might interact in IPFS. What is the vision for public versus private data in the IPFS system? Or, what are some scenarios for all-public, part-public, and all-private data in the product system?
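A concrete example may help ground the pinning half of this confusion. The sketch below talks to a local go-ipfs daemon over its HTTP API (assuming the default port 5001; this is an illustration of add-versus-pin semantics, not a recommended integration pattern): adding content pins it on your own node only, other nodes that want to guarantee availability pin the hash themselves, and unpinned data is eligible for garbage collection.

```python
import requests

API = "http://127.0.0.1:5001/api/v0"  # default local IPFS daemon API address

# Adding content returns its content hash and pins it on *this* node only.
# Availability to the rest of the network depends on some node holding it.
resp = requests.post(f"{API}/add", files={"file": b"hello, distributed web"})
cid = resp.json()["Hash"]
print("added:", cid)

# On any node that should keep the data around, pin it; unpinned blocks
# may be removed when the node garbage-collects.
requests.post(f"{API}/pin/add", params={"arg": cid}).raise_for_status()

# List recursive pins to confirm the pin took.
pins = requests.post(f"{API}/pin/ls", params={"type": "recursive"}).json()
print(cid in pins["Keys"])  # True if pinned on this node
```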
More importantly, IPFS seems to be ready to move its product out of the domain of early adopters towards more mainstream use. As noted in its roadmap, it needs to move from a promising (but perhaps unreliable and difficult-to-understand) specialist experiment to something that a more novice user would be interested in exploring (and hopefully contributing back to IPFS's learning and iteration cycle as more use cases are tested and proven in the real world).
This means that the team is going to be moving from a community that it is itself a part of, and understands relatively well, towards new markets and communities that it understands less well (i.e., as the team moves farther away from solving its own problems, it needs to empathize with and clearly understand other people's challenges, while limiting the impact of its own biases on product decisions). It also means that the team will move from fixing relatively obvious problems (basic functionality) to focusing more time and energy on the problems that are most blocking for other people; the team will need to make decisions based on other people's success or failure in the IPFS product system.
To make sure this decision pathway is clear, I would recommend moving even further towards outcomes-based metrics for product tracking. For example, consider leading planning with metrics-linked hypothesis statements. Here are a few examples of what I mean, drawing from existing project OKRs (I am sure what I've pulled together is not exactly what the team has in mind, but hopefully these serve as illustrations):
- We believe that: starting a regular distributed web meetup in London
- Will: unlock the flow of amazing information and data points
- We will know that we are right when: we get actionable feedback from this web meetup that informs product choices
or
- We believe that: for go-ipfs, 10 performance regression tests, a fast DHT (score = 15 - latency), automated performance testing (runs on every PR), and badger enabled by default
- Will: improve the user experience for X user type and increase adoption
- We will know that we are right when: we watch X user type and see them succeed with little friction, see fewer performance-related reports in the discussion forum, and see an increase in some adoption/retention measure
This might not be the exact format that works for the IPFS team, and the examples I chose are certainly not the right level of detail. However, in general, documenting and sharing the 'why' (outcome) of product decisions, and then linking that closely to ways to measure success, helps the team focus on the most important issues to tackle towards user experience goals.
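If it helps to make the linkage explicit, here is one hypothetical way such statements could be captured as structured data living alongside the OKRs (the type and field names are my invention, not an existing IPFS convention):

```python
from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    # A metrics-linked hypothesis statement (illustrative format only).
    we_believe_that: str    # the intervention or investment
    will: str               # the expected outcome
    right_when: list[str]   # observable signals that would confirm it
    evidence: list[str] = field(default_factory=list)  # filled in as the team learns

meetup = Hypothesis(
    we_believe_that="starting a regular distributed web meetup in London",
    will="unlock the flow of amazing information and data points",
    right_when=["actionable feedback from the meetup informs product choices"],
)
```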
To figure out where the most important 'whys' are, I would also recommend illustrating more systems-level user journeys. For example, right now, the "thing" that a non-early-adopter might do with the connected pieces of the IPFS ecosystem is, at a rubber-meets-the-road level, not clear. Of course, some of that is still being developed and will be better understood as the team goes along. But IPFS is probably far enough along to illustrate several user journeys and then use those journeys to understand, by talking with users, where the team got it right; where the journey is actually a little different; where user pain points are along the way; etc. This will also help align the distributed team and non-core contributors around a vision for the product system by providing a 'single source of truth' for understanding the forest even though each person is working on a tree, which helps everyone stay focused on the most urgent user experience "trees" in the forest.
## Opportunity

Innovation often happens at the edges, where specialties cross over. Just as IPFS is an innovation that, in some respects, combines the practices and conversations that were happening separately in data distribution and version control, the IPFS team could prioritize efforts that make it easier for specialists in tangential fields and communities to be a part of the conversation. Increasing diversity and inclusion in the community of practice this early in the development cycle will help prevent blind spots in project assumptions.
## Confession

This took more than 4 hours. But it was an incredibly interesting journey, and I couldn't leave at the 4-hour mark. It took about 4 hours for me, personally, to get to a place where I could grok the background material and use case discussion well enough to synthesize what I'd learned into something even mildly useful. Between reading, thinking, and writing, I'd call this closer to 8 hours of work.