Boris Cherny bcherny

bcherny / designing-data-intensive-application-notes.md
Last active May 5, 2024 19:57
Notes: Designing Data-Intensive Applications

Notes on Martin Kleppmann's excellent Designing Data-Intensive Applications.

Chapter 1: Reliable, Scalable, and Maintainable Applications

  • Data Systems
    • Dimensions to consider when thinking about data systems: access patterns, performance characteristics, implementations.
    • Modern data systems often blur the lines between databases, caches, streams, etc.
  • Reliability
    • Systems should perform the expected function at a given level of performance, and be tolerant to faults and user mistakes.
    • Fault: One component of a system deviating from its spec. Prefer tolerating faults over preventing them (except for things like security issues). Faults stem from hardware failures, software failures, and human error (in a study, config errors caused most outages).
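
One way to picture "tolerating faults" is a bounded retry around a flaky component. This is only a sketch in TypeScript: `withRetries` and `flakyRead` are hypothetical names made up for illustration, not something from the book.

```typescript
// Hypothetical helper: retry an operation a bounded number of times.
function withRetries<T>(operation: () => T, maxAttempts: number): T {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return operation();
    } catch (error) {
      lastError = error; // tolerate this fault and try again
    }
  }
  throw lastError; // out of attempts: the fault surfaces to the caller
}

// Simulated component that deviates from its spec twice, then recovers.
let calls = 0;
function flakyRead(): string {
  calls++;
  if (calls < 3) throw new Error("transient fault");
  return "ok";
}

const result = withRetries(flakyRead, 5);
```

The caller only sees a problem if every attempt fails, so a single misbehaving call does not take down the whole operation.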

graph mutation language (graphml)

             connection
          /              \
  Node 1 -----------------> Node 2
         \    edge 1
          \_________________> Node 3
               edge 2

Proposal: Mirrored syntax at the type and value-levels

Motivation

Make it easier to express the operations we work with most often:

  • When modeling GraphQL types with Flow: Looking up the type of a field in a nested object type, where every field along the way is nullable and optional
  • When authoring React components: Looking up the type of a prop on another component that we compose, to directly expose it on our component too
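
For comparison, here is roughly what those two lookups look like in TypeScript today. This is a sketch for illustration, not Flow and not the syntax this proposal describes; `Viewer`, `ButtonProps`, and `WrapperProps` are hypothetical types.

```typescript
// Hypothetical GraphQL-shaped type: every field is optional and nullable.
type Viewer = {
  user?: {
    profile?: {
      name?: string | null;
    } | null;
  } | null;
};

// Type level: strip null/undefined at each step before indexing further.
type User = NonNullable<Viewer["user"]>;
type Profile = NonNullable<User["profile"]>;
type Name = NonNullable<Profile["name"]>; // string

// Value level: optional chaining walks the same path in one expression.
function getName(viewer: Viewer): Name | undefined {
  return viewer.user?.profile?.name ?? undefined;
}

// Re-exposing a prop type from a component we compose (no React needed here).
type ButtonProps = { label: string; onClick?: () => void };
type WrapperProps = { title: string; onClick?: ButtonProps["onClick"] };

const wrapperProps: WrapperProps = { title: "Save" };
```

The value-level lookup is a single chained expression, while the type-level lookup needs one `NonNullable<...>` per step. That asymmetry is the kind of gap the motivation above is pointing at.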

Proposed features

Proposal: @flow strict-readonly

Motivation

Much of our product code uses immutable data structures. This is largely because when working with React (particularly: props and Hooks), accidental mutation is a common source of errors ("why won't this re-render?").
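
A small TypeScript sketch of the "why won't this re-render?" failure mode. `shouldRerender` is a simplified stand-in for a reference-equality check, assumed here for illustration; it is not React's actual implementation.

```typescript
type Props = { readonly items: readonly string[] };

// Assumed, simplified check: re-render only if the reference changed.
function shouldRerender(prev: Props, next: Props): boolean {
  return prev.items !== next.items;
}

const items = ["a"];
const before: Props = { items };

// Accidental mutation: the contents change, but the reference does not.
items.push("b");
const afterMutation: Props = { items };

// Immutable update: a new array, so the reference changes.
const afterCopy: Props = { items: [...items, "c"] };

const mutatedRerenders = shouldRerender(before, afterMutation);
const copiedRerenders = shouldRerender(before, afterCopy);
```

The in-place `push` goes unnoticed by the check, and it also silently changes what `before.items` contains. Read-only types are meant to rule out both problems.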

Today, enforcing this immutability is something that happens via a mix of Flow types, ESLint rules, and code review. It results in hard-to-read types like:

type Props = $ReadOnly<{

import * as t from '@babel/types'
const rawQuery = astql`
  node {
    ...on Function {
      __typename
    }
  }
`;
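// Length of the longest run in arr where each value is `of` more than the previous one.
function longestSeq(arr, of) {
  let lengths = {}
  let max = 0
  arr.forEach(n => {
    if (lengths[n - of]) {
      // Extend the run that ended at n - of
      lengths[n] = lengths[n - of] + 1
      delete lengths[n - of]
    } else {
      lengths[n] = 1
    }
    // Assumed completion: the original preview is cut off after the else branch
    max = Math.max(max, lengths[n])
  })
  return max
}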

On Contagion

If you have a tree with a node that has a property P, and all of its parents also need to have property P, then P is contagious.

When can that happen?

  • Exceptions bubble up through the call tree
  • State has to be lifted up to the root of a call tree
  • If a function is async, its ancestors must be too
  • In a physical system, if you know the position and momentum of an object A and it collides with another object B that you don't know one of those quantities for, then you no longer know them about A. (caveat: I'm no physicist)
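
The async case is the easiest one to see in code. A minimal TypeScript sketch with hypothetical function names:

```typescript
// Leaf: asynchronous (stands in for real I/O).
async function fetchCount(): Promise<number> {
  return 41;
}

// Parent: needs the leaf's value, so it has to await, so it is async too.
async function incrementedCount(): Promise<number> {
  return (await fetchCount()) + 1;
}

// Grandparent: same story one level up. The Promise spreads to the root.
async function describeCount(): Promise<string> {
  return `count is ${await incrementedCount()}`;
}
```

Every caller that wants the plain value ends up returning a `Promise` itself, which is the contagion described above.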
/////////////// 1ST CODE SAMPLE (PAGE 140) ///////////////
type Unit = 'EUR' | 'GBP' | 'JPY' | 'USD'
type Currency = {
  unit: Unit
  value: number
}
let Currency = {