Alphabetize your set items and map keys

This document looks at certain data structures and makes the case for alphanumeric ordering to optimize developer experience (DX).

Strategies for organizing code

Authoring code, like all language, is about expressing ideas intentionally and legibly. Obviously code needs to be correct for the computational device, and succeed as a software product, but we also want to make it comprehensible to human colleagues.

To make code comprehensible I can identify four distinct organizational strategies:

Hierarchical
Clustering/grouping
Insert next
Alphanumeric

Hierarchical organization

Organizing code hierarchically is very subjective and context dependent, relying on an intuitive grasp of what is most to least important. Author and reader may disagree, but language and framework communities often arrive at a shared understanding of how to structure code (for example, an order that mirrors the actual logic of the code).

Clustering/grouping

Co-locating similar or related things.

Insert next

Just add new stuff to the bottom

Alphanumeric

The only option that is both regular and predictable, since it is governed by simple, well-understood rules familiar to anyone proficient in a given language.
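To make those rules concrete, here is a minimal sketch of one way to pin down alphanumeric order in JavaScript. The sample keys and the `byAlphanumeric` comparator are hypothetical, and the choice of an English, case-insensitive, numeric-aware `Intl.Collator` is an assumption. A team could just as reasonably settle on the default `sort()` order, as long as everyone uses the same rule.

```javascript
// Hypothetical sample keys, used only for illustration.
const keys = ['item10', 'item2', 'Item1'];

// Default sort compares UTF-16 code units: uppercase sorts before
// lowercase, and digits are compared character by character.
const byCodeUnit = [...keys].sort();
// byCodeUnit: ['Item1', 'item10', 'item2']

// Assumed team convention: case-insensitive, numeric-aware ordering.
const collator = new Intl.Collator('en', { numeric: true, sensitivity: 'base' });
const byAlphanumeric = (a, b) => collator.compare(a, b);
const byCollator = [...keys].sort(byAlphanumeric);
// byCollator: ['Item1', 'item2', 'item10']
```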

How strategies are involved

The first person to create a file generally selects one or more of these strategies and makes that first commit. Before making that first commit the choices are relatively inconsequential.

Each language, framework, and specific aspects of a given application will naturally suggest prioritizing certain strategies. This document will illustrate why Sets and Maps generally benefit from alphabetical organization using JavaScript as the example language.

Sets and Maps in JavaScript

The Set data structure is a relatively recent addition to the language (ES2015), but this advice could apply to any Set-like array (i.e. an array with unique items).

const formalSet = new Set(['Jack', 'King', 'Queen', 'Joker']);
const informalSet = ['Jack', 'King', 'Queen', 'Joker']; // not a formal Set, but treated as one here because its items are unique

(Aside: JavaScript arrays inherently support "List" behavior in that they have a specific order, i.e. the order isn't a code formatting issue, it's a logical issue.)
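The aside above matters when deciding whether a Set-like structure is safe to alphabetize. Here is a small sketch, reusing the card names from the example and a hypothetical `sameMembers` helper. Membership checks ignore source order, while anything that iterates sees it.

```javascript
const asWritten = new Set(['Jack', 'King', 'Queen', 'Joker']);
const alphabetized = new Set(['Jack', 'Joker', 'King', 'Queen']);

// Hypothetical helper: true when both Sets hold exactly the same items.
function sameMembers(a, b) {
  return a.size === b.size && [...a].every((item) => b.has(item));
}

const membershipUnchanged = sameMembers(asWritten, alphabetized); // true

// Iteration follows insertion order, so reordering the source is visible
// to any code that loops over the Set or spreads it into an array.
const iterationUnchanged =
  [...asWritten].join(',') === [...alphabetized].join(','); // false
```

If the surrounding code only ever calls `has`, reordering is purely a formatting change. If it iterates, the order is part of the behavior and should be treated as such.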

When referring to a "Map" in this document I mean any Map-like data structure: anything concerned with mapping keys to values, including "plain old JavaScript objects" but not Class definitions (although these could follow this advice in parts, but probably not strictly as a whole).
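As a rough sketch of what "Map-like" covers here, the same key-to-value mapping can be written as a plain object or as a `Map`. The `settings` names below are hypothetical. Lookups by key give the same result however the entries are ordered in the source, which is what leaves the ordering free to serve the reader.

```javascript
// Hypothetical settings, written with keys in alphabetical order.
const settingsObject = {
  counter: 13,
  nextTitle: true,
};

const settingsMap = new Map([
  ['counter', 13],
  ['nextTitle', true],
]);

// The same mapping with entries in a different source order.
const reordered = { nextTitle: true, counter: 13 };

// Lookup by key does not depend on where the entry sits in the source.
const sameLookups =
  settingsObject.counter === reordered.counter &&
  settingsObject.nextTitle === reordered.nextTitle &&
  settingsMap.get('counter') === reordered.counter; // true

// Enumeration does follow source order for string keys like these.
const objectKeys = Object.keys(reordered); // ['nextTitle', 'counter']
```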

We think about strategy because code will change

If a given file is never touched after the first commit then whatever initial choices the author made will probably be good enough. However, and this may shock you, software tends to change. The bigger the codebase, and the more contributors it has, the more consequential the choice (or non-choice) of code organizing strategies becomes.

In many cases static definitions of data structures are small, so prescribing a certain organization for them can initially seem like bizarre pedantry. However, there are often parts of the codebase (like configuration) where statically defined data structures are large or change frequently. The inspiration for this document comes from props in React, but there are many other examples.

The main goals of choosing a particular strategy involve:

Minimize choice overhead
Maximize code comprehensibility
Minimize code conflicts (especially with parallel development)

Problems with hierarchical and clustering in data structures

When organizing data structures, I tend to notice engineers intuitively follow a combination of logical hierarchy and clustering. This makes sense for functions, methods, and variables, but it has drawbacks as data structures grow or change.

As the number of Set items and Map keys grows, subjectivity increases, intuition diminishes, and confusion grows. As you search a data structure for a particular line, it's harder to guess where it will be, and therefore harder to choose where new code should be inserted.
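By contrast, an alphabetical rule can be checked mechanically, so nobody has to guess where a line belongs. The sketch below uses a hypothetical `firstOutOfOrder` helper and illustrative keys. It assumes plain `localeCompare` with an English locale as the agreed ordering.

```javascript
// Hypothetical helper: returns the first key that breaks ascending order,
// or null when the list is already alphabetized.
function firstOutOfOrder(keys) {
  for (let i = 1; i < keys.length; i += 1) {
    if (keys[i - 1].localeCompare(keys[i], 'en') > 0) {
      return keys[i];
    }
  }
  return null;
}

// Illustrative keys in an order chosen by intuition.
const asAuthored = ['experiment', 'title', 'body', 'wrapper', 'head'];
// The same keys in alphabetical order.
const alphabetized = ['body', 'experiment', 'head', 'title', 'wrapper'];

const firstProblem = firstOutOfOrder(asAuthored); // 'body'
const noProblem = firstOutOfOrder(alphabetized);  // null
```

A hierarchical or clustered order has no equivalent check, because the rule lives in the author's head.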

Because of the insertion problem, when modifying code, I tend to notice engineers gravitate to "insert next".

Problems with insert next

An obvious drawback of insert next is that the resulting order is arbitrary: the reader generally doesn't care when code was added (and if they do, source control history can answer that question much better).

A seeming advantage of insert next is that it's very simple for writing code — just scroll to the bottom and go. However, when you consider parallel development this becomes less true.

Francesca and Jose are both working on the same lines of code in parallel. They both branch from main, where set and map are in this state:

const set = new Set([
  'experiment',
  'title',
  'body',
  'wrapper',
  'head',
]);

const map = {
  nextTitle: true,
  counter: 13,
  onExitApp: () => { window.location = '/go/here' },
  displayNext: (current) => `${current}+`,
};

Francesca adds to set and map:

const set = new Set([
  'experiment',
  'title',
  'body',
  'wrapper',
  'head',
+ 'francesca-inserts',
]);

const map = {
  nextTitle: true,
  counter: 13,
  onExitApp: () => { window.location = '/go/here' },
  displayNext: (current) => `${current}+`,
+ onCleanup: (state) => JSON.stringify(state),
};

Jose adds to set and updates map:

const set = new Set([
  'experiment',
  'title',
  'body',
  'wrapper',
  'head',
+ 'jose-inserts',
]);

const map = {
  nextTitle: true,
  counter: 13,
  onExitApp: () => { window.location = '/go/here' },
- displayNext: (current) => `${current}+`,
+ displayPrevious: (current) => `${current} 🌞`,
};
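The two set edits above can be compared with a small sketch that shows where each contributor's new item lands under "insert next" and under alphabetical order. The `appendIndex` and `sortedIndex` helpers are hypothetical, and the sketch does not model how a version control tool merges. It only computes insertion positions.

```javascript
// Shared starting items, before either contributor's addition.
const base = ['experiment', 'title', 'body', 'wrapper', 'head'];

// "Insert next": every new item goes to the end of the list.
const appendIndex = (items) => items.length;

// Alphabetical: the position is decided by the item itself.
// Assumes `sortedItems` is already in ascending order.
function sortedIndex(sortedItems, item) {
  const index = sortedItems.findIndex(
    (existing) => existing.localeCompare(item, 'en') > 0
  );
  return index === -1 ? sortedItems.length : index;
}

const sortedBase = [...base].sort((a, b) => a.localeCompare(b, 'en'));
// sortedBase: ['body', 'experiment', 'head', 'title', 'wrapper']

const appendPositions = {
  francesca: appendIndex(base), // 5
  jose: appendIndex(base),      // 5, the same spot
};

const sortedPositions = {
  francesca: sortedIndex(sortedBase, 'francesca-inserts'), // 2, before 'head'
  jose: sortedIndex(sortedBase, 'jose-inserts'),           // 3, before 'title'
};
```

With insert next, both contributors touch the same position by construction. With alphabetical order, the positions differ whenever the new names differ enough to sort apart, which tends to spread parallel edits across the structure.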