@jvantuyl
Last active April 10, 2025 17:27

A Journey Through Elixir's Syntax

This was written in response to this Reddit post, which asked:

defmodule User do
  @enforce_keys [:name]
  defstruct [:name, age: 0, email: nil]
end

# Creating a user
user = %User{name: "Alice", age: 30, email: "[email protected]"}

How is this struct taking a name: when in the definition it's an atom?

An Underlying Principle

There's an underlying principle at work here that you might be missing. Others are right to point out that this is "syntactic sugar", but that makes it sound like this is only a matter of where you put the punctuation. That's not wrong, but let me put a different spin on it for you.

An Example

Allow me, if you will, to take you on a strange journey. Let's say you're building a device. We're not just going to build that device, we're going to modify our language to make the code that does so pretty.

But let's not get ahead of ourselves. Let's say that this device has a bunch of LED lights on it and we want to allow them to be controlled programmatically. Furthermore, let's say each light is on a button. You want to allow it to be configured such that pressing a button triggers an action.

For actions, let's say we support:

  • play a sound from a file retrievable from some URL
  • cycle the light's color
  • send a POST request to a webhook at some URL

Let's also allow somebody to turn it off, put it in standby mode, or turn it on via this interface.

Writing a Configuration

How might you represent the configuration for this? Let's start with some JSON:

{
  "power_state": "on",
  "button_config": [
    {"mode": "cycle_color", "current": "red", "colors": ["red", "blue", "green", "white"]},
    {"mode": "play_sound", "url": "https://example.com/sounds/clown_horn.wav"},
    {"mode": "webhook", "url": "https://example.com/webhook", "data": {"button": 3}}
  ]
}

Let's take a look at this data structure. Notice all of the strings. Are they really being "string-like"? Are they acting as a sequence of characters or are they being used some other way?

Most of these strings are just acting as sort of "keys" or "enumerations" rather than a general string. In fact, out of all of the 22 strings above, only 2 of them are actually representing a general string value in the form of URLs. This makes sense somewhat because URLs are free-form strings.

But what about all of those keys? Is this efficient? Is it error-prone? How can we improve things?

First and foremost, these are very wasteful. Consider the colors. If we used integers, it would take less space than these color names. We could even store an RGB triple in less space! Would that make it harder to read the configuration though?

And so many of these are duplicated! Are we storing three copies of "mode" and two copies of "url"? Do we have to read the entire string to compare two of them?

Or, maybe they're pointers to a shared string to save space. How do we ensure that rewriting one doesn't change the other one? Can we do this faster by comparing pointers? Is that safe or can we end up with two pointers to two copies of the same string?

Now Make It Functional

This all sounds complicated. Why are we using strings in the first place? The answer is... because it's the best data-type that we have available. Using booleans or floats doesn't really have the right semantics. Using just integers makes it not human-readable. Is there a better way?

What if we had a data type that was just for representing "identifiers"? You can find these in Ruby as "symbols". They're just a data type to represent a symbol. So they read like a string, but they don't behave like one. They're something more efficient and smarter under-the-hood. Ruby didn't invent these either, as you can find them in functional languages as... atoms!
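As a quick illustration in Elixir (a minimal sketch; the `color` variable is made up for this example), an atom reads like a string with a leading colon but is its own data type:

```elixir
# An atom is written with a leading colon.
color = :red

IO.inspect(is_atom(color))         # true
IO.inspect(is_binary(color))       # false: it is not a string
IO.inspect(color == :red)          # true: same name, same atom
IO.inspect(Atom.to_string(color))  # "red": the text is still recoverable
```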

Now let's translate this into a more functional form instead of using JSON. We'll rewrite the data structure using the following guidelines:

  • The bits that are legitimately strings, let's leave those as strings.
  • However, let's use atoms everywhere we need a key or an enumeration.
  • Maps are a bit overkill. Why do we want to use a full-on hash-table for something that always has the same keys?
  • Let's just use a list and we'll make each key-value pair just a tuple of an atom and the value.

That works out something like this:

[
  {:power_state, :on},
  {:button_config, [
     [{:mode, :cycle_color}, {:current, :red}, {:colors, [:red, :blue, :green, :white]}],
     [{:mode, :play_sound}, {:url, "https://example.com/sounds/clown_horn.wav"}],
     [{:mode, :webhook}, {:url, "https://example.com/webhook"}, {:data, [{:button, 3}]}]
  ]}
]

Firstly, this is much more efficient. In Elixir, each "atom" is allocated once and assigned an integer value. Suddenly you don't need to worry about pointers or variable-length storage. Comparison is trivial now. You just store the atom number everywhere and it corresponds to that string.

It's also less error-prone. There aren't concerns about dangling pointers or making sure you're doing comparisons right. The only real drawback is that atoms are never garbage-collected, so you don't want to create them from arbitrary input. Then again, pointers to shared strings wouldn't have been easy to collect either.
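Because atoms stick around once they exist, a common precaution is to convert outside text with String.to_existing_atom/1, which refuses to create new atoms. Here is a rough sketch of that idea; the `ColorParser` module and its color list are assumptions made up for this example, not part of any library:

```elixir
defmodule ColorParser do
  # Hypothetical helper: the module name and the color list are
  # assumptions for this sketch.
  @colors [:red, :blue, :green, :white]

  # Turns user-supplied text into one of the known color atoms.
  # String.to_existing_atom/1 raises ArgumentError instead of creating
  # a new atom, so arbitrary input cannot grow the atom table.
  def parse(text) when is_binary(text) do
    atom = String.to_existing_atom(text)
    if atom in @colors, do: {:ok, atom}, else: :error
  rescue
    ArgumentError -> :error
  end
end

IO.inspect(ColorParser.parse("red"))                # {:ok, :red}
IO.inspect(ColorParser.parse("no_such_color_xyz"))  # :error
```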

Okay, but it's kind of ugly. It is much clearer about what's a string and what's a key or enumerated value, though. How can we do this but maybe get some syntactic sugar to help the pill go down?

Give Me Some Sugar

Since this pattern of using atoms as keys is likely to be common, let's shorten it a bit. The braces aren't buying us much and the comma really implies a sequence when we're going more for an association between those bits. Let's drop the braces, keep commas between the individual key-value pairs, and use a colon between the two. So {:current, :red} becomes current: :red.

The above would then look like this:

[
  power_state: :on,
  button_config: [
    [mode: :cycle_color, current: :red, colors: [:red, :blue, :green, :white]],
    [mode: :play_sound, url: "https://example.com/sounds/clown_horn.wav"],
    [mode: :webhook, url: "https://example.com/webhook", data: [button: 3]]
  ]
]

As it turns out, this is exactly what Elixir does. If you cut-and-paste the first functional example above into the iex shell, it gives exactly this output! These are called keyword lists.
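You can check the equivalence yourself. The following is a small sketch (the variable names are just for illustration) showing that the sugared and unsugared forms are the same list of two-element tuples, and that the Keyword module works on either:

```elixir
# Two spellings of the same data: a list of {atom, value} tuples.
sugared = [mode: :cycle_color, current: :red]
plain = [{:mode, :cycle_color}, {:current, :red}]

IO.inspect(sugared == plain)             # true
IO.inspect(Keyword.keyword?(plain))      # true
IO.inspect(Keyword.get(sugared, :mode))  # :cycle_color
IO.inspect(hd(sugared))                  # {:mode, :cycle_color}
```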

Maybe I Was A Bit Hasty

Maybe we were hasty about throwing out maps above, though? Elixir represents small maps pretty well, so maybe it'll be okay to add them back. So that looks something like:

%{
  :power_state => :on,
  :button_config => [
    %{:mode => :cycle_color, :current => :red, :colors => [:red, :blue, :green, :white]},
    %{:mode => :play_sound, :url => "https://example.com/sounds/clown_horn.wav"},
    %{:mode => :webhook, :url => "https://example.com/webhook", :data => %{:button => 3}}
  ]
}

This is alright but there are problems.

Tighten It Up

Firstly, we now have the same problems with maps that we had with strings. Most of these maps aren't really maps, but they're kind of a map that's structured a certain way for a certain use. How do we tell them apart from just normal maps? Take, for example, the data map in the webhook setting. It's there to be formatted into JSON, so it's a generic map rather than something more meaningful.

How about we pick a standard key to use in these maps? Let's pick one that is unlikely to conflict with anything someone might use. And let's use an atom as the value, since the types of structured maps we might need are probably strictly enumerated.

Since we may want to store some information about them somewhere, let's use module names for these atoms (because module names are just atoms written with a certain convention). In fact, let's just define a module to hold this information.
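If the claim that module names are just atoms sounds suspicious, here is a quick check you can run, using the standard String module as the guinea pig:

```elixir
# A module name is an atom whose text starts with "Elixir.".
IO.inspect(is_atom(String))             # true
IO.inspect(String == :"Elixir.String")  # true
IO.inspect(Atom.to_string(String))      # "Elixir.String"
```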

This will be commonly done, so let's define the structure for these maps in a sane way. Let's make a macro that does it. Since we really just need a list of keys, let's provide that as a parameter to that macro.

So the top-level config might look like:

defmodule DeviceConfig do
  defstruct [:power_state, :button_config]
end

Let's say we design things so that every one of these maps always carries all of these keys.

This makes sense, but will that work for our button configs? Well... not exactly. Our button configs don't always use the same keys. We could allow some keys to be missing, but that kind of defeats the purpose of defining a structure, no?

Instead, maybe we can just set them to some default value and use the atom :nil to mark the empty ones. Since we're going to need a value per key, let's just reuse our pretty keyword list syntax. And since it's still a list, let's just use the same macro.

Then again, we always want the mode key, don't we? Let's use the module attribute syntax to denote which keys are required.

And let's jazz it up by making the parentheses for the macro call optional. We have enough context to infer what's going on, right? Let's put that parser to work!

So we might define the button-config like this:

defmodule ButtonConfig do
  @enforce_keys [:mode]
  defstruct mode: :nil, url: :nil, current: :nil, colors: :nil, data: :nil
end
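To see that nothing magical is going on, here is a sketch of the same definition with the sugar removed: explicit parentheses and a plain list of tuples. It then builds structs with struct!/2, which checks @enforce_keys at runtime the way the struct literal syntax does at compile time. The values passed in are only illustrative:

```elixir
defmodule ButtonConfig do
  @enforce_keys [:mode]
  # Same call as `defstruct mode: nil, url: nil, ...`, written with
  # explicit parentheses and tuples.
  defstruct([{:mode, nil}, {:url, nil}, {:current, nil}, {:colors, nil}, {:data, nil}])
end

# Supplying the enforced key works.
ok = struct!(ButtonConfig, mode: :play_sound, url: "https://example.com/sounds/clown_horn.wav")
IO.inspect(ok.mode)  # :play_sound

# Leaving out :mode raises an ArgumentError.
result =
  try do
    struct!(ButtonConfig, url: "https://example.com/webhook")
  rescue
    e in ArgumentError -> {:error, Exception.message(e)}
  end

IO.inspect(elem(result, 0))  # :error
```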

Give Me MOAR SUGAR!

Now that we have our information about how to structure these maps, let's switch those keyword-lists back to maps, tag the maps with the module that describes them, and put in the missing keys so they have a uniform structure:

%{
  :__struct__ => :"Elixir.DeviceConfig",
  :power_state => :on,
  :button_config => [
    %{:__struct__ => :"Elixir.ButtonConfig", :mode => :cycle_color,
      :url => :nil, :current => :red, :colors => [:red, :blue, :green, :white], :data => :nil},
    %{:__struct__ => :"Elixir.ButtonConfig", :mode => :play_sound,
      :url => "https://example.com/sounds/clown_horn.wav", :current => :nil, :colors => :nil, :data => :nil},
    %{:__struct__ => :"Elixir.ButtonConfig", :mode => :webhook,
      :url => "https://example.com/webhook", :current => :nil, :colors => :nil, :data => %{:button => 3}}
  ]
}

Well... that's ugly as sin again, isn't it? Let's add some more sugar.

Let's use the keyword-list syntax but put it inside of the map. That won't work for keys that aren't atoms, but we don't really want to use those to structure these maps.
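A small sketch of that rule in action (the "label" key is invented for this example): with atom keys the two spellings produce the same map, while any other kind of key still needs the arrow.

```elixir
# With atom keys, the keyword-style form is the same map as the arrow form.
IO.inspect(%{button: 3} == %{:button => 3})  # true

# Non-atom keys still need the arrow; when mixing, arrow pairs come first.
mixed = %{"label" => "horn", button: 3}
IO.inspect(mixed["label"])  # "horn"
IO.inspect(mixed.button)    # 3
```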

Also, missing values are going to come up pretty often, so let's make nil a keyword that represents that same :nil atom and save a colon. We'll also take the module name atoms and represent them by stripping off the Elixir. prefix and getting rid of the quotes.

Finally, tagging these maps this way seems a bit verbose. Let's just put the module name atoms somewhere. We aren't using the space between the % and the { in the maps for anything, so let's put it there.

That works out to this:

%DeviceConfig{
  power_state: :on,
  button_config: [
    %ButtonConfig{
      mode: :cycle_color,
      url: nil,
      current: :red,
      colors: [:red, :blue, :green, :white],
      data: nil
    },
    %ButtonConfig{
      mode: :play_sound,
      url: "https://example.com/sounds/clown_horn.wav",
      current: nil,
      colors: nil,
      data: nil
    },
    %ButtonConfig{
      mode: :webhook,
      url: "https://example.com/webhook",
      current: nil,
      colors: nil,
      data: %{button: 3}
    }
  ]
}

Maybe we can make it a bit less verbose by just removing the nil values, so the final version will be:

%DeviceConfig{
  power_state: :on,
  button_config: [
    %ButtonConfig{mode: :cycle_color, current: :red, colors: [:red, :blue, :green, :white]},
    %ButtonConfig{mode: :play_sound, url: "https://example.com/sounds/clown_horn.wav"},
    %ButtonConfig{mode: :webhook, url: "https://example.com/webhook", data: %{button: 3}}
  ]
}

It's not perfect. You can't specify more complicated constraints on these maps. Still, you get the basic structure of them down, it works as a general-purpose starting point, and it looks pretty.

As it turns out, the above is valid Elixir syntax and behaves exactly as we described. These are the structs that we all know and love.
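If you want to peek under the sugar yourself, here is a minimal sketch. It redefines ButtonConfig so that it stands alone, and it builds the struct with struct!/2, which does the same job as writing %ButtonConfig{mode: :cycle_color, current: :red}:

```elixir
defmodule ButtonConfig do
  @enforce_keys [:mode]
  defstruct mode: nil, url: nil, current: nil, colors: nil, data: nil
end

button = struct!(ButtonConfig, mode: :cycle_color, current: :red)

# A struct is a map...
IO.inspect(is_map(button))  # true

# ...tagged with its module name under the :__struct__ key...
IO.inspect(button.__struct__ == :"Elixir.ButtonConfig")  # true

# ...so it equals the plain map spelled out by hand.
by_hand = %{
  :__struct__ => ButtonConfig,
  :mode => :cycle_color,
  :url => nil,
  :current => :red,
  :colors => nil,
  :data => nil
}

IO.inspect(button == by_hand)  # true
```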

Tie It All Together

So there you have it. Elixir, at the core, gives you very powerful, fundamental data structures. It also gives you the syntactic tools that keep your eyes from bleeding. That's the underlying principle that I alluded to above: "Elixir has just enough data-types to represent what you need, conventions that help keep the clarity, and the syntactic sugar to make it all pretty."

Keyword lists and structs aren't the only special cases. There are a few others I didn't mention (see if you can guess them). Eventually you'll learn to look through the pretty shortcuts to see the real data you're creating underneath. And when you do, most of these questions just evaporate.

If you're curious, many special forms of Elixir's syntax are documented in Kernel.SpecialForms. You can find documentation on structs there. You can also look at the Keyword docs for more details on keyword lists.

Also, even though the above device sounds kind of primitive, you absolutely can build this kind of thing with Elixir. Check out The Nerves Project for the slickest embedded development experience around.
