TLDR: Most programming languages are designed syntax-first, but this approach leads to semantic compromises and resistance to evolution. Following Lisp's example, we should start with a formal semantic model using structured data formats and let syntax emerge as views into that model. This semantic-first approach enables better reasoning about programs and more flexible language evolution.
Most programming languages start with syntax design - carefully crafting keywords, operators, and grammar rules that will form the surface-level interface developers interact with. But we have strong historical evidence that this might be backwards.
Consider Lisp, one of our most influential programming languages. Over 60 years ago, John McCarthy developed Lisp by first defining a mathematical model of computation based on lambda calculus and recursive functions. The famous S-expression syntax came later as a practical notation for representing these abstract concepts. This foundation in mathematical semantics is why Lisp's core ideas have remained powerful and influential despite decades of language evolution. The syntax was a view into the semantics, not the other way around.
Yet today, most language designers start with syntax. Why do we continue this approach when we have such a compelling counter-example? What if this common approach is fundamentally backwards?
Consider how we validate program correctness. When proving properties about a program, we don't reason about the pretty syntax - we work with the underlying semantic model. The syntax is just sugar coating on top of the actual meaning. Yet we spend enormous effort designing this coating before we've fully understood what we're trying to express.
This syntax-first approach leads to several problems:
First, it often results in semantic compromises to maintain syntactic elegance. We end up with language features that look nice but have subtle edge cases or unexpected behaviors because the semantics had to be bent to fit the chosen syntax.
Second, it creates resistance to semantic changes. Once developers are used to certain syntax patterns, evolving the underlying semantics becomes much harder because we feel constrained by backwards compatibility with existing syntactic expectations.
Third, it puts the focus on the wrong layer of abstraction during the critical early design phase. Instead of deeply exploring what we want programs to mean and what properties we want to prove about them, we get caught up in debates about keywords and operator precedence.
What if we followed Lisp's example and flipped this around? What if we started by defining the semantic model - the actual meaning we want to capture - and only later designed syntax as different views into that semantic space?
This semantic-first approach would let us:
- Focus on core meaning before surface representation
- Explore multiple syntactic views of the same semantic model
- Evolve syntax more freely since it's just a view layer
- Ground our design in provable properties from the start
The tooling exists for this approach. We can leverage structured data formats to precisely specify semantic models while deferring syntax decisions:
- General purpose formats like JSON+Schema, YAML, or XML
- Language-specific data structures in Python, Haskell, OCaml, or other implementation languages
- Formal specification languages and proof assistants
Modern development environments can handle multiple syntactic views of the same underlying program, regardless of the chosen format.
In compiler design particularly, this semantic-first approach makes even more sense. A compiler's job is to understand program meaning and transform it while preserving semantics. Starting with a clear semantic model lets us focus on these core transformations before getting caught up in parsing details.
Some might argue that syntax is critical for adoption and developer experience. This is true! But good syntax should emerge from a clear semantic model, not constrain it. By starting with semantics, we can design syntax that naturally maps to the underlying meaning rather than fighting against it.
The conventional wisdom of syntax-first design may be backward. Perhaps it's time to follow Lisp's example - start with meaning, prove our properties, and let elegant syntax emerge naturally from a solid semantic foundation.
What do you think? Have you worked on language design projects that started with semantics rather than syntax? How did it affect the final result? Let's discuss different approaches to language design and what we might learn from them.
#LanguageDesign #Compiler #CompilerTheory #CompilerDesign #SemanticProgramming