@streamich
Last active July 28, 2024 12:37
JSON Expression Grammar

An improvised draft of JSON Expression grammar in ABNF:

JsonExpression = "[" Operator Operands "]"        ; JSON-expression is itself a valid JSON array

Operator = JsonLiteral                            ; Operator is any JSON value, but usually a string

Operands = 1*("," Operand)                        ; Non-nullary expression has at least one operand

Operand = JsonValue / JsonExpression              ; Operands are values or sub-expressions

JsonValue = NullaryExpression / JsonLiteral       ; Array value or any other non-array JSON

NullaryExpression = "[" JsonLiteral "]"           ; Nullary operator allows array (and any JSON) values

JsonLiteral = <https://www.json.org/json-en.html> ; All rules terminate with valid JSON

A program starts with the Operand rule, which allows any JSON value or expression. If at least one JSON Expression is explicitly required, the program starts with the JsonExpression rule instead.

Everything substitutes down to JsonLiteral (plus a recursive reference to JsonExpression):

JsonExpression = "[" Operator Operands "]"
JsonExpression = "[" JsonLiteral Operands "]"
JsonExpression = "[" JsonLiteral 1*("," Operand) "]"
JsonExpression = "[" JsonLiteral 1*("," (JsonValue / JsonExpression)) "]"
JsonExpression = "[" JsonLiteral 1*("," (NullaryExpression / JsonLiteral / JsonExpression)) "]"
JsonExpression = "[" JsonLiteral 1*("," (("[" JsonLiteral "]") / JsonLiteral / JsonExpression)) "]"

or

Operand = JsonValue / JsonExpression
Operand = NullaryExpression / JsonLiteral / JsonExpression
Operand = ("[" JsonLiteral "]") / JsonLiteral / JsonExpression
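
The rules above can be read as a small classifier over already-parsed JSON. The following is a minimal sketch, not a reference implementation. It assumes the reading suggested by the rule comments: a non-array value is a JsonLiteral, a one-element array is a NullaryExpression, and an array of two or more elements is a JsonExpression. The function name `classify` and the rejection of a bare empty array are assumptions made for illustration.

```python
import json


def classify(node):
    """Return the name of the grammar rule that a parsed JSON value matches.

    Assumption: "JsonLiteral" is used for non-array JSON only, so that
    arrays are always either nullary expressions or expressions.
    """
    if not isinstance(node, list):
        return "JsonLiteral"
    if len(node) == 0:
        # Assumption: a bare [] matches neither array rule; write [[]] instead.
        raise ValueError("empty array must be wrapped as [[]]")
    if len(node) == 1:
        # "[" JsonLiteral "]": the wrapped value is taken as-is, not validated.
        return "NullaryExpression"
    # "[" Operator 1*("," Operand) "]": the operator may be any JSON value,
    # so only the operands are checked recursively.
    for operand in node[1:]:
        classify(operand)
    return "JsonExpression"


if __name__ == "__main__":
    print(classify(json.loads('["len", [[1, 2.2, "3"]]]')))
    print(classify(json.loads('[[1, 2.2, "3"]]')))
    print(classify(json.loads('"foo"')))
```

Under this reading, `["len", [[1, 2.2, "3"]]]` is an expression whose single operand is a nullary expression wrapping an array value.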

Debug format

Suppose we want to create a human-readable "debug" format for JSON Expressions. It follows a LISP-like bracketed s-expression syntax.

For example, a JSON expression like

["len",                  // Get lengths
  [".", "foo", "bar"]]   // Concatenate strings

would become

(len                     ; Get lengths
  (. "foo" "bar"))       ; Concatenate strings

Similarly, getting the length of an array literal

["len", [[1, 2.2, "3"]]]

would become

(len [1, 2.2, "3"])

JSON operators (quoted strings) are still valid:

("len" [1, 2.2, "3"])

The ABNF grammar for the debug format could be:

JsonExpression = "(" Operator 1*Operand ")"       ; At least one operand is still enforced

Operator = Identifier / JsonLiteral               ; Still support JSON value as operator for completeness

Identifier = [a-z] *[a-z0-9_.-]                   ; Allow string without quotes

Operand = JsonLiteral / JsonExpression            ; Operand is simplified, as a nullary expression is not needed

JsonLiteral = <https://www.json.org/json-en.html> ; All rules terminate with valid JSON

The syntax above introduces round bracket ( ) notation into the JSON grammar. The first value inside the brackets is the operator, possibly an unquoted string. The remaining one or more values are operands, which may themselves be nested JsonExpressions.

After substitutions:

JsonExpression = "(" (Identifier / JsonLiteral) 1*Operand ")"
JsonExpression = "(" (([a-z] *[a-z0-9_.-]) / JsonLiteral) 1*Operand ")"
JsonExpression = "(" (([a-z] *[a-z0-9_.-]) / JsonLiteral) 1*(JsonLiteral / JsonExpression) ")"

and

Operand = JsonLiteral / JsonExpression
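
To make the mapping between the two notations concrete, here is a hedged sketch of a converter from the JSON form to the debug form. The name `to_debug` is hypothetical, and the regular expression is a direct transcription of the Identifier rule above. One consequence of that rule as drafted: an operator such as `"."` does not start with a letter, so this sketch prints it in its quoted JSON form.

```python
import json
import re

# Transcription of the draft rule: Identifier = [a-z] *[a-z0-9_.-]
IDENTIFIER = re.compile(r"[a-z][a-z0-9_.-]*")


def to_debug(node):
    """Render a parsed JSON expression in the bracketed debug notation."""
    if not isinstance(node, list):
        return json.dumps(node)  # plain (non-array) JSON literal
    if len(node) == 0:
        # Assumption: a bare [] is not a valid operand; it must be [[]].
        raise ValueError("empty array must be wrapped as [[]]")
    if len(node) == 1:
        # Nullary expression: the wrapper is dropped in the debug format.
        return json.dumps(node[0])
    operator, *operands = node
    if isinstance(operator, str) and IDENTIFIER.fullmatch(operator):
        head = operator  # unquoted identifier
    else:
        head = json.dumps(operator)  # any other JSON value stays as JSON
    parts = [head] + [to_debug(operand) for operand in operands]
    return "(" + " ".join(parts) + ")"


if __name__ == "__main__":
    print(to_debug(["len", [[1, 2.2, "3"]]]))
    print(to_debug(["len", [".", "foo", "bar"]]))
```

Operands are separated by single spaces here; the grammar leaves whitespace handling unspecified, so that choice is also an assumption.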

Comments

In the debug format it could be useful to let developers write inline comments. Hence, the debug notation could allow single-line comments, which start with a semicolon ; and run until the end of the line. The comment syntax is not included in the grammar, as it is assumed that the lexer strips comments out.

  • Should a double slash // also be allowed to start a line comment?
  • Should the language support multi-line comments?
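
The comment stripping assumed of the lexer could look roughly like the sketch below. It handles only the semicolon form described above and leaves the two open questions unanswered. The function name `strip_comments` is hypothetical. The one detail it has to get right is that a semicolon inside a JSON string is not a comment start.

```python
def strip_comments(source):
    """Remove ';' line comments from debug-format text.

    A semicolon inside a double-quoted JSON string is kept. JSON strings
    cannot contain raw newlines, so each line is scanned independently.
    """
    cleaned = []
    for line in source.splitlines():
        in_string = False
        escaped = False
        cut = len(line)
        for index, char in enumerate(line):
            if in_string:
                if escaped:
                    escaped = False      # character after a backslash
                elif char == "\\":
                    escaped = True
                elif char == '"':
                    in_string = False    # closing quote
            elif char == '"':
                in_string = True         # opening quote
            elif char == ";":
                cut = index              # comment runs to end of line
                break
        cleaned.append(line[:cut].rstrip())
    return "\n".join(cleaned)


if __name__ == "__main__":
    text = '(len               ; Get lengths\n  (cat "a;b" "c"))  ; Concatenate'
    print(strip_comments(text))
```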

Invalid syntax

It follows from the grammar that empty expressions and nullary operators (operators without operands) are invalid:

( )           ; empty expression
(operator )   ; nullary operator
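
As an illustration of how these two cases get rejected, below is a minimal recursive-descent sketch that reads the debug format back into the JSON form. It is written under several assumptions that the grammar does not state: whitespace separates tokens, comments have already been stripped, an identifier is tried before a JSON literal in operator position, and array literals are re-wrapped as nullary expressions. All function names are hypothetical. JSON literals are read with the standard library's `json.JSONDecoder.raw_decode`.

```python
import json
import re

IDENTIFIER = re.compile(r"[a-z][a-z0-9_.-]*")  # draft Identifier rule
WHITESPACE = re.compile(r"\s*")
DECODER = json.JSONDecoder()


def skip_ws(text, pos):
    return WHITESPACE.match(text, pos).end()


def parse_literal(text, pos):
    try:
        return DECODER.raw_decode(text, pos)  # -> (value, end position)
    except json.JSONDecodeError as err:
        raise ValueError(f"expected JSON literal at offset {pos}") from err


def parse_operand(text, pos):
    pos = skip_ws(text, pos)
    if text.startswith("(", pos):
        return parse_expression(text, pos)
    value, pos = parse_literal(text, pos)
    # Array literals are wrapped so the JSON form does not read them
    # as expressions: [1, 2] becomes the nullary expression [[1, 2]].
    return ([value] if isinstance(value, list) else value), pos


def parse_expression(text, pos):
    pos = skip_ws(text, pos + 1)  # step over "("
    if text.startswith(")", pos):
        raise ValueError("empty expression")
    match = IDENTIFIER.match(text, pos)
    if match:
        operator, pos = match.group(), match.end()
    else:
        operator, pos = parse_literal(text, pos)
    operands = []
    while True:
        pos = skip_ws(text, pos)
        if pos >= len(text):
            raise ValueError("unterminated expression")
        if text.startswith(")", pos):
            break
        operand, pos = parse_operand(text, pos)
        operands.append(operand)
    if not operands:
        raise ValueError("nullary operator: at least one operand is required")
    return [operator, *operands], pos + 1  # step over ")"


def parse(text):
    value, pos = parse_operand(text, 0)
    if skip_ws(text, pos) != len(text):
        raise ValueError("unexpected trailing input")
    return value


if __name__ == "__main__":
    print(parse('(len [1, 2.2, "3"])'))
    for bad in ("( )", "(operator )"):
        try:
            parse(bad)
        except ValueError as err:
            print(repr(bad), "->", err)
```

With this sketch both `( )` and `(operator )` raise `ValueError`, while `(len [1, 2.2, "3"])` parses back to `["len", [[1, 2.2, "3"]]]`.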