The following are the proposed Markdown and HTML specs for unordered lists (UL) and ordered lists (OL).
Here’s the proposed spec for unordered lists:
- Item
	- Item
	- Item
- Item
	- Item
		- Item
		- Item
	- Item
- Item
And for ordered lists:
1. Item
	1. Item
	2. Item
2. Item
	1. Item
		1. Item
		2. Item
	2. Item
3. Item
Where the regex search pattern is \t*(?:-|\d+(?:\.|\)))[ \u{00a0}]?:
\t* // Zero or more tabs
(?: // Start of non-capturing group
- // Hyphen
| // OR
\d+ // One or more digits
(?: // Start of non-capturing group
\. // Period
| // OR
\) // Closing parenthesis
) // END
) // END
[ \u{00a0}]? // Optional space or non-breaking space
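As a rough illustration, the pattern can be tried out in Python. This is only a sketch under a few assumptions: the helper name `split_marker` is hypothetical, the pattern is assumed to be matched from the start of each line, and Python's `re` module writes the non-breaking space as `\u00a0` instead of `\u{00a0}`.

```python
import re

# Verbose form of the pattern from the text. Python spells the
# non-breaking space as \u00a0 rather than \u{00a0}.
LIST_MARKER = re.compile(
    r"""
    \t*              # zero or more tabs (nesting depth)
    (?:              # non-capturing group: the marker itself
        -            #   a hyphen (unordered item)
        |            #   or
        \d+          #   one or more digits (ordered item)
        (?:\.|\))    #   followed by a period or closing parenthesis
    )
    [ \u00a0]?       # optional space or non-breaking space
    """,
    re.VERBOSE,
)


def split_marker(line):
    """Return (depth, marker, text) for a list line, or None otherwise.

    Hypothetical helper: matching from the start of the line is an
    assumption, since the text does not say how the pattern is anchored.
    """
    m = LIST_MARKER.match(line)
    if m is None:
        return None
    prefix = m.group(0)
    depth = len(prefix) - len(prefix.lstrip("\t"))  # count leading tabs
    marker = prefix.strip("\t \u00a0")              # e.g. "-", "1.", "2)"
    return depth, marker, line[m.end():]
```

For example, `split_marker("\t\t2) Item")` yields a depth of 2 and the marker `2)`, while a line without a marker yields `None`.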
Note that 1. or 2., etc. can be used interchangeably.
It should also be noted that Medium and Notion both have naive lexers: Medium doesn’t allow nesting, recursive or otherwise, for sequential (ordered) lists, and while Notion does, both use naive number counters.
Also, Medium left-aligns the counter whereas Notion does not.
It seems that the optimal balance would be left-aligned counters and recursive lexing that is agnostic to the number of tabs and to the number used.
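One possible reading of "agnostic to the number used" is that the rendered counter depends only on an item's position and nesting depth, never on the digits typed in the source. The sketch below follows that reading; the function name `renumber` and the idea of feeding it a list of depths are assumptions made for illustration, not part of the proposal.

```python
def renumber(depths):
    """Assign sequential counters per nesting level, ignoring source numbers.

    `depths` holds the tab depth of each ordered item, in document order.
    """
    counters = []  # counters[d] is the last number issued at depth d
    out = []
    for depth in depths:
        # Returning to a shallower level discards deeper counters,
        # so a later sublist restarts at 1.
        del counters[depth + 1:]
        # Entering a deeper level (possibly skipping levels) starts
        # fresh counters at zero.
        while len(counters) <= depth:
            counters.append(0)
        counters[depth] += 1
        out.append(counters[depth])
    return out
```

Applied to the depths of the ordered example above, `renumber([0, 1, 1, 0, 1, 2, 2, 1, 0])` returns `[1, 1, 2, 2, 1, 1, 2, 2, 3]`, which matches the numbers shown there regardless of what the author actually typed.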
The HTML output, for the above example, should look something like this:
<ul>
	<li>Item</li>
	<ul>
		<li>Item</li>
		<li>Item</li>
	</ul>
	<li>Item</li>
	<ul>
		<li>Item</li>
		<ul>
			<li>Item</li>
			<li>Item</li>
		</ul>
		<li>Item</li>
	</ul>
	<li>Item</li>
</ul>
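The mapping from tab depth to nested tags can be sketched as follows. This is a minimal illustration, not the proposed implementation: the function name `render_ul` is hypothetical, only hyphen items are handled, and the output is left unindented, so only the tag sequence is meant to match the HTML above.

```python
import html


def render_ul(markdown):
    """Render tab-indented hyphen items as nested <ul> markup."""
    out = []
    open_lists = 0  # number of <ul> elements currently open
    for line in markdown.splitlines():
        stripped = line.lstrip("\t")
        if not stripped.startswith("-"):
            continue  # assumption: non-list lines are simply skipped
        depth = len(line) - len(stripped)       # leading tabs
        text = stripped[1:].lstrip(" \u00a0")   # drop hyphen and spacing
        # Open lists until deep enough, or close them until shallow enough.
        while open_lists < depth + 1:
            out.append("<ul>")
            open_lists += 1
        while open_lists > depth + 1:
            out.append("</ul>")
            open_lists -= 1
        out.append(f"<li>{html.escape(text)}</li>")
    out.extend(["</ul>"] * open_lists)  # close whatever is still open
    return "\n".join(out)
```

Because the loops open or close as many lists as the depth difference requires, the same routine also copes with items that skip a level.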
It should be noted that there are two edge cases to be mindful of:
Markdown:
- Item
		- Item
HTML:
<ul>
	<li>Item</li>
	<ul>
		<ul>
			<li>Item</li>
		</ul>
	</ul>
</ul>
And the opposite:
Markdown:
		- Item
- Item
HTML:
<ul>
	<ul>
		<ul>
			<li>Item</li>
		</ul>
	</ul>
	<li>Item</li>
</ul>
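Both edge cases come down to the same rule: the number of lists to open or close between two items equals the difference in their depths. The sketch below isolates that rule. The names `tags_between` and `render_depths` are hypothetical, and the depth of -1 for "outside any list" is a convention chosen here for illustration.

```python
def tags_between(prev_depth, next_depth):
    """Tags to emit when moving from one item depth to the next.

    A depth of -1 stands for "outside any list" (illustrative convention).
    """
    step = next_depth - prev_depth
    if step > 0:
        return ["<ul>"] * step    # going deeper: open one list per level
    return ["</ul>"] * -step      # going shallower (or staying): close lists


def render_depths(depths, text="Item"):
    """Emit the tag sequence for items at the given depths."""
    out = []
    prev = -1
    for depth in depths:
        out += tags_between(prev, depth)
        out.append(f"<li>{text}</li>")
        prev = depth
    out += tags_between(prev, -1)  # close every list still open
    return out
```

With this rule, `render_depths([0, 2])` and `render_depths([2, 0])` produce the tag sequences of the first and second edge case respectively, without any special handling for skipped levels.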
Both of these syntaxes should be valid, given that Codex’s flavor of CommonMark is designed to be as intuitive and flexible as possible. One example of this flexibility is that underscores behave the same as asterisks and are not interpreted as plain text when not space-delimited.