First, the markdown goes through the tokenizer.
The tokenizer first splits the text into paragraphs, which gives this kind of structure:
(("text of paragraph 1") ("text of paragraph 2"))
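The paragraph split can be sketched in a few lines of Python. This is only an illustration, not the tokenizer's actual code: the function name `split_paragraphs` is hypothetical, and it assumes that paragraphs are separated by one or more blank lines. Nested Python lists stand in for the nested structure shown above.

```python
import re


def split_paragraphs(markdown: str) -> list[list[str]]:
    """Split raw markdown into paragraphs, one single-element list per paragraph."""
    # Assumption: a paragraph boundary is one or more blank
    # (empty or whitespace-only) lines.
    blocks = re.split(r"\n\s*\n", markdown.strip())
    # Wrap each paragraph in its own list so that a later pass can
    # replace the single string with finer-grained pieces.
    return [[block] for block in blocks if block]


tokens = split_paragraphs("text of paragraph 1\n\ntext of paragraph 2")
print(tokens)  # [['text of paragraph 1'], ['text of paragraph 2']]
```

Each paragraph gets its own inner list so that the next step can split the paragraph text further without changing the outer shape.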
Then the tokenizer finds the special characters so that it can split each paragraph into words. This gives the following kind of structure: