An experiment on visualizing linguistic annotations of a (small) corpus.
The example uses a nonstandard, very simple JSON format that I coded by hand (please forgive the errors I surely made from a linguistic standpoint).
This visualization focuses on three different aspects of the analysis: sentence splitting (a gray ■ introduces a new sentence), tokenization and lemmatization (each token has an underline and its lemma written under it) and part-of-speech tagging (the color of the underline and the lemma indicates whether the term is a noun, a verb, etc.).
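To make the idea concrete, here is a minimal sketch of how annotations like these could be turned into markup. It is not the code behind this example: the `Token` and `Sentence` shapes, the `after` field, the POS names and the CSS class names are all assumptions made up for illustration, since the actual JSON format is not shown here.

```typescript
// Hypothetical annotation shapes: NOT the actual JSON format used by the example.
type Pos = "noun" | "verb" | "adj" | "other";

interface Token {
  text: string;  // surface form as it appears in the original text
  lemma: string; // lemma written under the token
  pos: Pos;      // part-of-speech tag, reused as a CSS class for the color
  after: string; // whatever follows the token (spaces, punctuation, line breaks), kept verbatim
}

interface Sentence {
  tokens: Token[];
}

// Escape the characters that are special in HTML.
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// One token becomes a span holding the text and its lemma,
// followed by the untouched trailing characters.
function renderToken(t: Token): string {
  return (
    `<span class="token ${t.pos}">` +
    `<span class="text">${escapeHtml(t.text)}</span>` +
    `<span class="lemma">${escapeHtml(t.lemma)}</span>` +
    `</span>` +
    escapeHtml(t.after)
  );
}

// Each sentence is introduced by a ■ marker (styled gray via the assumed class).
function renderCorpus(sentences: Sentence[]): string {
  return sentences
    .map(s => `<span class="sentence-mark">■</span>` + s.tokens.map(renderToken).join(""))
    .join("");
}

// Tiny usage example with made-up data.
const example: Sentence[] = [
  {
    tokens: [
      { text: "Dogs", lemma: "dog", pos: "noun", after: " " },
      { text: "bark", lemma: "bark", pos: "verb", after: ".\n" },
    ],
  },
];
const html: string = renderCorpus(example);
```

In a sketch like this, the container would need something like `white-space: pre-wrap` for the preserved spaces and line breaks to show up, and the lemma would still have to be positioned under the token with CSS.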
The original text's spacing, punctuation and line breaks are preserved, as can be seen in the last two lines.
Various CSS hacks (line heights, relative positioning and the like) are used to create this layout, so features such as text selection are broken.