UMAP maps of internal OpenCLIP ViT-H/14 text representations. Colors are derived from k-means clustering on the unprocessed (1024-dimensional) CLIP feature tensors.
This variant of CLIP is used in Stable Diffusion version 2.x.
"in the style of <artist>":
The wordlists are unchanged from the previous (L/14) maps, so the noun/verb/adjective wordlists remain imperfect and contain some incorrect words. This is especially true for the 'adverbs' list.
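
As a rough illustration of how the cluster colors described at the top can be obtained, here is a minimal sketch of k-means run directly on the full 1024-dimensional features rather than on the 2D map coordinates. Everything in it is an assumption made for illustration: the random `features` array stands in for real text embeddings, the cluster count `k=8` is arbitrary, and the hand-written Lloyd's loop is a stand-in for whatever k-means implementation was actually used for these maps.

```python
import numpy as np


def kmeans_labels(features, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per row of `features`."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct, randomly chosen rows.
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every row to every centroid: shape (n, k).
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break  # assignments have stopped changing
        centroids = new_centroids
    return labels


# Hypothetical stand-in for real CLIP text features: 300 vectors of width 1024.
features = np.random.default_rng(0).normal(size=(300, 1024)).astype(np.float32)

# Cluster in the original 1024-dimensional space; k=8 is an arbitrary choice.
labels = kmeans_labels(features, k=8)
```

Each entry of `labels` can then be mapped to a color and used to paint the corresponding point in the 2D layout. The layout itself would come from a separate UMAP step (for example `umap.UMAP(n_components=2).fit_transform(features)` from the umap-learn package), so the colors reflect structure in the high-dimensional space and need not coincide with the visual groupings on the map.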