UMAP maps of internal OpenCLIP ViT-H/14 text representations. Colors are derived from k-means clustering on the unprocessed (1024-dimensional) CLIP feature tensors.
This variant of CLIP is used in Stable Diffusion 2.x (https://huggingface.co/laion/CLIP-ViT-H-14-laion2B-s32B-b79K).
"in the style of <artist>": https://gist.githack.com/Pyr-000/90503e58688a7827278ea801096da933/raw/cd7f59e3d7d25800329ef0213d352e7e8ca8e8c9/OpenCLIP_ViT-H14_rev3_artists_st_map_cluster.html
Wordlists are unchanged from the previous (L/14) maps, so the noun/verb/adjective wordlists remain imperfect and contain some incorrect words. This is especially true for the 'adverbs' list.
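
As a rough illustration of the coloring step described above (k-means on the raw 1024-dimensional features, with each cluster label turned into a point color), here is a minimal pure-Python sketch. It is not the code used to produce these maps. The synthetic vectors stand in for real CLIP text embeddings, the UMAP projection is omitted entirely, and the names `kmeans`, `cluster_colors`, and `PALETTE` are hypothetical, chosen only for this example.

```python
import random

# Hypothetical palette; the real maps may use any color scheme.
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"]


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means. Returns one cluster label per input point."""
    rng = random.Random(seed)
    # Initialize centers from k distinct input points.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center (squared L2).
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            for p in points
        ]
        # Update step: move each non-empty center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_colors(features, k):
    """Cluster the raw feature vectors and map each label to a color."""
    labels = kmeans(features, k)
    return [PALETTE[label % len(PALETTE)] for label in labels]


# Synthetic stand-in for CLIP text features: two well-separated groups
# of 1024-dimensional vectors (assumption: real features come from the model).
noise = random.Random(1)
features = [
    [base + noise.uniform(-0.1, 0.1) for _ in range(1024)]
    for base in (0.0, 0.0, 0.0, 5.0, 5.0, 5.0)
]
colors = cluster_colors(features, k=2)
```

Because the labels are computed from the full-dimensional vectors, the resulting colors are independent of whatever 2D layout is later used to plot the points.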