- act2vec, trace2vec, log2vec, model2vec https://link.springer.com/chapter/10.1007/978-3-319-98648-7_18
- apk2vec https://arxiv.org/abs/1809.05693
- app2vec http://paul.rutgers.edu/~qma/research/ma_app2vec.pdf
- ast2vec https://arxiv.org/abs/2103.11614
- attribute2vec https://arxiv.org/abs/2004.01375
- author2vec http://dl.acm.org/citation.cfm?id=2889382
- baller2vec https://arxiv.org/abs/2102.03291
- bb2vec https://arxiv.org/abs/1809.09621
Here are the areas I've been researching, some things I've read, and some open-source packages...
Nearly all text processing starts by transforming text into vectors: http://en.wikipedia.org/wiki/Vector_space_model
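The simplest vector space model is a bag-of-words document-term matrix. A minimal sketch, assuming scikit-learn is available (the toy documents are made up):

```python
# Bag-of-words vector space model: each document becomes a vector of
# term counts, one column per vocabulary word.
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
]
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)  # sparse document-term count matrix

print(vectorizer.get_feature_names_out())  # one column per vocabulary term
print(X.toarray())  # each row is one document's term-count vector
```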
These pipelines often apply transforms such as TF-IDF to normalise the data and control for outliers (words that are too frequent or too rare confuse the algorithms): http://en.wikipedia.org/wiki/Tf%E2%80%93idf
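A sketch of the same idea with TF-IDF weighting, again via scikit-learn; the max_df threshold below is illustrative, not a recommendation:

```python
# TF-IDF weighting: max_df drops terms appearing in more than 90% of
# documents ("the" below); the complementary min_df knob drops terms
# that are too rare to be informative.
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cats and dogs are pets",
]
vectorizer = TfidfVectorizer(max_df=0.9)
X = vectorizer.fit_transform(docs)

print(vectorizer.get_feature_names_out())  # "the" has been filtered out
print(X.toarray().round(2))  # rows are TF-IDF-weighted document vectors
```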
Collocation detection finds two or more words that occur together more often than chance would predict (e.g. "wishy-washy" in English). I use it to group words into n-gram tokens, because many NLP techniques treat each word as independent of all the others in a document, ignoring order: http://matpalm.com/blog/2011/10/22/collocations_1/
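As a rough illustration with gensim's Phrases (my choice of tool here, not the linked post's), pairs that co-occur unusually often get merged into single underscore-joined tokens; the corpus and thresholds are invented:

```python
# Collocation detection: learn which word pairs co-occur more often
# than their individual frequencies would suggest, then rewrite them
# as single n-gram tokens.
from gensim.models.phrases import Phrases

sentences = [
    "new york is a big city".split(),
    "i visited new york last year".split(),
    "new york has many museums".split(),
    "the city is busy".split(),
]
bigrams = Phrases(sentences, min_count=2, threshold=1.0)

# "new york" co-occurs often enough to score above the threshold,
# so it becomes one token; other pairs are left alone.
print(bigrams["i love new york".split()])
# expected: ['i', 'love', 'new_york']
```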