@bmschmidt
Created April 14, 2016 21:21

An idea that I proved unable to express in the number of characters on Twitter:

Train two word2vec models on the same corpus with 100 dimensions apiece; one with window size 5, and one with window size 15 (say).

Now you have two 100-dimensional vector spaces with the same words in each.

That's the same as a single 200-dimensional vector space: you just append each word's two vectors to each other.

That vector space has all the information from each of the original models in it: you can use simple linear algebra to flatten it back out along either of the original 100-dimensional subspaces.
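The append-and-flatten step is elementary; a numpy sketch, using random arrays as hypothetical stand-ins for the two trained embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["cat", "dog", "mat"]
# Hypothetical stand-ins for the two trained 100-dimensional embeddings.
v_narrow = {w: rng.standard_normal(100) for w in vocab}
v_wide = {w: rng.standard_normal(100) for w in vocab}

# Append each word's pair of vectors: one 200-dimensional space.
v_joint = {w: np.concatenate([v_narrow[w], v_wide[w]]) for w in vocab}

# Flattening back out along either original subspace is just a
# projection, i.e. slicing off the relevant 100 coordinates.
assert np.allclose(v_joint["cat"][:100], v_narrow["cat"])
assert np.allclose(v_joint["cat"][100:], v_wide["cat"])
```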

But now you also have the ability to define lines between the two spaces. Some words will be close in the smaller window ("syntactically", roughly) and some close in the larger one ("semantically"). The size of that difference--which should be extractable, somehow--might make it possible to estimate what the relative distances or positions of vectors in a third window size would be; even a window size of 25 (larger than anything we modeled) or 2 (smaller than anything we modeled).
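One crude way to extract that difference: compare the cosine similarity of a word pair in each subspace and look at the gap. A sketch, again with random stand-ins for the trained vectors:

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(1)
# Hypothetical stand-ins for one word pair's vectors in each model.
cat_narrow, dog_narrow = rng.standard_normal((2, 100))
cat_wide, dog_wide = rng.standard_normal((2, 100))

# Pairs close in the window-5 space but not the window-15 space (or
# vice versa) get a large gap; that gap is the quantity to extract.
gap = cosine(cat_narrow, dog_narrow) - cosine(cat_wide, dog_wide)
print(round(gap, 3))
```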

Maybe it could even be trained in such a way that this estimation had a linear regularity to it: so that by starting from a larger dimensionality, you could "zoom" window sizes within the same model by projecting into smaller spaces.
