An idea that I couldn't manage to express within Twitter's character limit:
Train two word2vec models on the same corpus, 100 dimensions apiece: one with window size 5 and one with window size 15 (say).
Now you have two 100-dimensional vector spaces with the same words in each.
That's the same as one 200-dimensional vector space: for each word, you just concatenate its two vectors end to end.
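A minimal sketch of that construction, assuming gensim's Word2Vec (parameter names as in gensim 4.x; the toy corpus and the `combined` dictionary are just stand-ins):

```python
import numpy as np
from gensim.models import Word2Vec

# Any tokenized corpus will do; this toy one is only a stand-in.
corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "chased", "the", "cat"],
] * 100

# Two models on the same corpus, differing only in window size.
narrow = Word2Vec(corpus, vector_size=100, window=5, min_count=1, seed=1)
wide = Word2Vec(corpus, vector_size=100, window=15, min_count=1, seed=1)

# Concatenate each word's two 100-d vectors into one 200-d vector.
vocab = [w for w in narrow.wv.index_to_key if w in wide.wv.key_to_index]
combined = {w: np.concatenate([narrow.wv[w], wide.wv[w]]) for w in vocab}

print(combined["cat"].shape)  # (200,)
```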
That vector space has all the information from each of the original models in it: you can use plain linear algebra to project it back down onto either of the original 100-dimensional subspaces.
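Concretely, that projection just keeps one block of coordinates, which is the same as multiplying by a selection matrix. Continuing the hypothetical `combined` dictionary from the sketch above:

```python
# Projecting the 200-d combined vector back to either original space
# is a linear map: keep the first 100 coordinates or the last 100.
v = combined["cat"]          # 200-d combined vector
narrow_again = v[:100]       # identical to narrow.wv["cat"]
wide_again = v[100:]         # identical to wide.wv["cat"]

# The same projection written as an explicit 100x200 matrix.
P_narrow = np.hstack([np.eye(100), np.zeros((100, 100))])
assert np.allclose(P_narrow @ v, narrow_again)
```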