Created January 10, 2016 18:41
Use gensim to load a word2vec model pretrained on Google News and perform some simple operations on the word vectors.
from gensim.models import KeyedVectors
# Load the pretrained vectors. The file holds only the word vectors, not the
# intermediate training data, so the model cannot be refined with additional data.
# (The old norm_only argument is no longer accepted by current gensim releases.)
model = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
dog = model['dog']
# Deal with an out-of-dictionary word: Михаил (Michail)
if 'Михаил' not in model:
    print('{0} is an out of dictionary word'.format('Михаил'))
# Some predefined functions that show content related information for given words
print(model.most_similar(positive=['woman', 'king'], negative=['man']))
print(model.doesnt_match("breakfast cereal dinner lunch".split()))
print(model.similarity('woman', 'man'))
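
For reference, `model.similarity` returns the cosine similarity between the two word vectors. A minimal sketch of that computation in plain Python follows. The function name `cosine_similarity` and the 3-dimensional toy vectors are illustrative assumptions, not gensim code and not values from the real 300-dimensional model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up toy vectors standing in for model['woman'] and model['man'].
woman = [0.9, 0.1, 0.3]
man = [0.8, 0.2, 0.1]
print(cosine_similarity(woman, man))
```

Values close to 1 mean the vectors point in nearly the same direction, and values near 0 mean they are close to orthogonal.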

AttributeError: The vocab attribute was removed from KeyedVector in Gensim 4.0.0.
Use KeyedVector's .key_to_index dict, .index_to_key list, and methods .get_vecattr(key, attr) and .set_vecattr(key, attr, new_val) instead.
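
A possible way to adapt code that relied on `.vocab`, based only on the attributes named in the error message above. The helper names (`vocab_size`, `word_index`, `first_words`) are hypothetical; they assume an object exposing `key_to_index` (a dict) and `index_to_key` (a list), as `KeyedVectors` does in Gensim 4.

```python
def vocab_size(kv):
    """Number of known words; stands in for len(kv.vocab) in older code."""
    return len(kv.key_to_index)


def word_index(kv, word):
    """Row index of a word, or None if the word is out of vocabulary."""
    return kv.key_to_index.get(word)


def first_words(kv, n=5):
    """The first n words in the order they are stored in the vector file."""
    return list(kv.index_to_key[:n])


# Usage with the model from the gist (assumes Gensim 4 and the downloaded file):
# from gensim.models import KeyedVectors
# kv = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
# print(vocab_size(kv), word_index(kv, 'dog'), first_words(kv))
```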


Google Colab demo

Thank you for the Colab demo !!
