@kevinrobinson
Last active December 9, 2015 00:39
# build
bazel build -c dbg tensorflow/models/embedding:all
# copy the generated gen_word2vec.py from bazel-out/../genfiles into the embedding folder, next to word2vec.py
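# Sketch only: one way to locate the generated file without knowing the exact
# subfolder under bazel-out. find_gen_word2vec is a hypothetical helper name;
# it assumes bazel-out is reachable from the current directory.
find_gen_word2vec() {
  # -L follows symlinks, since bazel-out is usually a symlink to the real output tree
  find -L "${1:-bazel-out}" -name gen_word2vec.py -print
}
# usage: find_gen_word2vec            (searches ./bazel-out)
#        find_gen_word2vec some/dir   (searches another root)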
# in word2vec.py, comment out the division import to fix this python 2/3 error:
#   error: TypeError: unsupported operand type(s) for /: 'Tensor' and 'int'
#   fix: comment out the line "from __future__ import division"
# in word2vec.py, change the import so gen_word2vec is loaded from the local folder:
#   before: from tensorflow.models.embedding import gen_word2vec as word2vec
#   after:  import gen_word2vec as word2vec
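# Sketch only: one way to script the two word2vec.py edits described above.
# Assumes both lines appear in the file exactly as quoted; patch_word2vec is a
# hypothetical helper name, not part of TensorFlow.
patch_word2vec() {
  # Keeps a .bak copy of the file, comments out the division import, and
  # switches the gen_word2vec import to the local module.
  sed -i.bak \
    -e 's/^from __future__ import division/# &/' \
    -e 's/^from tensorflow\.models\.embedding import gen_word2vec as word2vec/import gen_word2vec as word2vec/' \
    "$1"
}
# usage: patch_word2vec path/to/word2vec.py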
# don't run the embedding model from inside the tensorflow source tree;
# copy the embedding folder elsewhere (e.g., ~/projects/word2vec/embedding) and run the remaining steps from there
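# Sketch only: the copy step above as a small helper. copy_embedding is a
# hypothetical name; it assumes the destination folder does not exist yet
# (if it does, cp places the copy inside it instead).
copy_embedding() {
  # $1: embedding folder inside the source tree, $2: destination folder
  mkdir -p "$(dirname "$2")" && cp -R "$1" "$2"
}
# usage, from the tensorflow source root, with the example path from the note above:
#   copy_embedding tensorflow/models/embedding ~/projects/word2vec/embedding
#   cd ~/projects/word2vec/embedding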
# download the training and dev data sets
# make a folder to hold the output
# run it!
wget https://word2vec.googlecode.com/svn/trunk/questions-words.txt
wget http://mattmahoney.net/dc/text8.zip
unzip text8.zip
mkdir -p output
python word2vec.py --train_data=text8 --save_path=output --eval_data=questions-words.txt