# requires tensorflow 0.12
# requires gensim 0.13.3+ for the new API model.wv.index2word; on older versions use model.index2word
import numpy as np
import tensorflow as tf
from gensim.models import Word2Vec
from tensorflow.contrib.tensorboard.plugins import projector

# load your gensim model
model = Word2Vec.load("YOUR-MODEL")

# project part of the vocab: 10K words of 300 dimensions
w2v_10K = np.zeros((10000, 300))
# written for Python 2; on Python 3, open the file with encoding='utf-8' and write word + '\n'
with open("./projector/prefix_metadata.tsv", 'w+') as file_metadata:
    for i, word in enumerate(model.wv.index2word[:10000]):
        w2v_10K[i] = model[word]
        file_metadata.write(word.encode('utf-8') + '\n')

# define the model without training
sess = tf.InteractiveSession()
with tf.device("/cpu:0"):
    embedding = tf.Variable(w2v_10K, trainable=False, name='prefix_embedding')
tf.global_variables_initializer().run()
saver = tf.train.Saver()
writer = tf.summary.FileWriter('./projector', sess.graph)

# add the embedding to the projector config
config = projector.ProjectorConfig()
embed = config.embeddings.add()
# tensor_name must match the name given to the tf.Variable above
embed.tensor_name = 'prefix_embedding'
embed.metadata_path = './projector/prefix_metadata.tsv'
projector.visualize_embeddings(writer, config)
saver.save(sess, './projector/prefix_model.ckpt', global_step=10000)

# open tensorboard with logdir, then check localhost:6006 to view your embedding
# tensorboard --logdir="./projector/"
You can fix the error "'Word2Vec' object has no attribute 'wv'" by changing model.wv.index2word to model.index2word (remove the '.wv'). My gensim version is 0.13.3.
Thanks.
Thanks, I have updated the required gensim version and added how to open the result with tensorboard from the CLI.
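For anyone who does not want to edit the script per gensim version, here is a minimal sketch of a lookup that tries each attribute layout in turn. The helper name `vocab_words` is made up for this example. It assumes the vocabulary list lives at `model.wv.index2word` (0.13.3+), `model.index2word` (older), or `model.wv.index_to_key` (the gensim 4 rename), and it has not been checked against every release.

```python
def vocab_words(model, limit=None):
    """Return the vocabulary as a list, whichever attribute layout the model uses.

    `vocab_words` is a hypothetical helper, not part of gensim.
    """
    # Newer gensim keeps the vectors on model.wv; older releases use the model itself.
    keyed = getattr(model, "wv", model)
    # gensim 4 renamed index2word to index_to_key, so try that name first.
    words = getattr(keyed, "index_to_key", None)
    if words is None:
        words = keyed.index2word
    return list(words[:limit]) if limit is not None else list(words)
```

In the gist this would replace `model.wv.index2word[:10000]` with `vocab_words(model, 10000)`.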
Ran into the same issue on TF 1.1.0. Tried different model sizes, but experienced the same issue ("loading ...").
@cbienpourtoi Did you have any luck getting it to work?
For anyone stuck on "loading...", I got it to work by setting the tensor name to the same value as the variable's 'name' parameter, for example changing:
embed.tensor_name = 'fs_embedding:0'
to
embed.tensor_name = 'prefix_embedding'
My code had other changes, but that is probably the relevant part.
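A quick way to spot this kind of mismatch is to read the `projector_config.pbtxt` file that `visualize_embeddings` writes into the log directory and compare its `tensor_name` entries with the name you gave the variable. The sketch below is only a diagnostic under assumptions: the helper names are made up, the sample text is hand-written to mirror the gist, and treating a trailing `:0` as insignificant is my guess rather than documented behaviour.

```python
import re

def configured_tensor_names(pbtxt_text):
    """Extract every tensor_name value from projector_config.pbtxt text."""
    return re.findall(r'tensor_name:\s*"([^"]*)"', pbtxt_text)

def names_match(configured, variable_name):
    """Compare two names, ignoring a trailing output index such as ':0' (assumption)."""
    def base(name):
        return name.split(":")[0]
    return base(configured) == base(variable_name)

# Hand-written sample that mirrors the mismatch discussed in this thread.
sample = '''embeddings {
  tensor_name: "fs_embedding:0"
  metadata_path: "./projector/prefix_metadata.tsv"
}'''

for name in configured_tensor_names(sample):
    print(name, names_match(name, "prefix_embedding"))
```

For a real run, read the text from `./projector/projector_config.pbtxt` instead of using `sample`.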
Hey @rodrigoccurvo, thanks so much for the fix, that did it. However, I can only see the indices instead of the actual words. Did you run into a similar issue by any chance?
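One possible cause of seeing indices instead of words is that the projector cannot use the metadata file, either because the path does not resolve or because the number of labels does not match the number of embedding rows. This is a guess, not a confirmed diagnosis. Here is a small sanity check for a single-column metadata TSV; `check_metadata` is a hypothetical helper and the messages are my own wording.

```python
import io

def check_metadata(path, expected_rows):
    """Return a list of problems found in a single-column metadata TSV."""
    try:
        with io.open(path, encoding="utf-8") as handle:
            lines = handle.read().split("\n")
    except IOError as error:  # missing file or wrong path
        return ["cannot read %s: %s" % (path, error)]
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece after the final newline
    problems = []
    if len(lines) != expected_rows:
        problems.append("%d labels for %d vectors" % (len(lines), expected_rows))
    if any("\t" in line for line in lines):
        # Multi-column metadata is expected to start with a header row.
        problems.append("tab found: multi-column files need a header row")
    if any(line.strip() == "" for line in lines):
        problems.append("blank label found")
    return problems
```

With the gist's file this would be called as `check_metadata("./projector/prefix_metadata.tsv", 10000)`. An empty list means none of these particular problems were found.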
Hi @rodrigoccurvo, thank you for sharing the fix. Great embedding visualization!
Hi @lampts, can you please add the numpy import to the code?
Thanks
@rodrigoccurvo Hey, can you tell me what other changes you needed to make? I am still stuck at loading...
There are several small errors, such as the metadata file path. I have fixed them here: https://gist.github.com/BrikerMan/7bd4e4bd0a00ac9076986148afc06507#file-w2v_visualizer-py. Thanks a lot~
@Adamage fixed the loading problem.
@BrikerMan I get the error "w2x_metadata.tsv is not a file" with your fixes. Couldn't figure out what could be the cause; any ideas?
EDIT: modifying the original as commented above worked in the end. Cheers.
The cause of my "Point: Loading... | Dimensions: Loading..." error was that my laptop doesn't support WebGL. I don't know whether that could be the reason for anyone else's failure, though.
Also be aware that if you're running tensorboard inside VirtualBox, a lot of functionality will not work.
I got this error: AttributeError: 'Word2Vec' object has no attribute 'wv'
What is your gensim version?