@harshildarji
Last active August 29, 2019 11:18
Convert GloVe to Gensim Word2Vec
# Usage: python convert.py glove_model.txt
# Prepends the "<vocab_size> <vector_size>" header line that the word2vec
# text format expects. The GloVe file is rewritten in place.
import sys
import time


def file_len(fname):
    """Return (number of lines, vector dimension) of a GloVe text file."""
    lines = col = 0
    with open(fname, encoding='UTF-8') as f:
        for l in f:
            lines += 1
            if col == 0:
                # Each line is "<word> <v1> ... <vN>"; subtract one field for the word.
                col = len(l.rstrip().split(' ')) - 1
    return lines, col


if __name__ == '__main__':
    if len(sys.argv) == 1:
        print('File name missing.')
        sys.exit(1)
    print('Converting...')
    fName = sys.argv[1]
    start = time.time()
    lines, col = file_len(fName)
    head = str(lines) + ' ' + str(col) + '\n'
    # Read the whole file into memory, then write it back with the header first.
    with open(fName, 'r', encoding='UTF-8') as f:
        oldLine = f.readlines()
    oldLine.insert(0, head)
    with open(fName, 'w', encoding='UTF-8') as f:
        f.writelines(oldLine)
    end = time.time()
    print('Done.\nTime: {} m'.format((end - start) / 60))
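
The script above loads the entire file with `readlines()` and overwrites the input, so running it twice on the same file would add a second header. As a hedged alternative, the sketch below streams the file line by line into a separate output file. The function name `add_word2vec_header` and its two path parameters are hypothetical and not part of the original gist. It assumes the same input layout the script relies on: one `<word> <v1> ... <vN>` entry per line, separated by single spaces.

```python
def add_word2vec_header(src_path, dst_path):
    """Copy a GloVe text file to dst_path, prefixed with a '<rows> <dims>' line.

    Hypothetical helper for illustration; the original file is left untouched.
    """
    rows = dims = 0
    # First pass: count the vectors and infer the dimension from the first line.
    with open(src_path, encoding='utf-8') as src:
        for line in src:
            if rows == 0:
                dims = len(line.rstrip().split(' ')) - 1
            rows += 1
    # Second pass: stream each line so the file never sits fully in memory.
    with open(src_path, encoding='utf-8') as src, \
            open(dst_path, 'w', encoding='utf-8') as dst:
        dst.write('{} {}\n'.format(rows, dims))
        for line in src:
            dst.write(line)
    return rows, dims
```

The output file should then be loadable as word2vec text, for example with gensim's `KeyedVectors.load_word2vec_format(dst_path, binary=False)`, assuming gensim is installed.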