@rohitdholakia
Last active December 29, 2015 18:49
Script to load all n-grams as keys into Redis
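import gzip
import os
import time

import redis

path = '/path/to/unigrams'
client = redis.Redis(host='host-ip-here', port=6385, db=0)
# Non-transactional pipeline: batches commands to cut network round trips.
pipeline = client.pipeline(transaction=False)

for f in os.listdir(path):
    print('starting with file', f)
    start = time.time()
    # The files are gzip-compressed (the original shelled out to zcat).
    with gzip.open(os.path.join(path, f), 'rt') as ngram_file:
        for index, line in enumerate(ngram_file):
            # Each line: word <TAB> year <TAB> raw_count <TAB> volume_count
            word, year, raw_count, volume_count = line.strip().split('\t')
            # redis-py rejects tuples as values, so store the counts as "raw,volume".
            pipeline.hset(word, year, raw_count + ',' + volume_count)
            # Send the queued commands every 10,000 lines.
            if index > 0 and index % 10000 == 0:
                pipeline.execute()
    # Flush whatever is still queued from the last partial batch.
    pipeline.execute()
    print('file', f, 'is done in', time.time() - start, 'seconds')
    print('key_size now is', client.dbsize())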
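
The batching logic in the script can be tried without a Redis server. The following is a minimal sketch, not part of the original gist: `load_lines`, `RecordingPipeline`, and `batch_size` are hypothetical names introduced here for illustration, and the "raw_count,volume_count" value format is an assumption about how the counts should be stored.

```python
def load_lines(lines, pipeline, batch_size=10000):
    """Queue one HSET per n-gram line and flush every batch_size lines.

    `pipeline` is any object with redis-py style hset(name, key, value)
    and execute() methods. Returns the number of lines queued.
    """
    queued = 0
    for line in lines:
        word, year, raw_count, volume_count = line.rstrip('\n').split('\t')
        # Assumed storage format: both counts joined into one string.
        pipeline.hset(word, year, raw_count + ',' + volume_count)
        queued += 1
        if queued % batch_size == 0:
            pipeline.execute()
    if queued % batch_size != 0:
        pipeline.execute()  # flush the final partial batch
    return queued


class RecordingPipeline:
    """Hypothetical in-memory stand-in for a redis-py pipeline (dry runs only)."""

    def __init__(self):
        self.pending = []   # commands queued since the last execute()
        self.store = {}     # word -> {year: "raw_count,volume_count"}
        self.flushes = 0    # number of execute() calls

    def hset(self, name, key, value):
        self.pending.append((name, key, value))

    def execute(self):
        for name, key, value in self.pending:
            self.store.setdefault(name, {})[key] = value
        self.pending = []
        self.flushes += 1
```

Passing `client.pipeline(transaction=False)` instead of `RecordingPipeline()` should send the same commands to a real server, since only `hset` and `execute` are used.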