@clarksun
Created December 28, 2017 02:58
Iterating over a large Python dict and writing it to the database in batched transactions, a technique learned from frontera
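def chunks(l, n):
    """Yield successive slices of l, each holding at most n items."""
    for i in range(0, len(l), n):
        yield l[i:i+n]

d = dict()  # the large dict to be written out
for chunk in chunks(list(d.items()), 32768):
    # batch_size cannot be used here because the batch uses transaction=True
    with table.batch(transaction=True) as b:
        for k, v in chunk:
            b.put(...)
# Each chunk holds at most 32768 (k, v) pairs, which keeps every batch small enough to write
# to the database; otherwise the batch grows too large and the transaction cannot be used.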
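
The snippet above copies the whole dict into a list with `list(d.items())` before slicing it. As a minimal sketch of a variation that avoids that copy, the chunks can be pulled lazily with `itertools.islice`. Everything below is an assumption made for illustration and is not part of the original gist: `iter_chunks`, `RecordingTable`, and `RecordingBatch` are hypothetical names, and the two classes only stand in for a real table so that the example runs without a database. The chunk size of 4 is chosen only to keep the output readable.

```python
from itertools import islice


def iter_chunks(iterable, n):
    """Yield lists of at most n items without copying the whole input first."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, n))
        if not chunk:
            return
        yield chunk


class RecordingBatch:
    """Hypothetical stand-in for a transactional batch; not a real client class."""

    def __init__(self, table):
        self._table = table
        self._rows = []

    def put(self, key, value):
        self._rows.append((key, value))

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Commit only when the block finished without an exception.
        if exc_type is None:
            self._table.committed.append(self._rows)
        return False


class RecordingTable:
    """Hypothetical stand-in for a table; records each committed batch."""

    def __init__(self):
        self.committed = []

    def batch(self, transaction=True):
        return RecordingBatch(self)


d = {i: i * i for i in range(10)}
table = RecordingTable()

# One transaction per chunk, as in the snippet above.
for chunk in iter_chunks(d.items(), 4):
    with table.batch(transaction=True) as b:
        for k, v in chunk:
            b.put(k, v)

batch_sizes = [len(rows) for rows in table.committed]
print(batch_sizes)  # [4, 4, 2]
```

The dict must not be modified while it is being iterated this way; if it can change during the write, taking a snapshot with `list(d.items())` as in the original is the safer choice.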