@Seanny123
Created October 28, 2020 18:26
Break CSV into N chunks using Python
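import math
from tqdm import tqdm

N_CHUNKS = 10  # upper bound on the number of output files

def chunker(seq, size):
    """Yield successive slices of seq, each at most size items long."""
    for pos in range(0, len(seq), size):
        yield seq[pos:pos + size]

with open("input.csv") as csv_fi:
    csv_lines = csv_fi.readlines()

header, rows = csv_lines[0], csv_lines[1:]
# Round up so the remainder does not spill into an extra file;
# max(1, ...) keeps the range() step non-zero for very small inputs.
chunk_size = max(1, math.ceil(len(rows) / N_CHUNKS))

for c_i, chunk in enumerate(chunker(rows, chunk_size)):
    with open(f"output-{c_i}.csv", "w") as out_fi:
        out_fi.write(header)
        for row in tqdm(chunk):
            out_fi.write(row)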
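
The script above splits on physical lines from `readlines()`, so a quoted field that contains a newline could end up cut across two output files. A minimal sketch of one way around that, assuming the standard-library `csv` module is acceptable and the file fits in memory, is shown below. The `split_csv` helper and its `out_dir` and `n_chunks` parameters are hypothetical names introduced for illustration, not part of the original gist.

```python
import csv
import math
from pathlib import Path


def split_csv(src_path, out_dir, n_chunks=10):
    """Split src_path into at most n_chunks CSV files, each repeating the header.

    Hypothetical helper, not part of the original gist.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # newline="" lets the csv module handle line endings itself,
    # including newlines embedded in quoted fields.
    with open(src_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)
        rows = list(reader)

    # Round up so every row lands in one of at most n_chunks files.
    chunk_size = max(1, math.ceil(len(rows) / n_chunks))

    written = []
    for c_i, pos in enumerate(range(0, len(rows), chunk_size)):
        out_path = out_dir / f"output-{c_i}.csv"
        with open(out_path, "w", newline="") as out_fi:
            writer = csv.writer(out_fi)
            writer.writerow(header)
            writer.writerows(rows[pos:pos + chunk_size])
        written.append(out_path)
    return written
```

Calling `split_csv("input.csv", "chunks", n_chunks=10)` would write `chunks/output-0.csv`, `chunks/output-1.csv`, and so on. Note that `n_chunks` acts as an upper bound: with 25 data rows and `n_chunks=10`, the chunk size rounds up to 3 and only 9 files are produced.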