@james-see
Created June 2, 2023 03:28
Chunk a large JSON array into smaller files.
import json

# Load the full JSON array into memory.
with open('ru2.json') as infile:
    o = json.load(infile)

# Write slices of 1000 items each to numbered output files.
chunkSize = 1000
for i in range(0, len(o), chunkSize):
    with open('output/file_' + str(i // chunkSize) + '.json.txt', 'w') as outfile:
        json.dump(o[i:i + chunkSize], outfile)
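The same idea can be packaged as a small reusable function. This is a sketch, not part of the original gist: the function name `split_json_array` and the directory handling are my own additions, and it assumes the top-level JSON value is a list.

```python
import json
import os

def split_json_array(src, out_dir, chunk_size=1000):
    """Split the JSON array in `src` into files of up to `chunk_size` items."""
    # Load the whole array into memory; assumes the top-level value is a list.
    with open(src) as infile:
        data = json.load(infile)
    # Create the output directory if it does not exist yet.
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for i in range(0, len(data), chunk_size):
        path = os.path.join(out_dir, f'file_{i // chunk_size}.json')
        with open(path, 'w') as outfile:
            json.dump(data[i:i + chunk_size], outfile)
        paths.append(path)
    return paths
```

Returning the list of written paths makes it easy to verify how many chunk files were produced.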
@KanikaNoni

Python gives the error below while executing:
json.dump(i,o[i:i+chunkSize], outfile)
~^^^^^^^^^^^^^^^
KeyError: slice(0, 1000, None)

@james-see
Author

That's because you need to load in the JSON file you are using. You can see that in my case I was using ru2.json.

@KanikaNoni

import os
import json

# Load the full JSON array into memory.
with open(os.path.join('C:/Users/abc/Downloads', 'xyz.json'), 'r',
          encoding='utf-8') as infile:
    o = json.load(infile)

# Write slices of 1000 items each to numbered output files.
chunkSize = 1000
for i in range(0, len(o), chunkSize):
    with open('C:/Users/abc/Downloads/JSONSplit/xyz_' + str(i // chunkSize) + '.json', 'w') as outfile:
        json.dump(o[i:i + chunkSize], outfile)

This is what I modified it to.
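Both snippets slice a list that has already been fully loaded. As a minimal sketch of an alternative (the helper name `chunks` is my own, not from the gist), `itertools.islice` can batch any iterable lazily, which is handy when the items come from a source other than one big in-memory list:

```python
import itertools

def chunks(iterable, size):
    """Yield successive lists of up to `size` items from any iterable."""
    it = iter(iterable)
    while True:
        # islice pulls at most `size` items without knowing the total length.
        batch = list(itertools.islice(it, size))
        if not batch:
            return
        yield batch
```

For example, `chunks(range(5), 2)` yields `[0, 1]`, then `[2, 3]`, then `[4]`.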

@james-see
Author

Nice, thanks!
