@NorikDavtian
Created November 20, 2017 10:08
A small utility script that splits a large JSONL file into smaller JSON files, one pretty-printed file per input line.
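// Usage: COUNT=3 INPUT=large_file.jsonl node jsonl-to-multiple-json-files.js
const fs = require('fs');
const readline = require('readline');

const COUNT = parseInt(process.env.COUNT, 10); // Number of lines to take from the large file
if (!Number.isInteger(COUNT) || COUNT < 1 || !process.env.INPUT) {
  console.error('Usage: COUNT=<lines> INPUT=<file.jsonl> node jsonl-to-multiple-json-files.js');
  process.exit(1);
}

fs.mkdirSync('./output', { recursive: true }); // make sure the output directory exists
const input = fs.createReadStream(process.env.INPUT); // input .jsonl file
const rl = readline.createInterface({ input });
let counter = 0;

rl.on('line', (line) => {
  // 'line' events can still fire for buffered lines after close(), so guard here.
  if (counter >= COUNT) return;
  counter++;
  // Synchronous write: the file is fully written before the next line is handled.
  fs.writeFileSync(`./output/file_${counter}.json`, JSON.stringify(JSON.parse(line), null, 2));
  if (counter >= COUNT) rl.close(); // stop reading once enough lines were written
}).on('close', () => {
  console.log(`${counter} lines were processed.`);
  process.exit(0);
});

// Read errors (for example a missing input file) are reported on the stream.
input.on('error', (error) => {
  console.error(error);
  process.exit(1);
});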
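
The core step of the script, parsing each JSONL line and re-serializing it with two-space indentation, can be tried in isolation without touching the file system. The sketch below is only an illustration: `splitJsonl` and the `sample` string are hypothetical names that are not part of the original gist, and skipping blank lines is an added assumption.

```javascript
// Hypothetical helper (not part of the original gist): turn the first `count`
// non-empty JSONL lines into pretty-printed JSON documents, held in memory.
function splitJsonl(text, count) {
  const files = [];
  for (const line of text.split(/\r?\n/)) {
    if (files.length >= count) break;
    if (line.trim() === '') continue; // assumption: blank lines are skipped
    files.push({
      name: `file_${files.length + 1}.json`, // same naming scheme as the script
      body: JSON.stringify(JSON.parse(line), null, 2),
    });
  }
  return files;
}

// Made-up sample input with three JSONL records.
const sample = '{"id":1,"tag":"a"}\n{"id":2,"tag":"b"}\n{"id":3,"tag":"c"}\n';
const files = splitJsonl(sample, 2);
console.log(files.map((f) => f.name).join(', ')); // file_1.json, file_2.json
console.log(files[0].body);
```

With `count` set to 2, only the first two records are converted, which mirrors what `COUNT=2` does in the script.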