@bfontaine
Created April 4, 2019 08:09

Estimate the number of lines in a large file

Get the size of the file:

$ wc -c myfile.jsons
104431233268 myfile.jsons

Then use head to read only a fraction of the file, e.g. the first 1/1000th of its bytes, and count the lines in that sample:

$ head -c 104431233 myfile.jsons | wc -l
213657

The whole file should therefore have about 213657 * 1000 = 213,657,000 lines.
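
The steps above can be wrapped in a small shell function. This is only a sketch: the name estimate_lines and the default sampling fraction of 1/1000 are assumptions made for illustration, and it relies on head -c, which GNU and BSD head both provide but POSIX does not require.

```shell
# Estimate the number of lines in FILE from a sample taken at its start.
# Usage: estimate_lines FILE [DENOM]   (reads the first 1/DENOM of the bytes)
estimate_lines() {
    file=$1
    denom=${2:-1000}                         # hypothetical default: 1/1000th
    size=$(wc -c < "$file" | tr -d ' ')      # total size in bytes
    sample=$((size / denom))                 # number of bytes to sample
    if [ "$sample" -eq 0 ]; then
        # The file is smaller than DENOM bytes: count its lines exactly.
        wc -l < "$file" | tr -d ' '
        return
    fi
    lines=$(head -c "$sample" "$file" | wc -l)
    # Scale by size/sample instead of DENOM to avoid the rounding error
    # introduced by the integer division above.
    echo $((lines * size / sample))
}
```

For example, estimate_lines myfile.jsons 1000 repeats the calculation above. The product lines * size is computed with shell integer arithmetic, so it could overflow on extremely large files.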

Of course, this only works if line lengths are roughly uniform, or at least if the lines at the start of the file are representative of the rest.
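
When the start of the file may not be representative, one possible variation is to sample several chunks spread evenly across the file instead of a single chunk at the beginning. The sketch below is not part of the original method: the name estimate_lines_spread, the default of 10 chunks, and the 1 MiB chunk size are all assumptions, and it uses dd to read each chunk.

```shell
# Estimate the number of lines in FILE from CHUNKS evenly spaced samples.
# Usage: estimate_lines_spread FILE [CHUNKS] [CHUNK_BYTES]
estimate_lines_spread() {
    file=$1
    chunks=${2:-10}                          # hypothetical default
    chunk_bytes=${3:-1048576}                # hypothetical default: 1 MiB
    size=$(wc -c < "$file" | tr -d ' ')
    nblocks=$((size / chunk_bytes))          # whole chunk-sized blocks in file
    if [ "$nblocks" -lt "$chunks" ]; then
        # The file is too small to sample this way: count its lines exactly.
        wc -l < "$file" | tr -d ' '
        return
    fi
    step=$((nblocks / chunks))               # blocks between sample starts
    total=0
    i=0
    while [ "$i" -lt "$chunks" ]; do
        # Read one block of chunk_bytes starting at block number i * step.
        n=$(dd if="$file" bs="$chunk_bytes" skip=$((i * step)) count=1 2>/dev/null | wc -l)
        total=$((total + n))
        i=$((i + 1))
    done
    # Scale the sampled line count by the ratio of total to sampled bytes.
    echo $((total * size / (chunks * chunk_bytes)))
}
```

This is still an estimate: it helps when line lengths drift across the file, but not when they vary in ways the samples happen to miss.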