@mynameisfiber
Last active December 10, 2015 23:48
Split a gzipped, newline-separated file into multiple gzipped files by line count.
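#!/bin/bash
# Usage: $0 FILE.gz NUMLINES
# Writes FILE-0000.gz, FILE-0001.gz, ... with at most NUMLINES lines each.
# Requires pv (progress display) and pigz (parallel gzip).
file="$1"
numlines="$2"
basefile="${file%.gz}"
isMore=1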
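# Copy up to $1 lines from stdin to stdout.
# Returns 0 once stdin is exhausted, 1 if more input may remain.
function write_n_lines {
    local c=0 line
    while [[ "$c" -lt "$1" ]]; do
        # IFS= and -r keep leading whitespace and backslashes intact.
        if ! IFS= read -r line; then
            # Flush a final line that has no trailing newline.
            [[ -n "$line" ]] && printf '%s\n' "$line"
            return 0
        fi
        printf '%s\n' "$line"
        c=$(( c + 1 ))
    done
    return 1
}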
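gunzip -c "$file" | pv -rbt | (
    fn=0
    # Note: if the input's line count is an exact multiple of numlines,
    # the final pass reads nothing and leaves one empty trailing .gz file.
    while [[ $isMore -eq 1 ]]; do
        newfile=$( printf "%s-%04d.gz" "$basefile" "$fn" )
        write_n_lines "$numlines" | pigz -c > "$newfile"
        # Exit status of write_n_lines: 1 means more input may remain.
        isMore=${PIPESTATUS[0]}
        fn=$(( fn + 1 ))
    done
)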
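
For comparison, here is a rough sketch of the same split handed off to GNU split's --filter option instead of the read loop above. This is not the script's own method, just an alternative under stated assumptions: GNU coreutils split (8.13 or newer) is installed, plain gzip stands in for pigz, the pv progress display is left out, and split_gz_by_lines is a made-up name for illustration.

```shell
#!/bin/bash
# Sketch only: assumes GNU coreutils split (--filter) and gzip are available.
# split_gz_by_lines is a hypothetical helper name, not part of the script above.
split_gz_by_lines() {
    local file="$1" numlines="$2"
    local basefile="${file%.gz}"
    # split runs the filter command once per chunk, with $FILE set to the
    # chunk's name ("<basefile>-0000", "<basefile>-0001", ...). The single
    # quotes keep $FILE unexpanded here so the filter's shell expands it.
    gunzip -c "$file" |
        split -l "$numlines" -d -a 4 --filter='gzip -c > "$FILE.gz"' - "${basefile}-"
}
```

Calling split_gz_by_lines data.gz 1000000 should produce data-0000.gz, data-0001.gz, and so on, following the same naming as the script. Because split handles the line counting itself, lines are passed through byte for byte, at the cost of depending on a GNU-specific option.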