@neilkod
Created December 7, 2011 14:01
tee + named pipe to gunzip and count number of lines in a file
# Attempt 1
# 3.0GB file
-rw-r--r-- 1 dwsports dwsports 3.0G Aug 25 16:58 flat_customer_dim1.out.gz
# count the number of lines
-bash-3.1$ time gunzip -c flat_customer_dim1.out.gz | wc -l
39422185
real 1m53.595s
user 1m44.468s
sys 0m9.184s
# baseline: gunzip to /dev/null. in the practical application, the output of
# gunzip -c would be redirected into nzload instead
-bash-3.1$ time gunzip -c flat_customer_dim1.out.gz > /dev/null
real 1m38.200s
user 1m35.792s
sys 0m1.905s
# using @mat_kelcey's named pipes & tee method
-bash-3.1$ mkfifo f
-bash-3.1$ wc -l < f > num_lines &
[1] 1153
-bash-3.1$ time gunzip -c flat_customer_dim1.out.gz | tee f > /dev/null
real 2m1.986s
user 1m38.492s
sys 0m18.119s
-bash-3.1$ cat num_lines
39422185
[1]+ Done wc -l <f >num_lines
-bash-3.1$
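# A minimal sketch of the same tee + named pipe steps as a self-contained script.
# It generates a small 1000-line sample instead of the real extract; sample.gz,
# the temp directory and the line_count variable are placeholders of mine, and
# /dev/null again stands in for the loader.

```shell
#!/bin/sh
set -e

# Work in a scratch directory so the fifo and output files do not clash.
workdir=$(mktemp -d)
cd "$workdir"

# Build a 1000-line gzipped sample to stand in for the real .gz extract.
awk 'BEGIN { for (i = 1; i <= 1000; i++) print i }' | gzip -c > sample.gz

# Create the named pipe and start the reader in the background first:
# opening a fifo blocks until both a reader and a writer are attached.
mkfifo f
wc -l < f > num_lines &

# tee copies the decompressed stream into the fifo (for wc) while its
# stdout goes to the consumer, here /dev/null.
gunzip -c sample.gz | tee f > /dev/null

# Wait for the background wc to finish before reading its result.
wait
line_count=$(tr -d ' ' < num_lines)
echo "$line_count"

rm -f f
```

# The file is decompressed once and the line count arrives as a side effect of
# the main pipeline, rather than from a second pass over the file.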
# Attempt 2
# 4.6GB file
-rwxrwxrwx 1 dwsports dwsports 4.6G Jul 21 10:16 flat_community_cumulative_week_fct1.out.gz.bak
[dwsports@c17-dw-aux4 done]$ time zcat flat_community_cumulative_week_fct1.out.gz | wc -l
160900327
real 1m46.187s
user 1m31.873s
sys 0m12.424s
[dwsports@c17-dw-aux4 done]$ time zcat flat_community_cumulative_week_fct1.out.gz > /dev/null
real 1m21.762s
user 1m17.293s
sys 0m1.901s
[dwsports@c17-dw-aux4 done]$ mkfifo f
[dwsports@c17-dw-aux4 done]$ wc -l < f > num_lines &
[1] 21118
[dwsports@c17-dw-aux4 done]$ time zcat flat_community_cumulative_week_fct1.out.gz | tee f > /dev/null
real 1m58.651s
user 1m19.909s
sys 0m27.940s
[dwsports@c17-dw-aux4 done]$
[1]+ Done wc -l <f >num_lines
[dwsports@c17-dw-aux4 done]$ cat num_lines
160900327
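# To compare the runs, the "real" times above can be turned into the extra cost
# of the tee + fifo method over the plain gunzip-to-/dev/null baseline. A small
# sketch; the overhead function and variable names are mine, the timings are
# copied from the two attempts and converted to seconds.

```shell
#!/bin/sh
# $1 = label, $2 = baseline seconds (gunzip/zcat > /dev/null),
# $3 = seconds for the tee + fifo run
overhead() {
    awk -v label="$1" -v base="$2" -v tee="$3" 'BEGIN {
        extra = tee - base
        printf "%s: +%.3fs (%.1f%%)\n", label, extra, 100 * extra / base
    }'
}

# Attempt 1: 1m38.200s baseline, 2m1.986s with tee + fifo.
attempt1=$(overhead "attempt 1" 98.200 121.986)
# Attempt 2: 1m21.762s baseline, 1m58.651s with tee + fifo.
attempt2=$(overhead "attempt 2" 81.762 118.651)

echo "$attempt1"
echo "$attempt2"
```

# These are single runs, so the percentages are only a rough indication.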