Created March 11, 2015 14:58
Naive sampling of file for bash/zsh
# Sample file: print each line of a file with probability SAMPLE_SIZE
samplef() {
  # set -x
  if [ -z "${1}" ]; then
    echo 'Error: Missing file path'
    echo
    echo 'Usage:'
    echo 'samplef <FILEPATH> <SAMPLE_SIZE>'
    echo
    echo 'Optional parameter: <SAMPLE_SIZE>, default is 0.1'
    echo 'e.g. samplef ./myfile.txt 0.25'
    return 1
  fi
  local sample_ratio="${2:-0.1}"
  # Pass the ratio through the environment so it is never interpolated into Perl code,
  # and redirect the file in directly (quoted, so paths with spaces work).
  SAMPLE_RATIO="$sample_ratio" perl -ne 'print if rand() < $ENV{SAMPLE_RATIO}' < "$1"
}
Just drop it into your ~/.bashrc or ~/.zshrc.
To dump a random sample of roughly 25% of the lines in your file:
samplef ./myfile.txt 0.25
Defaults to dumping roughly 10 percent of the lines in your file:
samplef ./myfile.txt
Credit goes to http://stackoverflow.com/a/692317/3538289