Skip to content

Instantly share code, notes, and snippets.

@robertz
Created January 18, 2019 18:39
Show Gist options
  • Save robertz/cd7b8a5a1c76032be68593d81d243c75 to your computer and use it in GitHub Desktop.
Save robertz/cd7b8a5a1c76032be68593d81d243c75 to your computer and use it in GitHub Desktop.
#!/bin/bash
script='csvinator.sh'
# There date param and country should always be passed in
if [ $# -lt 2 ];
then
echo Usage ${script} rundate country
exit 1;
fi
rundate="$1"
rm -f invoices.txt output.csv
touch invoices.txt
touch output.csv
# The expected header
header="Batch ID,Vendor ID,Vendor Name,Vendor Class,Address 1,Address 2,City,State,Postal Code,Country,Phone,Payment Amount"
echo "$header" > output.csv
# Get the directory list from the web server, filter for files with our run date
lynx -dump -listonly http://site.com/folder/containing/documents/ | grep http | grep "$rundate" | awk '{print $2}' >> invoices.txt
while read url
do
# save curl response to a variable
curld=$(curl -s $url 2>&1)
# strip out the header
output="${curld//$header/}"
# if the invoice country matches
if grep -iq ",$2," <<< $output; then
echo $url
# write the csv data (minus the header) to the output file
grep -i ",$2," <<< $output >> output.csv
fi
done < invoices.txt
echo Done.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment