@lrvick
Created January 7, 2012 09:19
Bash script to harvest all public cell-phone number -> carrier mappings from whitepages.com
#!/bin/bash
for NPA in {201..989}; do
NXXX=2000
while [ ${NXXX} -lt 9999 ]; do
# Each request looks up four consecutive NXXX prefixes.
NXXX1=$((NXXX + 1))
NXXX2=$((NXXX + 2))
NXXX3=$((NXXX + 3))
echo "Checking ${NPA}${NXXX}XXX, ${NPA}${NXXX1}XXX, ${NPA}${NXXX2}XXX, ${NPA}${NXXX3}XXX..."
URL="http://www.whitepages.com/carrier_lookup?carrier=alltel&name_0=&number_0=${NPA}${NXXX}000&name_1=&number_1=${NPA}${NXXX1}000&name_2=&number_2=${NPA}${NXXX2}000&name_3=&number_3=${NPA}${NXXX3}000&localtime=survey"
echo "URL : ${URL}"
curlblob=$(curl -s "${URL}")
# One number/carrier pair: seven digits, then a known carrier name or "No Results".
carriers='Alltel\|Cingular\|Cellular\|Nextel\|Quest\|Sprint\|Suncom\|T-Mobile\|Verizon\|Virgin\|No Results'
pair='.*\([0-9]\{7\}\).*\('"${carriers}"'\)'
# $curlblob is deliberately unquoted so the whole response collapses onto one line for sed.
parsedblob=$(echo $curlblob | sed -e "s/${pair}${pair}${pair}${pair}.*/\1,\2|\3,\4|\5,\6|\7,\8/g" -e 's/ //g' -e 's/|/ /g')
for each in $parsedblob; do
if [[ $each != *html* && $each != *DOCTYPE* && $each != *href* && $each != *=* ]]; then
if [[ $each != *NoResults* ]]; then
# Strip the "000" placeholder so each row is keyed by NPA + NXXX.
entry=$(echo "${NPA}${each}" | sed 's/000,/,/g')
echo "Result: ${entry}"
echo "${entry}" >> carrierdb.csv
fi
else
echo Unable to parse: "$URL"
fi
done
echo ""
NXXX=$((NXXX + 4))
done
done
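
The row-formatting step in the script (prefix the area code, drop the "000" placeholder) can also be done with bash parameter expansion instead of a sed pipeline. The sketch below is only an illustration: `format_entry` is a hypothetical helper that is not part of the original script, and it assumes each parsed token has the form `NXXX000,Carrier`.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): turn one parsed token
# such as "2000000,Alltel" into a CSV row keyed by NPA + NXXX.
format_entry() {
    local npa="$1" token="$2"
    local number="${token%%,*}"    # digits before the first comma
    local carrier="${token#*,}"    # everything after the first comma
    # Drop the trailing "000" placeholder that the lookup URL appends.
    printf '%s%s,%s\n' "$npa" "${number%000}" "$carrier"
}

format_entry 201 "2000000,Alltel"    # prints 2012000,Alltel
```

Using a function like this inside the loop would avoid spawning `sed` for every token, though the output written to carrierdb.csv should be the same.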