Last active
June 10, 2016 22:19
Finding the top-rated home care providers for an area, as listed in Medicare's Home Health Compare data
#!/bin/bash
#
# Because https://www.medicare.gov/homehealthcompare/ does not provide any way to
# search for providers by their ratings, we've got to do this the hard way.
# Luckily, with csvkit the hard way isn't so hard. Get it here http://csvkit.readthedocs.io/
#
# First, download the data from https://data.medicare.gov/data/home-health-compare
#
curl -o HHCompare_Revised_FlatFiles.zip "https://data.medicare.gov/views/bg9k-emty/files/36e8f3b0-0273-4b46-ba04-89f089678a84?content_type=application%2Fzip%3B%20charset%3Dbinary&filename=HHCompare_Revised_FlatFiles.zip"
#
# Unzip the files.
#
unzip HHCompare_Revised_FlatFiles.zip
#
# And now, step by step...
#
# First, grab the columns we need. (The column numbers are listed in a PDF file
# that came in the ZIP we just downloaded. You could also look up the columns with
# csvcut's awesome header names feature: 'csvcut -n HHC_SOCRATA_HHCAHPS_PRVDR.csv')
# csvcut -e "latin1" -c 1,3,5,6,16,17 HHC_SOCRATA_HHCAHPS_PRVDR.csv
#
# Then restrict our area using the first three digits of the zip code. I'm
# searching in the area north of Houston, so anything starting with 773 will
# do the trick.
# | csvgrep -c "Zip" -r "^773"
#
# Then remove the providers for which we have no data.
# | csvgrep -c "HHCAHPS Survey Summary Star Rating" -m "Not Available" -i
#
# Finally, sort by rating and write to a file.
# | csvsort -c "HHCAHPS Survey Summary Star Rating" -r > ratings.csv
#
csvcut -e "latin1" -c 1,3,5,6,16,17 HHC_SOCRATA_HHCAHPS_PRVDR.csv |
  csvgrep -c "Zip" -r "^773" |
  csvgrep -c "HHCAHPS Survey Summary Star Rating" -m "Not Available" -i |
  csvsort -c "HHCAHPS Survey Summary Star Rating" -r > ratings.csv
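
A note on the zip filter: the goal is "zip codes starting with 773", and a regular expression only guarantees that when it is anchored with ^. As far as I know, csvgrep's -r matches the pattern anywhere in the field, so an unanchored pattern can also pick up zips that merely contain 773. The sketch below shows the difference using plain grep -E rather than csvkit. The sample zips are made up, and match_zips and sample_zips are hypothetical helpers, not part of the script above.

```shell
# Hypothetical helper: print the lines from stdin that match the given pattern.
match_zips() {
  grep -E "$1"
}

# Made-up sample zip codes, one per line (not taken from the Medicare data).
sample_zips() {
  printf '%s\n' 77301 77386 17730 27739
}

echo "Unanchored pattern (773 anywhere in the value):"
sample_zips | match_zips '773.*'

echo "Anchored pattern (value must start with 773):"
sample_zips | match_zips '^773'
```

If the two lists differ for your area, the anchored form is the one that matches the stated intent.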