@drcjar
Last active December 24, 2015 00:19
Paracetamol Choropleth

UPDATE: I'm pretty sure I haven't done my dataframe preparation right. I should probably aggregate the prescribing data to individual practices first, then map practices to CCGs, then aggregate practices up to CCGs, and do more sanity checks. I think I have fixed this for the iron analysis and will revisit this one shortly.

So on This American Life this week they suggested that there might be more liver failure occurring due to paracetamol than we realise. I thought it'd be fun to see if there is an association between paracetamol usage and liver deaths per 10,000 population by CCG.

Let's practice choropleth making again by building a Paracetamol Choropleth using open data, ogr2ogr, folium and pandas (and indirectly leaflet.js, d3.js, GeoJSON, OpenStreetMap and more), and talk about using gist and blocks. I will skip the steps already covered last time.

  1. get some prescribing data from www.hscic.gov.uk/gpprescribingdata, e.g.

    wget 'http://datagov.ic.nhs.uk/presentation/2013_04_April/T201304PDPI+BNFT.csv'
    
  2. we can re-use the ccg boundaries GeoJSON we made last time which is here.

  3. so now all we need to do is prepare our data (the prescribing data) so that it has a column matching the CCG codes in the GeoJSON and a column with our thing of interest. I've decided our thing of interest is the number of items prescribed containing paracetamol per 10,000 population by CCG, so we want a dataframe with that and the CCG code as columns.

  4. a bit of footwork is required to prepare the dataset because our prescribing data is recorded per individual GP practice, so we need to map practices to CCGs. We also need to add some CCG population size data into the mix in order to do the per 10,000 bit.

    wget 'http://www.connectingforhealth.nhs.uk/systemsandservices/data/ods/ccginterim/interimpcmem_v5.zip'
    wget 'https://indicators.ic.nhs.uk/download/Clinical%20Commissioning%20Group%20Indicators/Data/CCG_registered_patients_2012.csv'
    

The lookup file from Connecting for Health (downloaded as a .zip above) tells us which practice belongs to which CCG without too much trouble. The 'CCG_registered_patients_2012' .csv file is helpfully broken down into age ranges, which we don't care about for our choropleth making hackery, so we roll them up. All of this is done using python pandas (see ipython notebook part one).
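
As a rough sketch of that preparation (not the notebook's actual code), the order suggested in the update note at the top can be written out in pandas: aggregate to practices, map practices to CCGs, aggregate to CCGs, then divide by the rolled-up population. The toy dataframes, the practice and CCG codes, the column names other than `CCG13CD`, and the simple name match used to pick out paracetamol items are all assumptions made for illustration; the real files will need their own column names and a more careful definition of "contains paracetamol".

```python
import pandas as pd

# Toy stand-ins for the three downloads. Every code, value and column name
# here (apart from CCG13CD) is a placeholder, not the real file layout.
prescribing = pd.DataFrame({
    "PRACTICE": ["P1", "P1", "P2", "P3", "P9"],
    "BNF NAME": ["Paracet_Tab 500mg", "Ferrous Sulf_Tab 200mg",
                 "Paracet_Tab 500mg", "Paracet_Oral Susp 250mg/5ml",
                 "Paracet_Tab 500mg"],
    "ITEMS": [120, 40, 80, 50, 10],
})
practice_to_ccg = pd.DataFrame({
    "PRACTICE": ["P1", "P2", "P3"],
    "CCG13CD": ["CCG_X", "CCG_X", "CCG_Y"],
})
population = pd.DataFrame({
    "CCG13CD": ["CCG_X", "CCG_X", "CCG_Y", "CCG_Y"],
    "age_band": ["0-15", "16+", "0-15", "16+"],
    "patients": [20000, 80000, 10000, 40000],
})


def paracetamol_rate_by_ccg(prescribing, practice_to_ccg, population):
    # 1. Keep paracetamol rows (naive name match) and aggregate to practices.
    is_para = prescribing["BNF NAME"].str.contains("paracet", case=False)
    per_practice = (prescribing[is_para]
                    .groupby("PRACTICE", as_index=False)["ITEMS"].sum())

    # 2. Map practices to CCGs. validate= raises if a practice appears twice
    #    in the lookup; indicator= lets us list practices with no CCG.
    mapped = per_practice.merge(practice_to_ccg, on="PRACTICE", how="left",
                                validate="one_to_one", indicator=True)
    unmatched = mapped.loc[mapped["_merge"] == "left_only", "PRACTICE"].tolist()
    mapped = mapped[mapped["_merge"] == "both"]

    # 3. Aggregate practices up to CCGs.
    items_by_ccg = mapped.groupby("CCG13CD", as_index=False)["ITEMS"].sum()

    # 4. Roll the age bands up to one population figure per CCG.
    pop_by_ccg = population.groupby("CCG13CD", as_index=False)["patients"].sum()

    # 5. Join and compute items per 10,000 registered patients.
    result = items_by_ccg.merge(pop_by_ccg, on="CCG13CD", how="inner",
                                validate="one_to_one")
    result["items_per_10k"] = result["ITEMS"] * 10_000 / result["patients"]
    return result, unmatched


rates, unmatched = paracetamol_rate_by_ccg(prescribing, practice_to_ccg, population)
print(rates)
print("practices with no CCG:", unmatched)
```

Returning the unmatched practices alongside the rates is one cheap sanity check: if that list is long, the practice to CCG lookup and the prescribing data are probably out of step.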

  5. make the choropleth using folium magic python code (see ipython notebook part two)

    import folium

    # ccg_geo is the path to the CCG boundaries GeoJSON and para_analysis is
    # the dataframe prepared above (both come from the notebook).
    # geo_json() and create_map() are the method names in the folium version
    # used here; newer folium releases use different names.
    map = folium.Map(location=[54.2, -2.45], zoom_start=5)
    map.geo_json(geo_path=ccg_geo, data_out='data10.json', data=para_analysis,
       columns=['CCG13CD', 'Per_person_para_by_ccg'],
       key_on='feature.properties.CCG13CD',
       threshold_scale=[5, 6, 7, 8, 9, 10],
       fill_color='PuBu', fill_opacity=0.7, line_opacity=0.3,
       legend_name='Number of paracetamol items prescribed in April per 10,000 population by CCG')
    map.create_map(path='map_10.html')

  6. kinda need some data on liver deaths now...
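
Once liver death data by CCG turns up, the association mentioned at the start could be checked along these lines. This is only a sketch: the `liver_deaths_per_10k` and `para_items_per_10k` column names, the CCG codes and all the numbers are made-up placeholders, and a correlation across CCGs would say nothing about cause on its own.

```python
import pandas as pd

# Placeholder inputs: one row per CCG in each dataframe. All values are
# invented purely so the example runs; they are not real rates.
para = pd.DataFrame({
    "CCG13CD": ["CCG_A", "CCG_B", "CCG_C", "CCG_D", "CCG_E"],
    "para_items_per_10k": [5.2, 6.8, 7.5, 9.1, 8.0],
})
deaths = pd.DataFrame({
    "CCG13CD": ["CCG_A", "CCG_B", "CCG_C", "CCG_D"],
    "liver_deaths_per_10k": [1.1, 0.9, 1.4, 1.6],
})


def association(para, deaths):
    # The inner join keeps only CCGs present in both tables; validate=
    # raises if either table has more than one row for a CCG.
    merged = para.merge(deaths, on="CCG13CD", how="inner",
                        validate="one_to_one")
    # Series.corr defaults to the Pearson correlation coefficient.
    r = merged["para_items_per_10k"].corr(merged["liver_deaths_per_10k"])
    return merged, r


merged, r = association(para, deaths)
print(len(merged), "CCGs matched, Pearson r =", round(r, 3))
```

The merged dataframe keeps the `CCG13CD` column, so the same table could feed a second choropleth of liver deaths next to the paracetamol one.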
