@ckrapu
Created April 8, 2021 04:27
Preprocess California ozone measurements from EPA Air Quality Portal
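# Run this as a notebook so that the ! shell escape works
import pandas as pd

# Download the 2020 daily ozone (parameter code 44201) summary from EPA AirData
! wget https://aqs.epa.gov/aqsweb/airdata/daily_44201_2020.zip

# pandas can read a zip archive holding a single CSV directly, so no unzip step is needed
filepath = './daily_44201_2020.zip'
df = pd.read_csv(filepath)

# State code 6 is California; copy the slice to avoid SettingWithCopyWarning
state_code = 6
subset_df = df[df['State Code'] == state_code].copy()
subset_df['timestamp'] = pd.to_datetime(subset_df['Date Local'])
# Identify each monitoring site by its coordinates
subset_df['lat_long'] = subset_df['Latitude'].astype(str) + '/' + subset_df['Longitude'].astype(str)

# Rows are dates, columns are sites; duplicate site/date readings are averaged
pivot_df = pd.pivot_table(subset_df[['lat_long', 'Arithmetic Mean', 'Date Local']],
                          values='Arithmetic Mean',
                          index='lat_long', columns='Date Local').T

# Keep the first 150 dates and only the sites with no missing values in that window
no_missing = pivot_df.iloc[0:150].dropna(how='any', axis=1)
no_missing.to_csv('./california_ozone_2020.csv')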
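
The pivot-and-filter step can be checked on a small hand-made frame before running it on the full EPA file. The sketch below is only an illustration: the helper name `complete_site_matrix` and the toy values are assumptions, not part of the original script, and only the column names are taken from it.

```python
import pandas as pd

def complete_site_matrix(df, n_days):
    """Pivot long-format readings to a dates x sites table and keep
    only the sites with no gaps in the first n_days rows.
    (Hypothetical helper, named here for illustration.)"""
    df = df.copy()
    # Same site key as the script: "latitude/longitude" as a string
    df['lat_long'] = df['Latitude'].astype(str) + '/' + df['Longitude'].astype(str)
    # pivot_table averages duplicate site/date readings by default
    wide = pd.pivot_table(df, values='Arithmetic Mean',
                          index='Date Local', columns='lat_long')
    return wide.iloc[:n_days].dropna(how='any', axis=1)

# Toy data: the first site has two readings on 2020-01-01,
# the second site has no reading on 2020-01-02.
toy = pd.DataFrame({
    'Date Local': ['2020-01-01', '2020-01-02', '2020-01-01', '2020-01-01'],
    'Latitude': [34.0, 34.0, 36.5, 34.0],
    'Longitude': [-118.0, -118.0, -121.0, -118.0],
    'Arithmetic Mean': [0.02, 0.03, 0.04, 0.04],
})

result = complete_site_matrix(toy, n_days=2)
print(result)
```

On this toy input the site with the missing day is dropped and the duplicated readings are averaged, which is the same behavior the script relies on when it builds `no_missing`.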