@rafapereirabr
Created July 17, 2020 12:03
Build balanced panel data set
```
################# 4) Build a balanced panel data set ------------------------------
# One observation per area per day, even when an area had no notifications on that day
library(data.table)
library(dplyr)   # left_join(), group_by(), %>%
library(tidyr)   # fill()

# get all dates and munis
all_munis <- unique(df$code_muni)
all_dates <- seq(min(df$DT_NOTIFIC),
                 max(df$DT_NOTIFIC),
                 by = "day")
# all possible combinations (the product below is the expected number of rows in the panel)
length(all_munis) * length(all_dates)
base <- expand.grid(all_munis, all_dates) %>% as.data.table()
names(base) <- c('code_muni', 'DT_NOTIFIC')
# merge
panel <- left_join( base, df, by=c("code_muni", "DT_NOTIFIC")) %>% setDT()
# replace missing with 0 in selected columns [only do this for columns related to number of cases and deaths]
cols <- c("sari_cases", "covid19_cases", "deaths")
panel[ , (cols) := lapply(.SD, nafill, fill=0), .SDcols = cols]
# replace other NAs with the last value by muni (leading NAs take the next available value)
# data.table::setnafill(panel, type = "locf") # currently only works with numeric columns
panel <- panel %>%
  group_by(code_muni) %>%
  tidyr::fill(dplyr::everything(), .direction = "downup") %>%
  ungroup()
```
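To see what the steps above produce, here is a minimal sketch on toy data. The values in `df` and the `name_muni` column are invented for illustration and are not part of the original data; only the column names `code_muni`, `DT_NOTIFIC` and `covid19_cases` come from the snippet above. It uses `merge()` in place of `left_join()` so the result comes back sorted by area and date.

```
library(data.table)
library(dplyr)
library(tidyr)

# Toy input (invented): two areas, each notifying only on some days
df <- data.table(
  code_muni     = c(1L, 1L, 2L),
  DT_NOTIFIC    = as.Date(c("2020-03-01", "2020-03-03", "2020-03-02")),
  name_muni     = c("A", "A", "B"),      # hypothetical non-numeric column
  covid19_cases = c(5, 2, 1)
)

# Every area x day combination in the observed date range
all_dates <- seq(min(df$DT_NOTIFIC), max(df$DT_NOTIFIC), by = "day")
base <- as.data.table(expand.grid(code_muni  = unique(df$code_muni),
                                  DT_NOTIFIC = all_dates))

# Keep every row of base; area/days without a notification get NA
panel <- merge(base, df, by = c("code_muni", "DT_NOTIFIC"), all.x = TRUE)

# Counts: no notification means zero cases
setnafill(panel, type = "const", fill = 0, cols = "covid19_cases")

# Other columns: carry the last value forward, then the next value backward, within each area
panel <- panel %>%
  group_by(code_muni) %>%
  tidyr::fill(name_muni, .direction = "downup") %>%
  ungroup()

print(panel)
```

The panel is balanced when its row count equals the number of areas times the number of days, which is worth checking right after the merge on real data.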