Skip to content

Instantly share code, notes, and snippets.

@mschnetzer
Last active December 31, 2019 10:33
Show Gist options
  • Save mschnetzer/07fc5a970e87cd5f1a5ee3ba24023286 to your computer and use it in GitHub Desktop.
Save mschnetzer/07fc5a970e87cd5f1a5ee3ba24023286 to your computer and use it in GitHub Desktop.
Current and estimated position in the net wealth distribution (https://twitter.com/matschnetzer/status/1090954828631732224)
library(tidyverse)
library(survey)
library(msthemes)
library(gganimate)
# Load HFCS and Non-core data
load("hfcs2014AT.rda")
load("Noncore AT/non-core-at.rda")
# Calculate estimated decile with mean from multiple imputation data
estdec <- hnc %>% group_by(sa0010) %>% summarise(estdec = round(mean(aha0500),0))
# Calculate weighted decile breaks for net wealth (dn3001) from multiple imputation data
breaks <- with(hfcs2014hat,svyquantile(~dn3001,quantile=seq(0,1,0.1),na.rm=T))
dfbreaks <- do.call(rbind.data.frame, breaks) %>% mutate(`0`=min(`0`)-1,`1`=max(`1`)+1)
groups <- dfbreaks %>% summarise_all(mean)
hfcs2014hat <- update(hfcs2014hat, wgroup=cut(dn3001,groups,labels=1:10))
# RBind 5 imputations
hfcs2014 <- bind_rows(hfcs2014hat[[1]][[1]]$variables,hfcs2014hat[[1]][[2]]$variables,hfcs2014hat[[1]][[3]]$variables,hfcs2014hat[[1]][[4]]$variables,hfcs2014hat[[1]][[5]]$variables)
# Assign households to wealth decile
realdec <- hfcs2014 %>% group_by(sa0010) %>% summarise(realdec = round(mean(as.numeric(wgroup)),0), weight=mean(hw0010))
findat <- left_join(realdec,estdec)
sumhh <- findat %>% summarise(sum(weight)) %>% pull()
# Reshape dataset for plotting
plotdat <- findat %>% spread(estdec, weight) %>% group_by(realdec) %>% summarise_at(vars(-realdec), funs(sum(.,na.rm=T))) %>% mutate_at(vars(-realdec),funs(./sumhh*100)) %>% select(-sa0010)
plotdat <- plotdat %>% mutate_at(vars(-realdec),funs(round(.*10,0))) %>% gather(estdec,obs,`1`:`10`)
plotdat <- plotdat[rep(row.names(plotdat), plotdat$obs), 1:2]
plotdat$id = 1:nrow(plotdat)
plotdat$color = factor(plotdat$realdec)
plotdat <- plotdat %>% gather(state,decile,estdec:realdec)
plotdat$decile <- as.numeric(plotdat$decile)
plotdat$state <- as.numeric(factor(plotdat$state, levels = c("realdec","estdec")))
plotdat$count = rnorm(nrow(plotdat),1,0.03)
labdf <- tibble(state=c(1,2),
label=c("Actual position","Estimated position"))
plot <- plotdat %>% ggplot(aes(x=decile,y=count,color=color)) +
geom_point(position = position_jitter(width = 0.2, height = 0.2), size=2) +
scale_color_viridis_d() +
scale_x_continuous(breaks=1:10, minor_breaks = seq(0.5,10.5,1)) +
theme_ms() + theme(legend.position = "none", axis.text.y = element_blank(), axis.title.x = element_text(hjust=0), panel.grid.minor.x = element_line(size=0.3,color="gray"), panel.grid.major = element_blank(), plot.caption = element_text(size=10)) +
labs(x=expression("Net wealth decile (poor" %->% "rich)"), y="", title="Rich people think they are middle class", subtitle ="Actual and estimated position in the net wealth distribution in Austria", caption="Data: HFCS 2014, OeNB. Figure: @matschnetzer") +
geom_label(aes(x=10,y=1.3,label=label,hjust=0.8),color=msc_palette[1],data=labdf, size=4) +
transition_states(state,
transition_length = 15,
state_length = 20)
anim <- animate(plot, height=4, width=8, nframes = 100, res=300, unit = "in")
anim_save("hfcs.gif", anim)
@mschnetzer
Copy link
Author

Unfortunately, HFCS data is not as easily publicly available on a webpage as most datasets in the US are. HFCS is a survey coordinated by the European Central Bank and conducted by the national central banks. The dataset is only available for all Eurozone countries together (data application: https://www.ecb.europa.eu/pub/economic-research/research-networks/html/researcher_hfcn.en.html). However, there are national non-core datasets available at the national central banks (https://www.hfcs.at/en/datennutzung.html) including the question on self-positioning for Austria.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment