@ashander
Last active January 27, 2017 07:41
Equity distribution (from Survey of Consumer Finances)

Replicating Matt Bruenig's analysis. To repeat the analysis below, run make in a terminal; alternatively, download and unzip the .dta file from the URL in the Makefile, then open README.Rmd in RStudio and render it.

draw <- as_data_frame(read_stata('rscfp2013.dta'))

set.seed(111)
d <- sample_n(draw, nrow(draw) * 10, replace=TRUE, weight=wgt)
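
The resampling step above draws rows with probability proportional to the survey weight, so the resample can then be treated as unweighted. A minimal base-R sketch of that idea follows; the toy data frame and its values are hypothetical and are not taken from the SCF file. It compares a directly weighted share with the share computed from a weighted resample.

```r
# Toy data (hypothetical values, not from the SCF): four households with
# survey weights and equity holdings.
toy <- data.frame(
  equity = c(0, 0, 100, 900),
  wgt    = c(4, 3, 2, 1)
)

# Direct weighted estimate of the share of all equity held by the
# household with the largest holding (row 4).
direct_share <- with(toy, wgt[4] * equity[4] / sum(wgt * equity))

# Resampling estimate: draw row indices with probability proportional to
# the weight, then treat the resample as an unweighted data set.
set.seed(111)
idx <- sample(nrow(toy), size = 10000, replace = TRUE, prob = toy$wgt)
resampled <- toy[idx, ]
resampled_share <- sum(resampled$equity[resampled$equity == 900]) /
  sum(resampled$equity)

print(c(direct = direct_share, resampled = resampled_share))
```

The two numbers should be close but not identical, which is the kind of resampling error to expect when comparing results from this approach with a directly weighted calculation.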

breaks_d <- 0:10 * 0.1
d %>% select(equity) %>%
    mutate(raw_quantile=cume_dist(equity), total_equity=sum(equity)) %>%
    mutate(`people who own more equity than __% of the population`=cut(raw_quantile,
                include.lowest=TRUE,
                breaks=breaks_d,
                labels=paste0(100 * breaks_d[-length(breaks_d)], "%"))) %>%
    group_by(`people who own more equity than __% of the population`) %>%
        summarize(`percentage owned (of all equity)`=
          round(sum(equity) / unique(total_equity) * 100, 2))

## # A tibble: 5 x 2
##   people who own more equity than __% of the population percentage owned (of all equity)
##                                                  <fctr>                            <dbl>
## 1                                                   50%                             0.10
## 2                                                   60%                             0.79
## 3                                                   70%                             2.82
## 4                                                   80%                             9.63
## 5                                                   90%                            86.65
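
The table above has only five rows, presumably because households with identical equity (in particular, zero) share a single cumulative-distribution value and therefore fall into one bin. The sketch below illustrates this on hypothetical toy values using base R only: `ecdf(x)(x)` is the proportion of values less than or equal to each value, which matches how `cume_dist()` is defined.

```r
# Toy equity values (hypothetical): four of seven households hold no equity.
equity <- c(0, 0, 0, 0, 10, 40, 150)

# Proportion of values <= each value; tied values share one result.
raw_quantile <- ecdf(equity)(equity)

breaks_d <- 0:10 * 0.1
bins <- cut(raw_quantile,
            include.lowest = TRUE,
            breaks = breaks_d,
            labels = paste0(100 * breaks_d[-length(breaks_d)], "%"))

# Percentage of total equity per bin; bins with no households are dropped.
share <- 100 * tapply(equity, droplevels(bins), sum) / sum(equity)
print(round(share, 2))
```

Here the four zero-equity households all receive the cumulative value 4/7 and land in the "50%" bin, so the bins from "0%" to "40%" never appear in the output.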

How do things look by age?

# from: https://www.federalreserve.gov/econresdata/scf/files/bulletin.macro.txt
#*   age of the household head, and categorical variable:
#    1:<35, 2:35-44, 3:45-54, 4:55-64, 5:65-74, 6:>=75;
#    AGE=X14;
#        AGECL=1+(AGE GE 35)+(AGE GE 45)+(AGE GE 55)+(AGE GE 65)+(AGE GE 75);
d$age_cl <- recode_factor(factor(d$agecl), `1`="<35",`2`="35-44",
          `3`="45-54", `4`="55-64", `5`="65-74", `6`=">=75")
d_eqd_age <-  d %>% select(equity, age_cl) %>%
    mutate(raw_quantile=cume_dist(equity), total_equity=sum(equity)) %>%
    mutate(`people who own more equity than __% of the population`=cut(raw_quantile,
                include.lowest=TRUE,
                breaks=breaks_d,
                labels=paste0(100 * breaks_d[-length(breaks_d)], "%"))) %>%
    group_by(age_cl, `people who own more equity than __% of the population`) %>%
        summarize(`percentage owned (of all equity)`=
          round(sum(equity) / unique(total_equity) * 100, 2))
ggplot(d_eqd_age) +
    geom_bar(aes(`people who own more equity than __% of the population`,
             `percentage owned (of all equity)`, fill=age_cl),
         stat='identity') +
    ylim(0, 100) +
    guides(fill=guide_legend("Age")) +
    labs(
         title='Who owns the stock market?',
         caption='Data source: https://www.federalreserve.gov/econresdata/scf/\nMethods: https://gist.github.com/ashander/407997acb06eb68fceb3cb68fc9de438\nBuilding on: https://medium.com/@MattBruenig/who-gains-from-dow-20-000-ba07555e5f12'
         )

Distribution of equity owned by each 10%-ile bin broken into age classes.
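
The AGECL definition quoted from the Federal Reserve macro adds 1 for every age threshold the household head has reached (SAS `GE` means "greater than or equal"). A small sketch of the same arithmetic in base R, using hypothetical ages picked to hit each class boundary, might look like this; `factor()` with `labels` is used here as a stand-in for the `recode_factor()` call.

```r
# Hypothetical ages of household heads, chosen to hit each class boundary.
age <- c(22, 34, 35, 44, 45, 54, 55, 64, 65, 74, 75, 90)

# Direct translation of the SAS expression: each comparison adds 1 when true.
agecl <- 1 + (age >= 35) + (age >= 45) + (age >= 55) + (age >= 65) + (age >= 75)

# Same labels as in the analysis, attached with base R's factor().
age_labels <- c("<35", "35-44", "45-54", "55-64", "65-74", ">=75")
age_cl <- factor(agecl, levels = 1:6, labels = age_labels)

print(data.frame(age, agecl, age_cl))
```

Because the levels are given in order, the legend in a plot filled by this factor would run from the youngest to the oldest class.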

How do things look with 5-percentile bins?

breaks_v <- c(0, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, .35, .4,
          .45, .5, .55, .6, .65, .7, .75,  .8, .85,  .9, .95, 1)
d_eqv <- d %>% select(equity, age_cl) %>%
    mutate(raw_quantile=cume_dist(equity), total_equity=sum(equity)) %>%
    mutate(`people who own more equity than __% of the population`=cut(raw_quantile,
                include.lowest=TRUE,
                breaks=breaks_v,
                labels=paste0(100 * breaks_v[-length(breaks_v)], "%"))) %>%
    group_by(`people who own more equity than __% of the population`, age_cl) %>%
        summarize(`percentage owned (of all equity)`=
          round(sum(equity) / unique(total_equity) * 100, 2))

ggplot(d_eqv) +
    geom_bar(aes(`people who own more equity than __% of the population`,
             `percentage owned (of all equity)`, fill=age_cl),
         stat='identity') +
    ylim(0, 100) +
    guides(fill=guide_legend("Age")) +
    labs(
         title='Who owns the stock market?',
         caption='Data source: https://www.federalreserve.gov/econresdata/scf/\nMethods: https://gist.github.com/ashander/407997acb06eb68fceb3cb68fc9de438\nBuilding on: https://medium.com/@MattBruenig/who-gains-from-dow-20-000-ba07555e5f12'
         )

Distribution of equity owned by each 5%-ile bin.

---
title: "Equity ownership distribution"
output: md_document
---
```{r setup, echo=FALSE}
suppressMessages(library(haven))
suppressMessages(library(dplyr))
suppressMessages(library(ggplot2))
options(width=100)
```
Replicating [Matt Bruenig's analysis](https://medium.com/@MattBruenig/who-gains-from-dow-20-000-ba07555e5f12).
To repeat the analysis below, run `make` in a terminal; alternatively, download and unzip
the .dta file from the URL in the Makefile, then open README.Rmd in RStudio and
render it.
```{r data, cache=TRUE}
draw <- as_data_frame(read_stata('rscfp2013.dta'))
```
```{r reweight, cache=TRUE}
set.seed(111)
d <- sample_n(draw, nrow(draw) * 10, replace=TRUE, weight=wgt)
```
My result agrees with Bruenig's (up to some error introduced by resampling the data based on the weights).
```{r deciles}
breaks_d <- 0:10 * 0.1
d %>% select(equity) %>%
    mutate(raw_quantile=cume_dist(equity), total_equity=sum(equity)) %>%
    mutate(`people who own more equity than __% of the population`=cut(raw_quantile,
                include.lowest=TRUE,
                breaks=breaks_d,
                labels=paste0(100 * breaks_d[-length(breaks_d)], "%"))) %>%
    group_by(`people who own more equity than __% of the population`) %>%
        summarize(`percentage owned (of all equity)`=
          round(sum(equity) / unique(total_equity) * 100, 2))
```
How do things look by age?
```{r equity-deciles-by-age, fig.cap="Distribution of equity owned by each 10%-ile bin broken into age classes."}
# from: https://www.federalreserve.gov/econresdata/scf/files/bulletin.macro.txt
#* age of the household head, and categorical variable:
# 1:<35, 2:35-44, 3:45-54, 4:55-64, 5:65-74, 6:>=75;
# AGE=X14;
# AGECL=1+(AGE GE 35)+(AGE GE 45)+(AGE GE 55)+(AGE GE 65)+(AGE GE 75);
d$age_cl <- recode_factor(factor(d$agecl), `1`="<35",`2`="35-44",
          `3`="45-54", `4`="55-64", `5`="65-74", `6`=">=75")
d_eqd_age <- d %>% select(equity, age_cl) %>%
    mutate(raw_quantile=cume_dist(equity), total_equity=sum(equity)) %>%
    mutate(`people who own more equity than __% of the population`=cut(raw_quantile,
                include.lowest=TRUE,
                breaks=breaks_d,
                labels=paste0(100 * breaks_d[-length(breaks_d)], "%"))) %>%
    group_by(age_cl, `people who own more equity than __% of the population`) %>%
        summarize(`percentage owned (of all equity)`=
          round(sum(equity) / unique(total_equity) * 100, 2))
ggplot(d_eqd_age) +
    geom_bar(aes(`people who own more equity than __% of the population`,
             `percentage owned (of all equity)`, fill=age_cl),
         stat='identity') +
    ylim(0, 100) +
    guides(fill=guide_legend("Age")) +
    labs(
         title='Who owns the stock market?',
         caption='Data source: https://www.federalreserve.gov/econresdata/scf/\nMethods: https://gist.github.com/ashander/407997acb06eb68fceb3cb68fc9de438\nBuilding on: https://medium.com/@MattBruenig/who-gains-from-dow-20-000-ba07555e5f12'
         )
```
How do things look with 5-percentile bins?
```{r equity-vigesimals, fig.cap="Distribution of equity owned by each 5%-ile bin."}
breaks_v <- c(0, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, .35, .4,
          .45, .5, .55, .6, .65, .7, .75, .8, .85, .9, .95, 1)
d_eqv <- d %>% select(equity, age_cl) %>%
    mutate(raw_quantile=cume_dist(equity), total_equity=sum(equity)) %>%
    mutate(`people who own more equity than __% of the population`=cut(raw_quantile,
                include.lowest=TRUE,
                breaks=breaks_v,
                labels=paste0(100 * breaks_v[-length(breaks_v)], "%"))) %>%
    group_by(`people who own more equity than __% of the population`, age_cl) %>%
        summarize(`percentage owned (of all equity)`=
          round(sum(equity) / unique(total_equity) * 100, 2))
ggplot(d_eqv) +
    geom_bar(aes(`people who own more equity than __% of the population`,
             `percentage owned (of all equity)`, fill=age_cl),
         stat='identity') +
    ylim(0, 100) +
    guides(fill=guide_legend("Age")) +
    labs(
         title='Who owns the stock market?',
         caption='Data source: https://www.federalreserve.gov/econresdata/scf/\nMethods: https://gist.github.com/ashander/407997acb06eb68fceb3cb68fc9de438\nBuilding on: https://medium.com/@MattBruenig/who-gains-from-dow-20-000-ba07555e5f12'
         )
```

README.md: README.Rmd
	Rscript -e 'rmarkdown::render("README.Rmd")'
	cp README_files/figure-markdown_strict/equity-vigesimals-1.png equity-vigesimals-1.png
	cp README_files/figure-markdown_strict/equity-deciles-by-age-1.png equity-deciles-by-age-1.png

rscfp2013.dta: scfp2013s.zip
	unzip scfp2013s.zip

scfp2013s.zip:
	wget https://www.federalreserve.gov/econresdata/scf/files/scfp2013s.zip