Last active
May 5, 2022 11:19
-
-
Save klmr/23ed79f973c75b11b0b5 to your computer and use it in GitHub Desktop.
Gun ownership vs gun deaths reanalysis
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
--- | |
author: "<a href=\"http://twitter.com/klmr\">@klmr</a>" | |
title: "Gun deaths and gun ownership in the USA" | |
date: 2015-06-19 | |
output: | |
html_document: | |
theme: readable | |
--- | |
```{r echo=FALSE} | |
library(dplyr, warn.conflicts = FALSE, quietly = TRUE) | |
library(ggplot2, warn.conflicts = FALSE, quietly = TRUE) | |
library(ggthemes, warn.conflicts = FALSE, quietly = TRUE) | |
fte = theme_fivethirtyeight() %+replace% | |
theme(axis.title = element_text(), | |
axis.title.y = element_text(angle = 90), | |
rect = element_rect(fill = 'white', colour = 'white', size = 0.5, linetype = 1), | |
panel.margin.x = NULL, | |
panel.margin.y = NULL) | |
theme_set(fte) | |
scale_colour_discrete = scale_colour_fivethirtyeight | |
scale_fill_discrete = scale_fill_fivethirtyeight | |
library(XML, warn.conflicts = FALSE, quietly = TRUE) | |
options(stringsAsFactors = FALSE) | |
``` | |
On Twitter, somebody claimed that | |
<blockquote class="twitter-tweet" lang="en"><p lang="en" dir="ltr">guess what, gun ownership correlates with gun deaths in the US <a href="https://t.co/PmOMEOnMy1">https://t.co/PmOMEOnMy1</a> <a href="http://t.co/LYFtpHGeNy">pic.twitter.com/LYFtpHGeNy</a></p>— Arthur Charpentier (@freakonometrics) <a href="https://twitter.com/freakonometrics/status/611818369931722752">June 19, 2015</a></blockquote> | |
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script> | |
But looking at the data, I wasn’t convinced: | |
<blockquote class="twitter-tweet" data-conversation="none" lang="en"><p lang="en" dir="ltr"><a href="https://twitter.com/freakonometrics">@freakonometrics</a> <a href="https://twitter.com/neilhall_uk">@neilhall_uk</a> Uhm, no. Clearly two clusters, and next to no correlation within each cluster.</p>— Konrad Rudolph (@klmr) <a href="https://twitter.com/klmr/status/611826272881233920">June 19, 2015</a></blockquote> | |
I decided to actually look at the data and verify that the “clusters” with Gun | |
ownership < 20% and > 20%, respectively, were indeed internally uncorrelated. | |
## Plotting the data | |
As a first step, I retrieve the data and put it into a usable format. The first | |
table answers the question, | |
> Are any firearms now kept in or around your home? Include those kept in a | |
> garage, outdoor storage area, car, truck, or other motor vehicle? | |
```{r} | |
gun_ownership_url = 'http://www.washingtonpost.com/wp-srv/health/interactives/guns/ownership.html' | |
gun_ownership = readHTMLTable(gun_ownership_url, header = TRUE, which = 1) | |
gun_ownership = gun_ownership[-1, ] | |
parse_num = function (x) as.numeric(sub(',', '', x)) | |
gun_ownership = select(gun_ownership, State = 1, Total = 2, Yes = 3, | |
`Yes %` = 4, No = 5, `No %` = 6) %>% | |
mutate_each(funs(parse_num), -State) | |
head(gun_ownership) | |
``` | |
The “\*” annotation next to a state name indicates that the state possesses | |
child access prevention laws, according to the website. Let’s annotate this | |
properly. | |
```{r} | |
gun_ownership = gun_ownership %>% | |
mutate(`Child access prevention` = grepl('\\*$', State), | |
State = sub('\\*$', '', State)) | |
``` | |
DC is called “The District”. Let’s fix that. | |
```{r} | |
gun_ownership[gun_ownership$State == 'The District', 'State'] = 'District of Columbia' | |
``` | |
Next, here’s the firearms deaths data. | |
```{r} | |
# Website actively prevents scraping, but allows downloading data. | |
#gun_deaths_url = 'http://kff.org/other/state-indicator/firearms-death-rate-per-100000/' | |
#gun_deaths = readHTMLTable(gun_deaths_url) | |
# Use manually downloaded CSV output. | |
gun_deaths = read.csv('raw_data.csv', skip = 3) %>% | |
select(State = 1, `Deaths per 100000` = 2) | |
head(gun_deaths) | |
``` | |
Now merge the data. | |
```{r} | |
data = inner_join(gun_ownership, gun_deaths, by = 'State') %>% | |
select(State, | |
`Gun ownership (%)` = `Yes %`, | |
`Child access prevention`, | |
`Deaths per 100000`) | |
head(data) | |
``` | |
Plot the data to verify that it’s roughly the same as that published. | |
```{r} | |
plot = ggplot(data, aes(x = `Gun ownership (%)`, y = `Deaths per 100000`)) + | |
geom_point() | |
plot | |
``` | |
The general shape is certainly correct. A few states however have very | |
different positions. Let’s add state names. | |
```{r} | |
state_names_url = 'http://www.infoplease.com/ipa/A0110468.html' | |
state_names = readHTMLTable(state_names_url, header = TRUE, which = 1) %>% | |
{.[-1, ]} %>% | |
select(State = 1, Code = 3) | |
state_names[state_names$State == 'Dist. of Columbia', 'State'] = 'District of Columbia' | |
data = inner_join(data, state_names, by = 'State') | |
plot = plot %+% data | |
plot + geom_text(aes(x = `Gun ownership (%)` + 2, label = Code), size = 3) | |
``` | |
This shows the correspondence better. The only clearly moved state is “Arizona” | |
(AZ). Let’s look at the correlation now. | |
```{r} | |
model = `Deaths per 100000` ~ `Gun ownership (%)` | |
fit = lm(model, data) | |
p = function (fit) format.pval(anova(fit)$`Pr(>F)`[1], digits = 10) | |
plot + geom_smooth(method = lm) + | |
annotate('text', x = 40, y = 3, | |
label = sprintf('R^2 == %0.2f', summary(fit)$r.squared), parse = TRUE) | |
``` | |
So, a clear correlation at $P$ = `r p(fit)` significance. | |
## Clusters | |
Now let’s examine these clusters. I’m using a gun ownership cut-off of 20% | |
because that’s the clusters I initially perceived: | |
```{r} | |
plot + aes(color = `Gun ownership (%)` < 20) | |
``` | |
Here I’ll first look at their correlation and then try to explain the clusters. | |
```{r} | |
data = data %>% | |
mutate(`Bottom left` = `Gun ownership (%)` < 20) | |
bottom_left_fit = lm(model, filter(data, `Bottom left`)) | |
top_right_fit = lm(model, filter(data, ! `Bottom left`)) | |
``` | |
These correlate much less clearly (bottom left: $R^2$ = | |
`r summary(bottom_left_fit)$r.squared`, top right: $R^2$ = | |
`r summary(top_right_fit)$r.squared`) but the correlation in | |
the top right cluster is still significant: $P$ = `r p(top_right_fit)` (the | |
bottom left cluster isn’t, $P$ = `r p(bottom_left_fit)`). | |
Let’s look at the clusters on a map: | |
```{r fig.width=10} | |
us_map = map_data('state') | |
map = ggplot(data) + | |
geom_map(aes(map_id = tolower(State)), map = us_map) + | |
expand_limits(x = us_map$long, y = us_map$lat) + | |
coord_fixed() | |
map + aes(fill = `Gun ownership (%)` < 20) | |
``` | |
… all states in the Northeast. In fact, if you don’t like guns, or don’t want | |
to get murdered, simply don’t live in the Midwest: | |
```{r fig.width=10} | |
map + aes(fill = `Gun ownership (%)`) | |
map + aes(fill = `Deaths per 100000`) | |
``` | |
## Child access prevention laws | |
Finally let’s look at the states with child access prevention laws. | |
```{r} | |
plot + geom_point(aes(color = `Child access prevention`)) | |
``` | |
… unsurprisingly, those are the states with low gun ownership, and fewer | |
deaths. Unfortunately this once again doesn’t tell us whether these laws | |
actually contribute to the lower number of guns (the exact opposite is in fact | |
likely: states with a previously lower number of guns per capita have less | |
incentive to oppose gun control laws, because they are less vested in gun | |
rights). | |
## What does this tell us? | |
Not much. The correlation between gun possession and gun deaths is real, and | |
even when excluding the exceptionally sane Northeast states, the correlation | |
persists (contrary to what I had assumed). What remains unclear is the cause. | |
Whenever a similar plot gets posted online, the implicit claim is that there’s | |
a *causal* relationship. But the correlation shows nothing of the sort. | |
A more comprehensive look at the evidence can be found on [Skeptics Stack | |
Exchange](http://skeptics.stackexchange.com/q/976/82). The discussion there | |
explains some of the problems with merely looking at such correlations. The | |
studies linked there show one thing very clearly: **there is no consensus**. | |
And if there’s any effect at all, [it will be small in comparison to other | |
factors such as poverty level | |
etc.](http://www.jstor.org/stable/3487348?seq=1#page_scan_tab_contents) | |
This isn’t to say that (much) stricter gun control laws aren’t desirable, nor | |
that they wouldn’t be effective. However, they are only a very small part of | |
the story, and cross-state correlation is not good evidence to show their | |
effectiveness. | |
In the end, I think a more compelling argument in favour of gun control laws | |
is the following consideration: would you rather live in a society where | |
citizens (up to 60%!) place so little trust in the system that they fully | |
expect to have to defend themselves using extreme violence, or would you | |
rather live in a society that is peaceful and civilised so that citizens can | |
live with the expectation that risks to their lives are negligible and they do | |
not constantly have to fear bodily harm? For me, the answer is clear. | |
--- | |
[Source as a Gist](https://gist.github.com/23ed79f973c75b11b0b5) |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Can I use this script for my product page of CCI 500 primers for sale which is hosted on WordPress?