Last active
September 30, 2016 20:20
---
title: "Basketball Size"
output: html_notebook
---
The GIF is up at <http://imgur.com/dobmMWM>.
This is a copy of this NFL visualisation: <http://noahveltman.com/nflplayers/>.
I couldn't find the R code to recreate it, but I could find data for basketball at <https://github.com/simonwarchol/NBA-Height-Weight>.
First get Simon Warchol's data and stitch his CSVs together.
```{python}
# Stick together all the yearly CSVs and add the year as an extra column.
import csv
import os

# Folder holding one CSV per season; the file name (minus ".csv") is used as the year.
directory = "basketball/simonwarchol-NBA-Height-Weight-7871d8b/CSVs/Yearly"

with open("ttest.csv", "wt", newline="") as ofile:
    # Tab-delimited, to match sep="\t" in the read.csv call that loads this file.
    writer = csv.writer(ofile, delimiter="\t", quotechar='"', quoting=csv.QUOTE_ALL)
    writer.writerow(["Name", "Height", "HeightFI", "Weight", "Year"])
    for root, dirs, files in os.walk(directory):
        for file in files:
            if file.endswith(".csv"):
                with open(os.path.join(root, file), "rt", newline="") as ifile:
                    reader = csv.reader(ifile)
                    next(reader, None)  # skip each file's own header row
                    for row in reader:
                        # Append the year taken from the file name.
                        writer.writerow(row + [os.path.splitext(file)[0]])
```
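Before moving to R it can help to check that every season made it into the combined file. The sketch below counts rows per year. The function name `rows_per_year` is made up for illustration, and it assumes the file is tab-delimited with a `Year` column, as the `read.csv` call expects.

```python
import csv
from collections import Counter


def rows_per_year(path, delimiter="\t"):
    """Count data rows per Year in the combined file (hypothetical helper)."""
    with open(path, "rt", newline="") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        return Counter(row["Year"] for row in reader)


# Example usage, once ttest.csv has been written:
# for year, count in sorted(rows_per_year("ttest.csv").items()):
#     print(year, count)
```

A season with a count of zero, or a year label that is not a number, would point to a problem with the file names in the `Yearly` folder.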
Now load the data into R.
```{r}
# The combined file is tab-delimited with a header row.
mydata <- read.csv("ttest.csv", header = TRUE, sep = "\t")
head(mydata)
```
The output should look like this:

    head(mydata)
             Name Height HeightFI Weight Year
    1 Don Anielak     79      6-7    190 1955
    2 Paul Arizin     76      6-4    190 1955
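In the two sample rows, `Height` matches `HeightFI` converted to inches (6-7 is 79, 6-4 is 76). A small sketch of that conversion, assuming every `HeightFI` value uses the same `feet-inches` format; the helper name is invented for illustration.

```python
def feet_inches_to_inches(text):
    """Convert a 'feet-inches' string such as '6-7' to total inches."""
    feet, inches = text.split("-")
    return int(feet) * 12 + int(inches)


print(feet_inches_to_inches("6-7"))  # 79, matching the Height column above
```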
Now we need only the height, weight and year, with weight rounded into 10 lb buckets.
```{r}
library(dplyr)
# Keep only the columns used in the plot.
mydata <- dplyr::select(mydata, Weight, Height, Year)
# Round weight into 10 lb buckets.
mydata$Weight2 <- as.integer(round((mydata$Weight - 4) / 10) * 10)
sizePer <- mydata %>%
  group_by(Weight2, Height, Year) %>%
  mutate(countT = n()) %>%  # players in this weight/height cell that year
  group_by(Year) %>%
  mutate(countY = n()) %>%  # all players that year
  mutate(per = (countT / countY) * 100)
# Bin the percentages for the colour scale: (-1,0] is "0", (0,1] is "1", ..., (4,Inf] is "5+".
sizePer$bin <- cut(sizePer$per, breaks = c(-1:4, Inf), labels = c(as.character(0:4), "5+"))
```
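To make the arithmetic in that chunk concrete, here is a rough Python sketch of the same steps on a handful of made-up players. The function names and the sample tuples are illustrative only. Python's `round` and R's `round` both round halves to even, so the buckets should agree, though this has not been checked against the real data.

```python
import math
from collections import Counter


def weight_bucket(weight):
    """Mirror round((Weight - 4) / 10) * 10 from the R chunk."""
    return int(round((weight - 4) / 10) * 10)


def size_percentages(players):
    """players: iterable of (weight, height, year) tuples.

    Returns {(bucket, height, year): percent of that year's players}.
    """
    cells = Counter()
    per_year = Counter()
    for weight, height, year in players:
        cells[(weight_bucket(weight), height, year)] += 1
        per_year[year] += 1
    return {key: 100.0 * n / per_year[key[2]] for key, n in cells.items()}


def bin_label(per):
    """Mirror cut(per, breaks=c(-1:4, Inf)) with labels "0" to "4" and "5+"."""
    return "5+" if per > 4 else str(max(0, math.ceil(per)))


# Four invented players in one season.
sample = [(190, 79, 1955), (190, 76, 1955), (194, 79, 1955), (220, 82, 1955)]
for cell, per in sorted(size_percentages(sample).items()):
    print(cell, per, bin_label(per))
```

Note that a label such as "1" covers percentages above 0 and up to 1, because `cut` uses right-closed intervals by default.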
Now make an animated picture of this.
```{r}
library(ggplot2)
library(dplyr)
library(animation)
# Draw one heatmap frame per season and stitch the frames into a GIF.
saveGIF({
  for (i in 1955:2014) {
    print(ggplot(sizePer %>% filter(Year == i),
                 aes(x = Weight2, y = Height, fill = bin)) +
      geom_tile(color = "white", size = 0.1) +
      theme_bw() +
      theme(legend.position = "top", plot.title = element_text(size = 30, face = "bold")) +
      coord_cartesian(xlim = c(130, 330), ylim = c(63, 91)) +
      scale_fill_manual("%", values = c("#fee5d9", "#fcbba1", "#fc9272", "#fb6a4a", "#de2d26", "#a50f15"), drop = FALSE) +
      annotate(x = 320, y = 63, geom = "text", label = i, size = 9) +  # year label
      annotate(x = 130, y = 30, geom = "text", label = "@iamreddave", size = 3) +
      ylab("Height (inches)") +
      xlab("Weight (lbs)") +
      scale_x_continuous(breaks = seq(130, 330, by = 20)) +
      scale_y_continuous(breaks = seq(63, 91, by = 1)) +
      ggtitle("NBA players: Height and Weight over time")
    )
  }
}, interval = 0.25, ani.width = 900, ani.height = 600)
```
I couldn't find year, height and weight data for Premier League football, rugby or sumo <http://fivethirtyeight.com/features/the-sumo-matchup-centuries-in-the-making/>, or baby height and weight data over the years. If you can, or if you find a similar cool dataset, please let me know.