@mwfrost
Created July 26, 2011 13:30
Graph runs per player with share of team's hits
library(ggplot2)
library(plyr)
library(reshape)
library(grid)  # arrow() and unit() used in the plot below come from grid
# Batting.csv from http://baseball1.com/files/database/lahman58-csv.zip
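bb <- read.csv("Batting.csv")
# Sort each team-season by runs so the cumulative sum below stacks players from fewest to most runs
bb <- bb[order(bb$yearID, bb$teamID, bb$R), ]
# Per team and year: total team hits, and hits by teammates sorted ahead of each player
bb <- ddply(bb, c('teamID', 'yearID'), transform,
            team_hits_total = sum(H),
            team_hits_below = cumsum(H) - H)
# Each player occupies the band [floor, roof] of the team's cumulative hit share
bb$hit_percentile_floor <- bb$team_hits_below / bb$team_hits_total
bb$hit_percentile_roof <- (bb$team_hits_below + bb$H) / bb$team_hits_total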
# Sample graphs of a single year
# Weight class
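p <- ggplot(subset(bb, yearID == 2005), aes(R, hit_percentile_floor))
p +
  # One vertical arrow per player: x = runs, spanning the player's share of team hits
  geom_segment(aes(xend = R, yend = hit_percentile_roof),
               arrow = arrow(length = unit(0.1, "cm"))) +
  # labels = scales::percent formats the axis as percentages (ggplot2 0.9.0 and later)
  scale_y_continuous('Player Percentile Rank and Share of Team Hits', labels = scales::percent) +
  scale_x_continuous('Runs') +
  facet_wrap(~teamID)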
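The floor and roof columns are easier to check on a small table than on the full Batting.csv. The sketch below repeats the same sort-then-cumulative-sum steps in base R on made-up numbers; the `toy` data frame, its team codes, and the use of `ave()` in place of `ddply()` are assumptions for illustration, not part of the original script.

```r
# Toy data: invented values, not taken from the Lahman database
toy <- data.frame(
  teamID = c("AAA", "AAA", "AAA", "BBB", "BBB"),
  yearID = 2005,
  R      = c(10, 50, 30, 5, 20),
  H      = c(20, 60, 20, 10, 30)
)

# Sort by runs within each team-season, as the script does
toy <- toy[order(toy$yearID, toy$teamID, toy$R), ]

# Group key for team-season; ave() applies FUN within each group
grp <- interaction(toy$teamID, toy$yearID, drop = TRUE)
toy$team_hits_total <- ave(toy$H, grp, FUN = sum)
toy$team_hits_below <- ave(toy$H, grp, FUN = cumsum) - toy$H

# Each player's band of the team's cumulative hit share
toy$hit_percentile_floor <- toy$team_hits_below / toy$team_hits_total
toy$hit_percentile_roof  <- (toy$team_hits_below + toy$H) / toy$team_hits_total

print(toy[, c("teamID", "R", "H", "hit_percentile_floor", "hit_percentile_roof")])
```

Within each team the bands tile the interval from 0 to 1 without gaps: one player's roof is the next player's floor, and the player with the most runs ends at 1.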