Jim Albert bayesball

@bayesball
bayesball / WSGame7.csv
Created April 6, 2017 16:18
Win probability data for Game 7 of the 2016 World Series
Pitcher,Player,Inn.,Outs,Base,Score,Play,LI,RE,WE,WPA,RE24
C Kluber,D Fowler,1,0,___,0-1,Dexter Fowler homered (Fly).,0.87,0.48,39.80%,0.102,1
C Kluber,K Schwarber,1,0,___,0-1,Kyle Schwarber singled to shortstop (Grounder).,0.79,0.48,36.70%,0.032,0.37
C Kluber,K Bryant,1,0,1__,0-1,Kris Bryant flied out to right (Fly).,1.3,0.85,39.60%,-0.029,-0.35
C Kluber,A Rizzo,1,1,1__,0-1,Anthony Rizzo flied out to center (Fliner (Fly)).,1.05,0.5,42.10%,-0.025,-0.28
C Kluber,K Schwarber,1,2,1__,0-1,Kyle Schwarber advanced on a stolen base to 2B.,0.72,0.22,41.20%,0.009,0.09
C Kluber,B Zobrist,1,2,_2_,0-1,Ben Zobrist flied out to right (Fly).,1.04,0.31,44.10%,-0.029,-0.31
K Hendricks,C Santana,1,0,___,0-1,Carlos Santana flied out to right (Fliner (Liner)).,0.92,0.48,41.70%,-0.023,-0.22
K Hendricks,J Kipnis,1,1,___,0-1,Jason Kipnis struck out swinging.,0.65,0.25,40.10%,-0.016,-0.15
K Hendricks,F Lindor,1,2,___,0-1,Francisco Lindor reached on error to second (Grounder). Error by Javier Baez.,0.42,0.1,41.40%,0.013,0.12
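As a rough illustration of how a play log like this might be summarized, the sketch below rebuilds the nine previewed rows as a data frame and totals the WPA column by pitcher. It uses base R only. The data frame name `ws`, the reduced set of columns, and the choice to aggregate by pitcher are assumptions made for illustration and are not part of the original gist.

```r
# Rows copied from the WSGame7.csv preview above (only three columns kept).
# The name `ws` is hypothetical.
ws <- data.frame(
  Pitcher = c(rep("C Kluber", 6), rep("K Hendricks", 3)),
  Player  = c("D Fowler", "K Schwarber", "K Bryant", "A Rizzo",
              "K Schwarber", "B Zobrist", "C Santana", "J Kipnis", "F Lindor"),
  WPA     = c(0.102, 0.032, -0.029, -0.025, 0.009, -0.029,
              -0.023, -0.016, 0.013),
  stringsAsFactors = FALSE
)

# Sum the WPA values of all plays recorded against each pitcher.
wpa_by_pitcher <- aggregate(WPA ~ Pitcher, data = ws, FUN = sum)
print(wpa_by_pitcher)
```

With the full file, the same `aggregate()` call could be applied after reading the data with `read.csv()`, assuming the column names match the header shown in the preview.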
bayesball / daily_component_averages.R
Last active June 7, 2017 16:07
R script to read daily batting data from the Sports Illustrated website and compute final season AVG predictions using the component method
# This loads the function to compute the component prediction of the batting average
library(devtools)
source_gist("6562e43b7c412ec2031d", filename="component average functions.R")
# Here is a short function to collect the batting data for the current day from the Sports Illustrated website.
collect_data <- function(){
require(htmltab)
require(dplyr)
d1 <- htmltab("https://www.si.com/mlb/stats", which=1)
d1 <- mutate(d1,
bayesball / all_work.R
Created July 5, 2017 00:41
Prediction of 2nd Half Team Records
# illustration of producing graph for 2015 season
# I am assuming that the Retrosheet game log files are in the folder
# ~/Google Drive/gamelogs/gamelogs/
# output <- all_work(2015)
# output$p
all_work <- function(season){
require(readr)
require(lubridate)
require(dplyr)
bayesball / launchvelocity.R
Created August 7, 2017 22:13
Launch velocity and pitch location study
# Load in useful packages
library(baseballr)
library(dplyr)
library(ggplot2)
# Read in the Statcast data
b <- scrape_statcast_savant_batter_all("2017-08-04",
"2017-08-06")
library(baseballr)
library(dplyr)
library(BayesTestStreak)
library(BApredict)
library(TeachBayes)
# read in batter log data for Stanton for 2017 season
playerid_lookup("Stanton")
stanton <- scrape_statcast_savant_batter("2017-03-25",
"2017-08-15", 519317)
bayesball / swingmiss.R
Created October 22, 2017 21:47
R code to explore swing and miss rates of 2017 World Series teams
# Read in pitch-by-pitch data for
# all Dodgers and Astros regulars in 2017 season
# data is downloaded into csv files that I read into R
library(tidyverse)
d1 <- read_csv("astros.csv")
d3 <- read_csv("dodgers.csv")
d1$Team <- "Astros"
d3$Team <- "Dodgers"
d13 <- rbind(d1, d3)
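The preview stops before any rates are computed. One way a swing-and-miss rate by team could be derived from a combined pitch-level data frame is sketched below. This is a sketch under stated assumptions, not the gist's own code: the toy data frame `pitches`, the `description` column, and its event labels (modeled on Statcast-style pitch descriptions) are all hypothetical and are not taken from `astros.csv` or `dodgers.csv`.

```r
# Toy pitch-level data standing in for a combined data frame like d13.
# The `description` column and its values are assumptions.
pitches <- data.frame(
  Team = c(rep("Astros", 4), rep("Dodgers", 4)),
  description = c("swinging_strike", "foul", "ball", "hit_into_play",
                  "swinging_strike", "swinging_strike", "called_strike", "foul"),
  stringsAsFactors = FALSE
)

# Assumed classification of events: which count as swings, which as misses.
swing_events <- c("swinging_strike", "swinging_strike_blocked",
                  "foul", "foul_tip", "hit_into_play")
miss_events  <- c("swinging_strike", "swinging_strike_blocked")

pitches$swing <- pitches$description %in% swing_events
pitches$miss  <- pitches$description %in% miss_events

# Miss rate = misses / swings, computed over swings only.
swings <- pitches[pitches$swing, ]
miss_rate <- aggregate(miss ~ Team, data = swings, FUN = mean)
print(miss_rate)
```

Restricting to swings before averaging keeps takes (balls and called strikes) out of the denominator, so the result is a rate per swing and not per pitch.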
bayesball / duration_study.R
Created November 12, 2017 03:02
Exploring game durations
# load some packages
library(dplyr)
library(Lahman)
library(ggplot2)
library(readr)
# add a theme for the ggplot title
TH <- theme(plot.title = element_text(colour = "blue",
bayesball / statcast_gam.R
Created November 20, 2017 11:59
R code to fit generalized additive model to Statcast data
# load in packages
library(readr)
library(dplyr)
library(ggplot2)
library(mgcv)
##### read in a theme for the title of my plots
TH <- theme(plot.title = element_text(hjust = 0.5, size = 18))
bayesball / christy.R
Created November 27, 2017 14:25
Text mining words from book by Christy Mathewson
# load in some packages
library(tidytext)
library(tidyverse)
library(gutenbergr)
library(wordcloud)
library(Lahman)
# load in Christy book
bayesball / statcast_measure.R
Created January 1, 2018 18:26
R script for Using Statcast to Measure Hitters post
# load in tidyverse package and
# load in theme for title
library(tidyverse)
TH <- theme(plot.title = element_text(colour = "blue",
size = 18,
hjust = 0.5, vjust = 0.8, angle = 0))
# read in the 2017 statcast data