@thoolihan
Last active August 29, 2015 14:10
Average Line in 2013 NFL Games
-- analyze data from http://www.repole.com/sun4cast/data.html
raw_records = LOAD '/data/nfl/nfl2013lines.csv' USING PigStorage(',');
-- drop the CSV header row, then apply the schema
records = STREAM raw_records THROUGH `tail -n +2` AS
(Date, Visitor, VisitorScore:int,
HomeTeam, HomeScore:int, Line:float, TotalLine:float);
-- collect every game into one bag so AVG runs over the whole season
all_games = GROUP records ALL;
data = FOREACH all_games
GENERATE AVG(records.VisitorScore) AS Visitor, AVG(records.HomeScore) AS Home, AVG(records.Line) AS Spread;
-- inside FOREACH, reference the fields of data directly instead of data.<field>
data2 = FOREACH data
GENERATE Home - Visitor AS diff, Visitor, Home, Spread;
DUMP data2;
--STORE data2 INTO '/output/nfl/avg_line';