@Btibert3
Last active December 8, 2015 22:09
Testing stattleship, weird error

About

Updated queryAPI and added a new function, stattle, that wraps it and makes walking paginated results easier. jsonlite is great, but I started to bump into some issues with dplyr::bind_rows. Right now these two functions play nicely together, but they need further testing.

Load/Install

## source the functions
u = "https://gist.githubusercontent.com/Btibert3/8f3671af30c8d1afc555/raw/4d620b99cd6c8df3aa17a6c84ac07056a5d479f3/queryAPI.r"
devtools::source_url(u)
u = "https://gist.githubusercontent.com/Btibert3/8f3671af30c8d1afc555/raw/4d620b99cd6c8df3aa17a6c84ac07056a5d479f3/stattle.r"
devtools::source_url(u)

Requirements

We will handle these better shortly, but the code requires:

  • dplyr
  • httr
  • jsonlite
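
Until the dependencies are handled more formally, a quick check like the one below can confirm the packages are available before sourcing the functions. This is only a sketch: the names needed, is_installed and missing_pkgs are made up for illustration, and devtools is included because the Load/Install step calls devtools::source_url.

```r
## packages used by queryAPI/stattle, plus devtools for source_url
needed = c("dplyr", "httr", "jsonlite", "devtools")

## requireNamespace returns FALSE (rather than erroring) when a package is absent
is_installed = vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs = needed[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  ## install.packages(missing_pkgs)  # uncomment to install from CRAN
}
```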

Basic Usage

nhl_teams = stattle(TOKEN, ep='teams', walk=T)
length(nhl_teams)
names(nhl_teams[[1]])

stattle returns a list of lists, one element per API request. In the example above, walking the teams endpoint took two requests, so length(nhl_teams) is 2. Each element holds the parsed JSON for one page, with the data nested inside it (here, under teams).
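
The number of elements follows from the paging arithmetic in stattle, which divides the total header by the per-page header and rounds up. A small sketch of that calculation, assuming illustrative values of 30 total results and 20 per page (these numbers are assumptions, not taken from a real response):

```r
## illustrative header values; a real response supplies these
total_results = 30
rpp = 20

## number of requests needed to retrieve everything
pages = ceiling(total_results / rpp)
pages
```

With those values pages is 2, which would match a result list of length 2.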

Collapse the team data into a dataframe.

library(dplyr)  # provides bind_rows
dat = data.frame()
## seq_along is safe even if nhl_teams is empty
for (i in seq_along(nhl_teams)) {
  tmp_dat = nhl_teams[[i]]$teams
  dat = bind_rows(dat, tmp_dat)
  rm(tmp_dat)
}
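
bind_rows accepts a list of data frames, so the collapse can be written as a single call instead of a loop. The sketch below uses a small mock object, nhl_teams_mock (hypothetical data shaped like the stattle output), so it runs without a token.

```r
library(dplyr)

## mock of what stattle returns: one element per page, data under $teams
nhl_teams_mock = list(
  list(teams = data.frame(id = 1:2, name = c("Bruins", "Sabres"),
                          stringsAsFactors = FALSE)),
  list(teams = data.frame(id = 3L, name = "Flames",
                          stringsAsFactors = FALSE))
)

## pull the teams data frame out of each page, then stack them
team_pages = lapply(nhl_teams_mock, function(x) x$teams)
dat_mock = bind_rows(team_pages)
nrow(dat_mock)
```

The mock has two pages holding three rows in total, so dat_mock ends up with three rows.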

TODO

  • put this into a more formal package
  • leverage way better version control
#' Interface with the Stattleship API
#'
#' A simple, generic function to query data from the API
#'
#' @param token character. A valid token for the API
#' @param sport character. The sport, such as hockey, basketball, football
#' @param league character. NHL, NBA, etc.
#' @param ep character. The endpoint
#' @param query A list that defines the query parameters
#' @param version The API version. Current version is 1.
#' @param walk logical. NOT YET IMPLEMENTED
#' @param page numeric. The page number to request
#' @param verbose logical. For debugging, returns response and parsed response.
#'
#' @examples
#' \dontrun{
#' TOKEN = "aklsjdlfkajsfdas"
#' results = queryAPI(TOKEN,
#'                    sport = "hockey",
#'                    query = list(player_id = "akjasdf"),
#'                    version = 1,
#'                    ep = "stats",
#'                    verbose = F)
#' }
#'
#' @export
queryAPI = function(token,
                    sport = "hockey",
                    league = "nhl",
                    ep = "stats",
                    query = list(),
                    version = 1,
                    walk = F,
                    page = NA,
                    verbose = F) {
  ## TODO: walk the results if > 20 entries
  ## TODO: test to validate data types
  ## TODO: best practices on walking the data? Have # entries parameter?

  ## packages: httr functions are called with httr:: below rather than
  ## attaching the package with library() inside the function

  ## build the URL and the endpoint
  URL = sprintf("https://www.stattleship.com/%s/%s/%s", sport, league, ep)
  ## the accept parameters. Is there a better way to do this?
  ACCEPT = sprintf("application/vnd.stattleship.com; version=%d", version)
  ## if page is supplied, add it to the list
  ## (&& short-circuits, so the later checks are skipped when page is NA)
  if (!is.na(page) && is.numeric(page) && page >= 1) {
    query = c(query, page = page)
  }
  ## test the body to see if it is a list and has values
  ## if not, just return an empty list
  ## todo: test to ensure that query is a list if !is.na
  ## get the request from the API
  ## use the token argument, not a global TOKEN object
  resp = httr::GET(URL,
                   httr::add_headers(Authorization = token,
                                     Accept = ACCEPT,
                                     `Content-Type` = "application/json"),
                   query = query)
  ## walk the content if true
  ## convert response to text first, do not use baseline httr::content default
  api_response = httr::content(resp, as = "text")
  ## use jsonlite::fromJSON
  api_response = jsonlite::fromJSON(api_response)
  ## if verbose = T, return a list that includes the parsed results
  ## and the original request
  if (verbose) {
    api_response = list(response = resp,
                        api_json = api_response)
  }
  ## return the data
  return(api_response)
}
stattle = function(token,
                   sport = "hockey",
                   league = "nhl",
                   ep = "stats",
                   query = list(),
                   version = 1,
                   walk = F,
                   page = NA,
                   verbose = F) {
  ## if na, set page to 1 for consistency
  if (is.na(page)) page = 1
  ## if page is supplied, add it to the list
  if (!is.na(page) && is.numeric(page) && page >= 1) {
    query = c(query, page = page)
  }
  ## get the first request
  ## use the token argument (not a global TOKEN) and pass version through
  tmp = queryAPI(token, sport, league, ep, query, version = version, verbose = T)
  ## simple alert
  if (tmp$response$status_code != 200) {
    message("API response was something other than 200")
  }
  ## create the response list
  response = list()
  ## set the original parsed response to the first element
  response[[1]] = tmp$api_json
  ## if walk, parse here and send into response[[i]]
  ## NOT FINISHED -- below is under dev
  if (walk) {
    ## check to see if paging is necessary
    total_results = as.numeric(tmp$response$headers$total)
    rpp = as.numeric(tmp$response$headers$`per-page`)
    pages = ceiling(total_results / rpp)
    ## the first page was already retrieved, only care 2+
    ## also skip paging if the headers were missing or could not be parsed
    if (length(pages) == 1 && !is.na(pages) && pages >= 2) {
      for (p in 2:pages) {
        ## keep the caller's query parameters and only change the page
        p_query = query
        p_query$page = p
        tmp_p = queryAPI(token, sport, league, ep, query = p_query,
                         version = version, verbose = T)
        ## check to make sure 200 (test this page's response, not the first)
        if (tmp_p$response$status_code != 200) {
          message("the pages>2 loop requested a page that was not 200")
        }
        ## add as an element into the response container
        response[[p]] = tmp_p$api_json
      }
    }#endif(pages)
  }#endif(walk)
  ## return the list of data results
  ## list of lists
  return(response)
}