@isomorphisms
Last active August 29, 2015 14:05
don't use head and tail … always be sampling from the middle of the data.frame
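# taste(): draw `ladle` random row indices from 1..soup (with replacement, so repeats are possible)
taste <- function(soup, ladle=5L) sample.int(n=soup, size=ladle, replace=TRUE)
# peek(): show n randomly chosen rows; dataframe[r,c] means "subset of dataframe row #r, column #c"
peek <- function(df, n=5L) df[ taste(nrow(df), n), , drop=FALSE ]
# p(): the first row, a random sample from the middle, then the last row
p <- function(df, n=5L) rbind(head(df,1L), peek(df,n), tail(df,1L))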
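# What peek() boils down to, step by step. A minimal sketch that assumes the
# built-in mtcars dataset (chosen for illustration only; not part of the original gist).
set.seed(1)                                          # only to make the draw reproducible
rows <- sample.int(nrow(mtcars), 3L, replace=TRUE)   # 3 random row indices, repeats allowed
mtcars[rows, ]                                       # the sampled rows, all columns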
#SAMPLE OUTPUT
require(bigvis)
data(movies)
dim(movies)
#[1] 130456 14
#so movies is too long to look at in full.
#yet if I always peek at it with head(movies) I will get bored
#and waste the opportunity to gradually get to know my dataset.
peek(movies)
#                                  title year length budget rating votes mpaa
#109844                      Liberty Kid 2007     92 200000    6.2   126 <NA>
#30142  Due mattacchioni al Moulin Rouge 1964     90     NA    3.7     9 <NA>
#87158                  Nunca es domingo 2002     19     NA    4.4    16 <NA>
#122636               Turning Point 1977 2009    110     NA    6.7     6 <NA>
#124824                         Hooters! 2010     90  62250    7.0     5 <NA>
#
#       Action Animation Comedy Drama Documentary Romance Short
#109844  FALSE     FALSE  FALSE FALSE       FALSE   FALSE FALSE
#30142   FALSE     FALSE  FALSE FALSE       FALSE   FALSE FALSE
#87158   FALSE     FALSE  FALSE FALSE       FALSE   FALSE FALSE
#122636  FALSE     FALSE  FALSE FALSE       FALSE   FALSE FALSE
#124824  FALSE     FALSE  FALSE FALSE       FALSE   FALSE FALSE
#TODO:
# - contiguous random pieces
# - deal with other shapes like lists
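# One possible take on the two TODO items above. These are sketches only, not the
# author's implementation; peek_chunk and peek_list are hypothetical names.
# peek_chunk(): n contiguous rows starting at a random offset.
peek_chunk <- function(df, n=5L) {
  n <- min(n, nrow(df))                         # never ask for more rows than exist
  start <- sample.int(nrow(df) - n + 1L, 1L)    # random first row of the chunk
  df[ seq.int(start, length.out=n), , drop=FALSE ]
}
# peek_list(): up to n randomly chosen elements of a list or vector, kept in original order.
peek_list <- function(x, n=5L) x[ sort(sample.int(length(x), min(n, length(x)))) ]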