Devin Pastoor dpastoor

Notes:

I've tried to break this up into separate pieces, but it's not always possible: e.g. knowledge of data structures and subsetting are tightly intertwined.
Level of Bloom's taxonomy listed in square brackets, e.g. http://bit.ly/15gqPEx. Few categories currently assess components higher in the taxonomy.

Programming R curriculum

Data structures

Update: The timings are now updated with runs from R v3.1.0.

A small note on this tweet from @KevinUshey and this tweet from @ChengHLee:

The number of rows, while important, is only one of the factors that influence the time taken to perform a join. In my benchmarking experience, two features influence join speed much more, especially for hash-table-based approaches (e.g. dplyr):

The number of unique groups.
The number of columns the join is performed on. This is related to the previous point: in most cases, the more columns, the more unique groups.

That is, these features influence join speed even when the number of rows is the same.
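To make the point about unique groups concrete, here is a minimal sketch in Python. It is a toy hash index, not the implementation used by dplyr or any other package; the helper names `key_of` and `build_index` and the synthetic rows are assumptions made for illustration. It shows only that the number of hash-table entries depends on the number of unique key combinations rather than on the row count.

```python
from collections import defaultdict


def key_of(row, key_cols):
    """Return the join key of a row as a tuple of the chosen columns."""
    return tuple(row[i] for i in key_cols)


def build_index(rows, key_cols):
    """Toy hash index: one entry per unique key combination."""
    index = defaultdict(list)
    for row in rows:
        index[key_of(row, key_cols)].append(row)
    return index


# Synthetic table (hypothetical data): 1000 rows, three columns.
# Column 0 has 10 levels, column 1 has 7 levels, column 2 is unique per row.
rows = [(i % 10, i % 7, i) for i in range(1000)]

groups_one_col = len(build_index(rows, [0]))      # join on one column
groups_two_cols = len(build_index(rows, [0, 1]))  # join on two columns
groups_all_unique = len(build_index(rows, [2]))   # every row is its own group

print(len(rows), groups_one_col, groups_two_cols, groups_all_unique)
# prints: 1000 10 70 1000
```

The row count is 1000 in every case, yet the index holds 10, 70, or 1000 entries depending on which columns form the key. Adding a second key column multiplies the number of groups here, which matches the observation that more join columns usually means more unique groups.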

Create a Meteor app and put the client_/server_ files in the client/ and server/ directories, respectively. Also create a public directory to hold the uploaded files.
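The layout step above could be scripted roughly as follows. This is a sketch under assumptions: the function name `arrange_meteor_files` is hypothetical, it assumes the Meteor app directory already exists (or may be created), and it assumes the files to sort are recognisable by a `client_` or `server_` filename prefix. Files are copied with their names unchanged.

```python
import shutil
from pathlib import Path


def arrange_meteor_files(src_dir, app_dir):
    """Copy client_*/server_* files from src_dir into app_dir/client and
    app_dir/server, and make sure app_dir/public exists for uploads.

    Returns the list of destination paths that were written.
    """
    src = Path(src_dir)
    app = Path(app_dir)

    # Create the three directories the note asks for.
    for name in ("client", "server", "public"):
        (app / name).mkdir(parents=True, exist_ok=True)

    copied = []
    for path in sorted(src.iterdir()):
        if not path.is_file():
            continue
        for side in ("client", "server"):
            # Assumption: the filename prefix decides where the file goes.
            if path.name.startswith(side + "_"):
                target = app / side / path.name
                shutil.copy2(path, target)
                copied.append(target)
    return copied
```

Calling `arrange_meteor_files("gist_files", "my_app")` (both paths are placeholders) would leave unrelated files untouched and report which files were copied.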

	This post examines the features of [R Markdown](http://www.rstudio.org/docs/authoring/using_markdown)
	using [knitr](http://yihui.name/knitr/) in RStudio 0.96.
	This combination of tools provides an exciting improvement in usability for
	[reproducible analysis](http://stats.stackexchange.com/a/15006/183).
	Specifically, this post
	(1) discusses getting started with R Markdown and `knitr` in RStudio 0.96;
	(2) provides a basic example of producing console output and plots using R Markdown;
	(3) highlights several code chunk options such as caching and controlling how input and output are displayed;
	(4) demonstrates use of standard Markdown notation as well as the extended features of formulas and tables; and
	(5) discusses the implications of R Markdown.

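	library(shiny)
	library(ggplot2)  # provides qplot(), geom_vline(), coord_cartesian()

	shinyServer(function(input, output) {
	  # Histogram of input$obs standard-normal draws; mean in black, median in red.
	  # renderPlot() replaces the deprecated reactivePlot().
	  output$distPlot <- renderPlot({
	    dist <- rnorm(input$obs)
	    # theme_dpi() is not defined in this snippet; it must be supplied elsewhere.
	    p <- qplot(dist, binwidth = 0.1) + geom_vline(xintercept = mean(dist)) + theme_dpi()
	    p <- p + coord_cartesian(xlim = c(-4, 4)) + geom_vline(xintercept = median(dist), color = I("red"))
	    print(p)
	  })
	})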

	#' Simplified loading and installing of packages
	#'
	#' This is a wrapper to \code{\link{require}} and \code{\link{install.packages}}.
	#' Specifically, this will first try to load the package(s) and if not found
	#' it will install then load the packages. Additionally, if the
	#' \code{update=TRUE} parameter is specified it will check the currently
	#' installed package version with what is available on CRAN (or mirror) and
	#' install the newer version.
	#'
	#' @param pkgs a character vector with the names of the packages to load.

	#' Create a Kaplan-Meier plot using ggplot2
	#'
	#' @param sfit a \code{\link[survival]{survfit}} object
	#' @param returns logical: if \code{TRUE}, return a ggplot object
	#' @param xlabs x-axis label
	#' @param ylabs y-axis label
	#' @param ystratalabs The strata labels. \code{Default = levels(summary(sfit)$strata)}
	#' @param ystrataname The legend name. Default = "Strata"
	#' @param timeby numeric: control the granularity along the time-axis
	#' @param main plot title

	library("shiny")
	library("plotly")
	library("ggplot2")

	shinyServer(function(input, output) {
	output$text <- renderText({
	ggiris <- qplot(Petal.Width, Sepal.Length, data=iris, color=Species)
	py <- plotly("RgraphingAPI", "ektgzomjbx")
	res <- py$ggplotly(ggiris)
	iframe <- paste("<iframe height=\"600\" id=\"igraph\" scrolling=\"no\" seamless=\"seamless\" src=\"",

	// Use Gists to store code you would like to remember later on
	console.log(window); // log the "window" object to the console

	import os
	import random
	import string
	import tempfile
	import subprocess

	def random_id(length=8):
	    # Draw `length` distinct characters from letters and digits.
	    return ''.join(random.sample(string.ascii_letters + string.digits, length))

	TEMPLATE_SERIAL = """