Hadley Wickham hadley

@hadley
hadley / beer.R
Last active September 10, 2015 16:49
bottles_of_beer <- function(i = 99) {
  message("There are ", i, " bottles of beer on the wall, ", i, " bottles of beer.")
  while (i > 0) {
    tryCatch(
      Sys.sleep(1),
      interrupt = function(err) {
        i <<- i - 1
        if (i > 0) {
          message(
            "Take one down, pass it around, ", i,
@hadley
hadley / fun.R
Created September 10, 2015 14:52
run_me <- function() {
  message("You shouldn't have done that...")
  # Loop forever; each interrupt is caught and the loop keeps going
  while (TRUE) {
    tryCatch(
      Sys.sleep(1),
      interrupt = function(err) {
        message("You can't escape that easily!")
      }
    )
  }
}

I'm looking for a few technical reviewers for the 2nd edition of the ggplot2 book. Your mission, if you choose to accept it, is to:

  • Closely read the complete ggplot2 book (~270 pages).

  • Let me know if you discover any mistakes, omissions, or parts that are hard to understand.

  • Get back to me by October 1.

Your comments will be easiest to process if either:

@hadley
hadley / na.R
Last active August 29, 2015 14:26
# let x be John's age (but I don't know what it is)
x <- NA
# let y be Mary's age (but I don't know what it is)
y <- NA
# are Mary and John the same age?
x == y
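
As a hedged illustration of the snippet above: in R, comparing two unknown values gives an unknown result, so `x == y` evaluates to `NA` rather than `TRUE`. The name `same_age` is introduced here only for the example.

```r
# Both ages are unknown
x <- NA
y <- NA

# The comparison is itself unknown
same_age <- x == y
same_age            # NA

# Test for missingness with is.na(), not with ==
is.na(same_age)     # TRUE

# identical() compares the R objects, not the unknown ages
identical(x, y)     # TRUE
```

This is why code that filters on equality usually needs an explicit `is.na()` check for missing values.
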
library(ggplot2)

df <- data.frame(
  x = 1:5,
  y = 1:5,
  z = factor(c(1, 2, 3, 3, 3))
)
# Full-height background bands, one per x value, filled by z
ggplot(df, aes(x, y)) +
  geom_rect(aes(ymin = -Inf, ymax = Inf, xmin = x - 0.5, xmax = x + 0.5, fill = z)) +
  geom_point() +
  scale_fill_grey()
# http://stackoverflow.com/a/3407254/16632
Rcpp::cppFunction("double ceil_any(double x, double prec) {
  if (fabs(prec / x) < DBL_MIN)
    return x;
  double r = fmod(fabs(x), prec);
  if (r == 0)
    return x;
  return (x > 0) ? x + prec - r : x + r;
data_sort <- function() {
  desc <- FALSE
  my_sort <- function(data) {
    sort(data, decreasing = desc)
  }
  attr(my_sort, "desc") <- function(x) {
    desc <<- x
    return(my_sort)
@hadley
hadley / ds-training.md
Created March 13, 2015 18:49
My advice on what you need to do to become a data scientist...

If you were to give recommendations to your "little brother/sister" on things that they need to do to become a data scientist, what would those things be?

I think the "Data Science Venn Diagram" (http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram) is a great place to start. You need three things to be a good data scientist:

  • Statistical knowledge
  • Programming/hacking skills
  • Domain expertise

Statistical knowledge

@hadley
hadley / advise.md
Created February 13, 2015 21:32
Advice for teaching an R workshop

I think the two most important messages that people can get from a short course are:

a) the material is important and worthwhile to learn (even if it's challenging), and b) it's possible to learn it!

For those reasons, I usually start by diving as quickly as possible into visualisation. I think it's a bad idea to start by explicitly teaching programming concepts (like data structures), because the payoff isn't obvious. If you start with visualisation, the payoff is really obvious and people are more motivated to push past any initial teething problems. In stat405, I used to start with some very basic templates that got people up and running with scatterplots and histograms - they wouldn't necessarily understand the code, but they'd know which bits could be varied for different effects.
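
A starter template of the kind described might look like the sketch below. It is not the actual stat405 material: it assumes ggplot2 and its bundled `mpg` dataset, and the variable choices (`displ`, `hwy`) and the names `scatter` and `histogram` are only illustrative.

```r
library(ggplot2)

# Scatterplot template: swap the dataset and the x/y variables
scatter <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# Histogram template: swap the variable and try other binwidths
histogram <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 2)

# Printing a plot object draws it
print(scatter)
print(histogram)
```

Learners can change one piece at a time (the dataset, a variable name, the binwidth) and see the effect straight away, without yet understanding every part of the call.
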

Apart from visualisation, I think the two most important topics to cover are tidy data (i.e. http://www.jstatsoft.org/v59/i10/ + tidyr) and data manipulation (dplyr). These are both important for when people go off and apply