@hadley
Created February 13, 2015 21:32
Advice for teaching an R workshop

I think the two most important messages that people can get from a short course are:

a) the material is important and worthwhile to learn (even if it's challenging), and b) it's possible to learn it!

For those reasons, I usually start by diving as quickly as possible into visualisation. I think it's a bad idea to start by explicitly teaching programming concepts (like data structures), because the payoff isn't obvious. If you start with visualisation, the payoff is really obvious and people are more motivated to push past any initial teething problems. In stat405, I used to start with some very basic templates that got people up and running with scatterplots and histograms - they wouldn't necessarily understand the code, but they'd know which bits could be varied for different effects.
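As a rough illustration of what such a template might look like (a sketch only, not the actual stat405 material; it assumes ggplot2 and its built-in `mpg` dataset, neither of which is named above):

```r
library(ggplot2)

# Scatterplot template: swap displ and hwy for any two numeric variables.
scatter <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# Histogram template: change the variable, or vary binwidth to see the effect.
histogram <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 2)

print(scatter)
print(histogram)
```

The idea is that the comments point at the bits that can be varied, so people can get different plots without needing to understand every piece of the call.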

Apart from visualisation, I think the two most important topics to cover are tidy data (i.e. http://www.jstatsoft.org/v59/i10/ + tidyr) and data manipulation (dplyr). These are both important for when people go off and apply what they've learned to their own data. If they understand a bit about tidy data, they'll have some idea how to convert their own data into a form that's easy to work with in R. Data manipulation is similarly useful, and it's a good place to revise some statistical basics: for example, if you're aggregating with means, you also need to record the number of observations in each group so you have some way to calibrate variability. (There are some good online tutorials for both tidyr and dplyr - I don't remember the exact URLs, but you should be able to find them with a bit of googling.)
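A minimal sketch of the point about means and group sizes, assuming dplyr and the built-in `mtcars` dataset (the dataset and the column names `mean_mpg` and `n` are my choices for illustration):

```r
library(dplyr)

# Summarise fuel economy by cylinder count, keeping the group size
# next to the mean so each mean can be judged against its sample size.
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(
    mean_mpg = mean(mpg),
    n = n()
  )

print(by_cyl)
```

A mean based on 7 cars deserves less trust than one based on 14, and you can only see that if `n` is recorded alongside it.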

I think you should aim for a 50:50 split between lecturing and hands-on activities. It will feel hard to give up 50% of the material that you could cover, but without hands-on practice, attendees will retain only a tiny amount of the material. The other goal of hands-on activities is for people to rapidly fail in an environment where they can quickly get help - if they have some practice recovering from errors in a supportive environment, they should be more resilient when learning on their own.

Finally, the first 15 minutes of the class are really important for setting the tone. I recommend that you avoid a detailed overview of the material to be covered, and instead dive into introductions (because the most helpful person is often the one sitting next to you) and ice breakers. One technique that I've found extremely powerful is to give people a relatively simple challenge with a tight time limit: "With your neighbour, try to recall the four types of atomic vector. You have one minute starting now!" (I use that in a programming-focussed class; you'd need to come up with something similar for data analysis.) This gets people energised and engaged with their neighbours and results in a much more interactive and fun class.

@ctb
ctb commented Feb 16, 2015

Completely agree with all your points, based on my own experience with courses! (well, except for the teaching R part, but I'm guessing you're going to be hard to convince on that one :)

@lifan0127
Thanks for the insight! You mentioned good tutorials on tidyr and dplyr but not remembering URLs. It just so happens I created a dplyr tutorial list for a recent meetup talk: https://github.com/lifan0127/meetup_dplyr_talk
