@amandabee
Created August 21, 2016 02:52
Fun With Census Data

I'm working through some puzzles, towards a larger problem.

---
title: "Grades"
author: "Amanda Hickman"
date: "August 17, 2016"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
Imagine a simplified version of this exercise: you have one table of student scores on a series of quizzes and another table that gives the cutoff for each letter grade, and you want to assign each student a grade based on the mean of their quiz scores.
```{r}
quiz_scores <- read.table(text = "Student Quiz Score
James Week_1 86
Noah Week_1 88
Amanda Week_1 77
Zach Week_1 88
Adam Week_1 79
James Week_2 52
Noah Week_2 83
Amanda Week_2 83
Zach Week_2 78
Adam Week_2 78
James Week_3 82
Noah Week_3 91
Amanda Week_3 89
Zach Week_3 99
Adam Week_3 79
James Week_4 97
Noah Week_4 80
Amanda Week_4 81
Zach Week_4 93
Adam Week_4 98
James Week_5 73
Noah Week_5 68
Amanda Week_5 93
Zach Week_5 86
Adam Week_5 91
James Week_6 96
Noah Week_6 87
Amanda Week_6 90
Zach Week_6 81
Adam Week_6 79", header = TRUE)
```
You could manually find the mean score for each student, if you wanted to:
```{r}
mean(quiz_scores[quiz_scores$Student == "Adam", ]$Score)
mean(quiz_scores[quiz_scores$Student == "James", ]$Score)
mean(quiz_scores[quiz_scores$Student == "Noah", ]$Score)
mean(quiz_scores[quiz_scores$Student == "Amanda", ]$Score)
mean(quiz_scores[quiz_scores$Student == "Zach", ]$Score)
```
Or you can use `tapply` to find the mean for each student in one fell swoop:
```{r}
grades_by_student <- data.frame(tapply(quiz_scores$Score, quiz_scores$Student, mean))
names(grades_by_student) <- "Mean"
```
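With `tapply`, the student names ride along as row names rather than as a column. If you'd rather have them in a regular column, `aggregate()` is one alternative. This is just a sketch on made-up data (`demo_scores` and `demo_means` are hypothetical names, kept separate from the tables above):

```{r}
# Toy data, not the real quiz scores.
demo_scores <- data.frame(
  Student = c("Ann", "Ann", "Bo", "Bo"),
  Score   = c(80, 90, 70, 75)
)

# aggregate() returns a data frame with one row per student
# and the student name in its own column.
demo_means <- aggregate(Score ~ Student, data = demo_scores, FUN = mean)
names(demo_means)[2] <- "Mean"
demo_means
```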
As a side note, you could also find the mean for each quiz, which might be a clue that one of them was too easy or too hard.
```{r}
tapply(quiz_scores$Score, quiz_scores$Quiz, mean)
```
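One rough way to read those per-quiz means is to compare each one to the overall mean. A minimal sketch, assuming a tiny made-up table (`demo_quiz` and `demo_quiz_means` are hypothetical names):

```{r}
# Toy data: two quizzes, two scores each.
demo_quiz <- data.frame(
  Quiz  = c("Q1", "Q1", "Q2", "Q2"),
  Score = c(90, 80, 60, 70)
)

demo_quiz_means <- tapply(demo_quiz$Score, demo_quiz$Quiz, mean)

# Positive values are quizzes that ran above the overall mean,
# negative values are quizzes that ran below it.
demo_quiz_means - mean(demo_quiz$Score)
```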
To assign letter grades to those mean quiz scores, we need a table of grade thresholds. I use the following:
```{r}
# Note: naming this table `letters` masks R's built-in `letters` vector in this session.
letters <- read.table(text = "Grade Threshold
F 0
D- 60
D 63.33
D+ 66.67
C- 70
C 73.33
C+ 76.67
B- 80
B 83.33
B+ 86.67
A- 90
A 93.33
", header = TRUE)
# Sort from the highest threshold to the lowest (A first, F last)
letters <- letters[order(letters$Threshold, decreasing = TRUE), ]
```
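Each threshold here is the lowest score that still earns that grade. As a minimal sketch of that lookup, assuming a shortened, made-up table (`demo_thresholds` and `demo_grades` are hypothetical names, not part of the exercise), base R's `findInterval()` can map a score to the last threshold it meets or exceeds:

```{r}
# Hypothetical, shortened grade table; thresholds must be in ascending order.
demo_thresholds <- c(0, 60, 70, 80, 90)
demo_grades     <- c("F", "D", "C", "B", "A")

# findInterval() returns the index of the last threshold <= each score.
demo_lookup <- demo_grades[findInterval(c(59.9, 60, 85, 90), demo_thresholds)]
demo_lookup
```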
And then I can use `cut` to divide the students into grade buckets:
```{r}
# cut() needs one more break than labels, so add an open-ended break at the top.
# Inf keeps the A bucket in place even if no mean reaches 93.33.
letter <- cut(grades_by_student$Mean,
              breaks = c(letters$Threshold, Inf),
              labels = rev(letters$Grade),
              right = FALSE)
# And then we can glue it together.
grades_by_student <- cbind(grades_by_student, letter)
```
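The `right = FALSE` argument is what makes each bucket include its lower threshold and exclude its upper one. Here is a small sketch of that boundary behavior on made-up values, with a shortened grade scale and an open-ended top break (`demo_breaks`, `demo_labels`, `demo_values`, and `demo_cut` are all hypothetical names):

```{r}
# Shortened, hypothetical scale: five grades need six breaks.
demo_breaks <- c(0, 60, 70, 80, 90, Inf)
demo_labels <- c("F", "D", "C", "B", "A")

# Values sitting just below and exactly on two thresholds.
demo_values <- c(59.99, 60, 89.99, 90, 100)

# With right = FALSE the intervals are [0, 60), [60, 70), ... [90, Inf).
demo_cut <- cut(demo_values, breaks = demo_breaks,
                labels = demo_labels, right = FALSE)
demo_cut
```

A score of exactly 60 lands in the D bucket and exactly 90 lands in the A bucket, which matches reading each threshold as a minimum.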
And, just for kicks, I wanted to test `cut()` on a larger data set, so I also tried applying it to every score in the `quiz_scores` data frame:
```{r}
# Same idea: one extra, open-ended break at the top.
letter <- cut(quiz_scores$Score,
              breaks = c(letters$Threshold, Inf),
              labels = rev(letters$Grade),
              right = FALSE)
# And then we can glue it together.
quiz_scores <- cbind(quiz_scores, letter)
```
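A quick sanity check on a column of letter grades is to count how many scores fell into each bucket. A minimal sketch on made-up scores with a shortened scale (`demo_letters` is a hypothetical name):

```{r}
# Toy scores cut into a shortened, hypothetical grade scale.
demo_letters <- cut(c(95, 72, 88, 91, 65),
                    breaks = c(0, 60, 70, 80, 90, Inf),
                    labels = c("F", "D", "C", "B", "A"),
                    right = FALSE)

# table() counts every factor level, including grades nobody earned.
table(demo_letters)
```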