I'm working through some puzzles, towards a larger problem.
Created
August 21, 2016 02:52
Fun With Census Data
---
title: "Grades"
author: "Amanda Hickman"
date: "August 17, 2016"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
Imagine a simplified version of this exercise. You have one table of student scores on a series of quizzes, and another table that shows the cutoff for each letter grade. You want to assign each student a grade based on the mean of their quiz scores.
```{r}
quiz_scores <- read.table(text = "Student Quiz Score
James Week_1 86
Noah Week_1 88
Amanda Week_1 77
Zach Week_1 88
Adam Week_1 79
James Week_2 52
Noah Week_2 83
Amanda Week_2 83
Zach Week_2 78
Adam Week_2 78
James Week_3 82
Noah Week_3 91
Amanda Week_3 89
Zach Week_3 99
Adam Week_3 79
James Week_4 97
Noah Week_4 80
Amanda Week_4 81
Zach Week_4 93
Adam Week_4 98
James Week_5 73
Noah Week_5 68
Amanda Week_5 93
Zach Week_5 86
Adam Week_5 91
James Week_6 96
Noah Week_6 87
Amanda Week_6 90
Zach Week_6 81
Adam Week_6 79", header = TRUE)
```
You could manually find the mean score for each student, if you wanted to:
```{r}
mean(quiz_scores[quiz_scores$Student == "Adam", ]$Score)
mean(quiz_scores[quiz_scores$Student == "James", ]$Score)
mean(quiz_scores[quiz_scores$Student == "Noah", ]$Score)
mean(quiz_scores[quiz_scores$Student == "Amanda", ]$Score)
mean(quiz_scores[quiz_scores$Student == "Zach", ]$Score)
```
Or you can use `tapply` to find the mean for each student in one fell swoop:
```{r}
grades_by_student <- data.frame(tapply(quiz_scores$Score, quiz_scores$Student, mean))
names(grades_by_student) <- "Mean"
```
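
Wrapping the `tapply` result in `data.frame()` leaves the student names as row names rather than as a column. As a rough sketch of an alternative, base R's `aggregate()` returns the grouping variable as an ordinary column. The `sample_scores` data and the names `tapply_means` and `aggregate_means` are made up for illustration and are not part of the quiz data.

```{r}
# Hypothetical sample data, used only for this illustration.
sample_scores <- data.frame(
  Student = c("Ana", "Ana", "Ben", "Ben"),
  Score   = c(80, 90, 70, 75)
)

# tapply() returns a named array: one mean per student.
tapply_means <- tapply(sample_scores$Score, sample_scores$Student, mean)

# aggregate() returns a data frame with Student as a regular column.
aggregate_means <- aggregate(Score ~ Student, data = sample_scores, FUN = mean)
names(aggregate_means)[2] <- "Mean"
aggregate_means
```

Either form works for the grading step; the `aggregate()` version is mostly handy if you later want to merge the means with another table by student name.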
As a side note, you could also, if you wanted, find the mean for each quiz -- that might be a clue that one of them was too easy, or too hard.
```{r}
tapply(quiz_scores$Score, quiz_scores$Quiz, mean)
```
To assign letter grades to those mean quiz scores, we need a table of grade thresholds. I use the following:
```{r}
# Note: this name masks R's built-in `letters` constant for the rest of the session.
letters <- read.table(text = "Grade Threshold
F 0
D- 60
D 63.33
D+ 66.67
C- 70
C 73.33
C+ 76.67
B- 80
B 83.33
B+ 86.67
A- 90
A 93.33
", header = TRUE)
# Sort the thresholds from highest to lowest so the order is known, whatever order the table was typed in.
letters <- letters[order(letters$Threshold, decreasing = TRUE), ]
```
And then I can use `cut` to divide the students into grade buckets:
```{r}
# We need one extra cut break at the top. Inf keeps it above every threshold;
# max(Mean) + 1 would land between two thresholds whenever the best mean is below 92.33.
letter <- cut(grades_by_student$Mean,
breaks = c(letters$Threshold, Inf),
labels = rev(letters$Grade),
right = FALSE)
# And then we can glue it together.
grades_by_student <- cbind(grades_by_student, letter)
```
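
To see how `cut` treats scores that fall exactly on a threshold, here is a small sketch on a made-up three-grade scale. The names `cutoffs`, `grade_labels`, and `example_means` are hypothetical, and using `Inf` as the top break is one reasonable choice rather than the only one.

```{r}
# Hypothetical three-grade scale, for illustration only.
cutoffs <- c(0, 70, 90)
grade_labels <- c("C", "B", "A")
example_means <- c(69.99, 70, 89.5, 90, 100)

# right = FALSE makes each interval [lower, upper), so a mean that sits
# exactly on a threshold earns the higher grade. Inf closes the top interval.
example_grades <- cut(example_means,
                      breaks = c(cutoffs, Inf),
                      labels = grade_labels,
                      right = FALSE)
as.character(example_grades)

# findInterval() gives the same buckets as indices into the label vector.
lookup_grades <- grade_labels[findInterval(example_means, cutoffs)]
```

Both approaches put 70 in the "B" bucket and 90 in the "A" bucket, which is the behavior you usually want for grade cutoffs.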
And, just for kicks, I wanted to test `cut()` on a larger data set, so I also tried applying it to the whole `quiz_scores` data frame:
```{r}
# We need one extra cut break at the top; Inf keeps it above every threshold.
letter <- cut(quiz_scores$Score,
breaks = c(letters$Threshold, Inf),
labels = rev(letters$Grade),
right = FALSE)
# And then we can glue it together.
quiz_scores <- cbind(quiz_scores, letter)
```