Last active
October 28, 2015 15:11
CS50 R seminar by Connor Harris: notes
More info:
https://cran.r-project.org/doc/manuals/R-intro.pdf
http://www.stat.cmu.edu/~cshalizi/ADAfaEPoV/
Types: numeric (float), character (string), logical (bool); coercion (scanf())
Vectors (1-D arrays), matrices, and higher-dimensional arrays of the above types
Lists: associative arrays. Vectors of lists behave oddly
No true scalar types: a single value is a vector of length one
No mixed-type arrays: mixing types coerces every element to one type (a string, if any element is a string)
Weak typing, no variable declarations
Assign with <-
Comment with #
%% modulo (remainder)
%/% integer division
Range with colon (2:5 = [2 3 4 5])
One-based indexing
For loops: for (value in vector) { ... }
Functions: foo <- function(args) { ... }
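A minimal sketch of the basics listed above (assignment, the two division operators, ranges, one-based indexing, a for loop, and a function). The names x, r, total, and square_plus are made up for illustration.

```r
# Assignment uses <- ; no declarations are needed.
x <- 17

x %% 5     # remainder: 2
x %/% 5    # integer quotient: 3

r <- 2:5   # the vector 2 3 4 5
r[1]       # indexing starts at 1, so this is 2

# A for loop walks over the elements of a vector.
total <- 0
for (value in r) {
  total <- total + value
}
total      # 14

# A function; the value of the last expression is returned.
square_plus <- function(a, b) {
  a^2 + b
}
square_plus(5, 1)   # 26
```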
Vectors
  constructed with c(datum_1, ..., datum_n)
  args can be vectors, but the result is flattened
  cannot be mixed type
  behave as if padded infinitely with the value NA (reading past the end gives NA)
  unary functions map over every element
  binary functions are applied entry by entry (a shorter vector is recycled)
  access with square brackets containing one-based indices
  can pass a vector of indices
  summary(vec) for min, quartiles, median, mean, max
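The vector rules above can be tried out with a short sketch like this one. The variable names v and mixed are illustrative only.

```r
v <- c(1, 2, c(4, 8))   # nested vectors are flattened: 1 2 4 8
length(v)               # 4

mixed <- c(1, "a", TRUE)
class(mixed)            # "character": every element was coerced to a string

v[10]                   # NA: reading past the end gives NA, not an error

sqrt(v)                 # unary function applied to every element
v + c(10, 20, 30, 40)   # binary operator applied entry by entry: 11 22 34 48

v[c(1, 3)]              # vector of indices: 1 4
v[-1]                   # a negative index drops that element: 2 4 8

summary(v)              # min, quartiles, median, mean, max
```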
Matrix
  matrix(data, nrow=rows, ncol=columns): data is a vector; fills top to bottom first, then left to right (column by column)
  multiply: a %*% b
  spectral decomposition: eigen(a)
  initialize a higher-dimensional array: array(data, dim=c(dim_1, ..., dim_n))
  access: m[rownum, ] for a row, m[, colnum] for a column, m[rownum, colnum] for one entry
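A small sketch of the matrix operations above, using a 2 x 3 matrix so the column-by-column fill order is visible. The names m, p, e, v1, and a are illustrative; the eigenvector check mirrors the one done by hand in the session at the bottom of these notes.

```r
m <- matrix(1:6, nrow = 2, ncol = 3)   # filled column by column
m
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

m[2, ]      # second row: 2 4 6
m[, 3]      # third column: 5 6
m[1, 2]     # single entry: 3

p <- m %*% t(m)   # matrix product with the transpose, a 2 x 2 matrix
p

# Spectral decomposition: p %*% v should equal lambda * v for each pair.
e <- eigen(p)
v1 <- e$vectors[, 1]
all.equal(as.vector(p %*% v1), e$values[1] * v1)   # TRUE

# A higher-dimensional array of zeros with dimensions 2 x 3 x 4.
a <- array(0, dim = c(2, 3, 4))
dim(a)      # 2 3 4
```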
List
  list(key=val, ..., key=val)
  access/set values with foo$key
  access individual key-value pairs with foo["key"] (returns a one-element list)
  nonexistent keys return NULL
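A sketch of list access as described above. The list person and its keys are invented for the example; the double-bracket form on the last line is not in the notes and is included only for contrast with single brackets.

```r
person <- list(name = "Ada", age = 36)

person$name          # "Ada"
person$age <- 37     # set a value through $
person$email         # NULL: no such key

pair <- person["age"]   # single brackets keep the key: a list of length one
class(pair)             # "list"
names(pair)             # "age"

person[["age"]]         # double brackets give the bare value: 37
```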
Data frame
  subclass of list
  every value is a vector of the same length
  used to represent a data table
  data.frame([column-name=]col1data, ...)
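A minimal data frame built by hand, as a sketch of the points above. The column names and values are made up.

```r
df <- data.frame(name  = c("a", "b", "c"),
                 score = c(90, 72, 85))

is.list(df)       # TRUE: a data frame is a list of equal-length columns
nrow(df)          # 3
ncol(df)          # 2

df$score          # one column as a vector
df[2, ]           # second row
df[, 2]           # second column, same values as df$score

mean(df$score)    # functions on vectors work on columns
```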
Functions
  foo <- function(arg[=default], ..., arg[=default])
  called as foo([arg1=]val1, ..., [argn=]valn)
  no return needed: the value of the last expression is returned by default
  args passed by name in a function call don't need to be in a specific order
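Defaults and named arguments in one short sketch. The function power_plus and its parameters are hypothetical.

```r
power_plus <- function(base, exponent = 2, offset = 0) {
  base^exponent + offset   # last expression is the return value
}

power_plus(3)                      # defaults used: 9
power_plus(3, 3)                   # positional: 27
power_plus(offset = 1, base = 3)   # named, in any order: 10
```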
Data import and export
  read.table()
  read.xls() (from the gdata package, not base R)
  read.csv()
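A round trip through a temporary CSV file, as a sketch. write.csv() is the export counterpart and is not named in the notes; the data frame and the path are invented for the example.

```r
df <- data.frame(x = 1:3, y = c(2.5, 3.5, 4.5))

path <- tempfile(fileext = ".csv")
write.csv(df, path, row.names = FALSE)   # export without the row-name column

back <- read.csv(path)                   # import: header and commas assumed
back

# read.table() is the general form; it needs the header and separator spelled out.
same <- read.table(path, header = TRUE, sep = ",")
```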
Multilinear regression
  model <- lm(y ~ x1[+x2[...[+xn]...]][, dataframe])
  y is the dependent variable; the x's are independent and can be vectors or column names of the data frame in the second arg
  transformed variables are allowed: model <- lm(y^2 + 1 ~ log(x))
  summary(model) for summaries
  cor(vec1, vec2[, method=method]) for correlations
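The same calls on the built-in mtcars data set, as a sketch. The approximate values in the comments are the ones printed in the session at the bottom of these notes.

```r
# Simple regression: mpg as a function of weight.
model <- lm(mpg ~ wt, data = mtcars)
coef(model)                   # intercept about 37.29, slope about -5.34

# Two predictors, one of them transformed.
model3 <- lm(mpg ~ log(wt) + qsec, data = mtcars)
summary(model3)$r.squared     # about 0.878

# Correlations; method defaults to "pearson".
cor(mtcars$mpg, mtcars$wt)
cor(mtcars$mpg, mtcars$wt, method = "spearman")
```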
Plotting
  plot(x, y, ...)
    takes two vecs of the same length
    precede with attach(dataframe) to use column headers instead of separate vectors
    other args
      type: "p" for points, "l" for lines
      main: title
      xlab, ylab: axis labels
      col: color
  best-fit lines and local regression curves: abline(regression-model), lines(lowess(x, y))
  png(filename) to write the plot to a file; close it with dev.off()
  FFmpeg or ImageMagick for animation
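A sketch that puts the plotting calls together and writes the result to a PNG file. It assumes a working png() device on the machine; the file path and labels are made up. abline() and lines() draw onto the current plot, so they have to come after plot() while the device is still open, which is likely why abline() fails with "plot.new has not been called yet" at the end of the session below.

```r
out <- tempfile(fileext = ".png")
png(out)                                    # open a file device

plot(mtcars$wt, mtcars$mpg, type = "p",
     main = "Fuel efficiency versus weight",
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     col = "blue")

abline(lm(mpg ~ wt, data = mtcars))         # best-fit line on top of the points
lines(lowess(mtcars$wt, mtcars$mpg),        # local regression curve
      col = "red")

dev.off()                                   # close the device and write the file
```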
/* spent time using R, copied to the bottom of this doc */ | |
Foreign function interface
  for R to call C functions
  C function must take all args as pointers
  for arrays this is a pointer to the first elem
  floating-point type is double
    void dotprod(double* vec1, double* vec2, int* n, double* out) {
      *out = dotprod_internal(vec1, ...);  /* helper not shown */
    }
  compile: R CMD SHLIB foo.c
  load: dyn.load("foo.so")
  type coercion: as.type (as.double, as.integer, ...)
  .C returns a list (assoc array) of param names and modified vals; name the args so they can be looked up
    result <- .C("dotprod", vec1=as.double(vec1), vec2=as.double(vec2), n=as.integer(length(vec1)), out=as.double(0))
    product <- result$out
Don't use explicit loops; use Map, Reduce, Find, Filter
  Reduce: pass a two-param func; applies it to the first two elems, then to that result and the next elem, and so on
  can be used for sum
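A sketch of the four functions on a small vector. The anonymous functions and variable names are illustrative.

```r
v <- 1:10

# Reduce folds a two-argument function over the vector: ((1 + 2) + 3) + ...
total <- Reduce(function(acc, x) acc + x, v)      # 55

# Filter keeps the elements for which the predicate is TRUE.
evens <- Filter(function(x) x %% 2 == 0, v)       # 2 4 6 8 10

# Find returns the first element that satisfies the predicate.
first_big <- Find(function(x) x > 6, v)           # 7

# Map applies a function to each element and returns a list.
squares <- Map(function(x) x^2, 1:3)              # list(1, 4, 9)
unlist(squares)                                   # back to a vector: 1 4 9
```

For plain arithmetic, the vectorized forms (sum(v), v^2) are shorter still; Reduce and Map earn their keep when the step is a custom function.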
Don't append to vecs:
  vec[length(vec)+1] <- newvalue
  vec <- c(vec, newvalue)
  bad because reallocation is super slow
  pre-allocate vecs to the necessary size
  vec <- vector(length=1000)
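A sketch of pre-allocation next to the vectorized alternative. numeric(n) is used here instead of vector(length=n) so the slots start as zeros rather than FALSE; the loop is only there to show filling in place.

```r
n <- 1000

# Allocate once, then fill in place; no reallocation inside the loop.
squares <- numeric(n)
for (i in seq_len(n)) {
  squares[i] <- i^2
}

# When the computation vectorizes, no loop is needed at all.
squares_vec <- (1:n)^2

all(squares == squares_vec)   # TRUE
```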
Error handling
  easy mistakes: vector vals where single nums are expected, and NULL values; funcs behave strangely and don't throw clean errors
  sanity-check with stopifnot(), like C's assert()
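A sketch of guarding a function against the two mistakes named above (a vector where one number is expected, and NULL). The function safe_sqrt is hypothetical; tryCatch() is used only to capture the error so the script keeps running.

```r
safe_sqrt <- function(x) {
  # Fail early and clearly instead of returning something odd.
  stopifnot(is.numeric(x), length(x) == 1, x >= 0)
  sqrt(x)
}

safe_sqrt(16)   # 4

# A vector where a single number is expected now raises a clean error.
msg <- tryCatch(safe_sqrt(c(1, 4)),
                error = function(e) conditionMessage(e))
msg

# NULL is rejected too, because is.numeric(NULL) is FALSE.
null_result <- tryCatch(safe_sqrt(NULL), error = function(e) e)
inherits(null_result, "error")   # TRUE
```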
/**********/ | |
/* R-time */ | |
/**********/ | |
R version 3.2.2 (2015-08-14) -- "Fire Safety" | |
Copyright (C) 2015 The R Foundation for Statistical Computing | |
Platform: x86_64-apple-darwin13.4.0 (64-bit) | |
R is free software and comes with ABSOLUTELY NO WARRANTY. | |
You are welcome to redistribute it under certain conditions. | |
Type 'license()' or 'licence()' for distribution details. | |
Natural language support but running in an English locale | |
R is a collaborative project with many contributors. | |
Type 'contributors()' for more information and | |
'citation()' on how to cite R or R packages in publications. | |
Type 'demo()' for some demos, 'help()' for on-line help, or | |
'help.start()' for an HTML browser interface to help. | |
Type 'q()' to quit R. | |
> a <- c(1,2,4) | |
> a | |
[1] 1 2 4 | |
> 2:200 | |
[1] 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 | |
[19] 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 | |
[37] 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 | |
[55] 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 | |
[73] 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 | |
[91] 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 | |
[109] 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 | |
[127] 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 | |
[145] 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 | |
[163] 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 | |
[181] 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 | |
[199] 200 | |
> a | |
[1] 1 2 4 | |
> a+1 | |
[1] 2 3 5 | |
> b <- c(20,40,80 | |
+ ) | |
> a+b | |
[1] 21 42 84 | |
> c <- 6*1:10 | |
> c | |
[1] 6 12 18 24 30 36 42 48 54 60 | |
> c <- 10*c | |
> a + c | |
[1] 61 122 184 241 302 364 421 482 544 601 | |
Warning message: | |
In a + c : longer object length is not a multiple of shorter object length | |
> c | |
[1] 60 120 180 240 300 360 420 480 540 600 | |
> summary(c) | |
Min. 1st Qu. Median Mean 3rd Qu. Max. | |
60 195 330 330 465 600 | |
> m <- matrix(log(1:9) nrow=3, ncol=3) | |
Error: unexpected symbol in "m <- matrix(log(1:9) nrow" | |
> m <- matrix(log(1:9), nrow=3, ncol=3) | |
> m | |
[,1] [,2] [,3] | |
[1,] 0.0000000 1.386294 1.945910 | |
[2,] 0.6931472 1.609438 2.079442 | |
[3,] 1.0986123 1.791759 2.197225 | |
> m/log(1) | |
[,1] [,2] [,3] | |
[1,] NaN Inf Inf | |
[2,] Inf Inf Inf | |
[3,] Inf Inf Inf | |
> m/log(10) | |
[,1] [,2] [,3] | |
[1,] 0.0000000 0.6020600 0.8450980 | |
[2,] 0.3010300 0.6989700 0.9030900 | |
[3,] 0.4771213 0.7781513 0.9542425 | |
> n -< matrix(2:4, nrow=3, ncol=4) | |
Error: unexpected '<' in "n -<" | |
> n <- matrix(2:4, nrow=3, ncol=4) | |
> n | |
[,1] [,2] [,3] [,4] | |
[1,] 2 2 2 2 | |
[2,] 3 3 3 3 | |
[3,] 4 4 4 4 | |
> m %*% n | |
[,1] [,2] [,3] [,4] | |
[1,] 11.94252 11.94252 11.94252 11.94252 | |
[2,] 14.53237 14.53237 14.53237 14.53237 | |
[3,] 16.36140 16.36140 16.36140 16.36140 | |
> eigen(m) | |
$values | |
[1] 4.533528795 -0.717104893 -0.009761412 | |
$vectors | |
[,1] [,2] [,3] | |
[1,] -0.4643796 -0.91571234 0.1622601 | |
[2,] -0.5837295 -0.07927824 -0.8040451 | |
[3,] -0.6660417 0.39393637 0.5719993 | |
> vec <- eigen(m$vectors[,3] | |
+ ) | |
Error in m$vectors : $ operator is invalid for atomic vectors | |
> vec <- eigen(m)$vectors[,3] | |
> vec | |
[1] 0.1622601 -0.8040451 0.5719993 | |
> | |
> m %*% vec | |
[,1] | |
[1,] -0.001583887 | |
[2,] 0.007848615 | |
[3,] -0.005583521 | |
> eigen(m)$values[3] * vec | |
[1] -0.001583887 0.007848615 -0.005583521 | |
> func <- function(a,b)(a^2 + b) | |
> func(5,1) | |
[1] 26 | |
> func(b=1,a=5) | |
[1] 26 | |
> func <- function(a,b=2)(a^2 + b) | |
> func(4) | |
[1] 18 | |
> func(4, 1) | |
[1] 17 | |
> mtcars | |
mpg cyl disp hp drat wt qsec vs am gear carb | |
Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 | |
Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 | |
Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 | |
Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 | |
Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 | |
Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 | |
Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 | |
Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 | |
Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 | |
Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 | |
Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4 | |
Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3 | |
Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3 | |
Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3 | |
Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4 | |
Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4 | |
Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4 | |
Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1 | |
Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2 | |
Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1 | |
Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1 | |
Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2 | |
AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2 | |
Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4 | |
Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2 | |
Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1 | |
Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2 | |
Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2 | |
Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4 | |
Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6 | |
Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8 | |
Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2 | |
> head(mtcars) | |
mpg cyl disp hp drat wt qsec vs am gear carb | |
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 | |
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 | |
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 | |
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 | |
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 | |
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 | |
> rownames(mtcars) | |
[1] "Mazda RX4" "Mazda RX4 Wag" "Datsun 710" | |
[4] "Hornet 4 Drive" "Hornet Sportabout" "Valiant" | |
[7] "Duster 360" "Merc 240D" "Merc 230" | |
[10] "Merc 280" "Merc 280C" "Merc 450SE" | |
[13] "Merc 450SL" "Merc 450SLC" "Cadillac Fleetwood" | |
[16] "Lincoln Continental" "Chrysler Imperial" "Fiat 128" | |
[19] "Honda Civic" "Toyota Corolla" "Toyota Corona" | |
[22] "Dodge Challenger" "AMC Javelin" "Camaro Z28" | |
[25] "Pontiac Firebird" "Fiat X1-9" "Porsche 914-2" | |
[28] "Lotus Europa" "Ford Pantera L" "Ferrari Dino" | |
[31] "Maserati Bora" "Volvo 142E" | |
> colnames(mtcars) | |
[1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear" | |
[11] "carb" | |
> mtcars[2] | |
cyl | |
Mazda RX4 6 | |
Mazda RX4 Wag 6 | |
Datsun 710 4 | |
Hornet 4 Drive 6 | |
Hornet Sportabout 8 | |
Valiant 6 | |
Duster 360 8 | |
Merc 240D 4 | |
Merc 230 4 | |
Merc 280 6 | |
Merc 280C 6 | |
Merc 450SE 8 | |
Merc 450SL 8 | |
Merc 450SLC 8 | |
Cadillac Fleetwood 8 | |
Lincoln Continental 8 | |
Chrysler Imperial 8 | |
Fiat 128 4 | |
Honda Civic 4 | |
Toyota Corolla 4 | |
Toyota Corona 4 | |
Dodge Challenger 8 | |
AMC Javelin 8 | |
Camaro Z28 8 | |
Pontiac Firebird 8 | |
Fiat X1-9 4 | |
Porsche 914-2 4 | |
Lotus Europa 4 | |
Ford Pantera L 8 | |
Ferrari Dino 6 | |
Maserati Bora 8 | |
Volvo 142E 4 | |
> mtcars[,2] | |
[1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4 | |
> mtcars[2,] | |
mpg cyl disp hp drat wt qsec vs am gear carb | |
Mazda RX4 Wag 21 6 160 110 3.9 2.875 17.02 0 1 4 4 | |
> c <- 2*1:10 | |
> c | |
[1] 2 4 6 8 10 12 14 16 18 20 | |
> c[2] | |
[1] 4 | |
> vec <- c | |
> vec[c(4, 5, 7)] | |
[1] 8 10 14 | |
> vec[-4] | |
[1] 2 4 6 10 12 14 16 18 20 | |
> vec[2:6] | |
[1] 4 6 8 10 12 | |
> attach(mtcars) | |
> model <- lm(mpg ~ wt, mtcars) | |
> summary(model) | |
Call: | |
lm(formula = mpg ~ wt, data = mtcars) | |
Residuals: | |
Min 1Q Median 3Q Max | |
-4.5432 -2.3647 -0.1252 1.4096 6.8727 | |
Coefficients: | |
Estimate Std. Error t value Pr(>|t|) | |
(Intercept) 37.2851 1.8776 19.858 < 2e-16 *** | |
wt -5.3445 0.5591 -9.559 1.29e-10 *** | |
--- | |
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 | |
Residual standard error: 3.046 on 30 degrees of freedom | |
Multiple R-squared: 0.7528, Adjusted R-squared: 0.7446 | |
F-statistic: 91.38 on 1 and 30 DF, p-value: 1.294e-10 | |
> model <- lm(mpg ~ log(wt), mtcars) | |
> model <- lm(mpg ~ wt, mtcars) | |
> model2 <- lm(mpg ~ log(wt), mtcars) | |
> summary(model2) | |
Call: | |
lm(formula = mpg ~ log(wt), data = mtcars) | |
Residuals: | |
Min 1Q Median 3Q Max | |
-3.7440 -2.0954 -0.3672 1.0709 6.6150 | |
Coefficients: | |
Estimate Std. Error t value Pr(>|t|) | |
(Intercept) 39.257 1.758 22.32 < 2e-16 *** | |
log(wt) -17.086 1.510 -11.31 2.39e-12 *** | |
--- | |
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 | |
Residual standard error: 2.669 on 30 degrees of freedom | |
Multiple R-squared: 0.8101, Adjusted R-squared: 0.8038 | |
F-statistic: 128 on 1 and 30 DF, p-value: 2.391e-12 | |
> model3 <- lm(mpg ~ log(wt) + qsec, mtcars) | |
> summary(model3) | |
Call: | |
lm(formula = mpg ~ log(wt) + qsec, data = mtcars) | |
Residuals: | |
Min 1Q Median 3Q Max | |
-4.0729 -1.3876 -0.4368 0.7493 5.4694 | |
Coefficients: | |
Estimate Std. Error t value Pr(>|t|) | |
(Intercept) 22.2967 4.4603 4.999 2.54e-05 *** | |
log(wt) -16.1783 1.2519 -12.923 1.47e-13 *** | |
qsec 0.8932 0.2224 4.016 0.000384 *** | |
--- | |
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 | |
Residual standard error: 2.177 on 29 degrees of freedom | |
Multiple R-squared: 0.878, Adjusted R-squared: 0.8696 | |
F-statistic: 104.3 on 2 and 29 DF, p-value: 5.661e-14 | |
> plot(wt, mpg) | |
> help("plot") | |
> plot(wt, mpg, type="p") | |
> help("plot") | |
> plot(wt, mpg, type="p") | |
> plot(wt, mpg, main="Vehical fuel efficiency versus weight", ylab="miler per gallon", type="p") | |
> plot(wt, mpg, main="Vehical fuel efficiency versus weight", ylab="miler per gallon", xlab="Weight in tons", type="p") | |
> model <- lm(mpg ~ wt, mtcars) | |
> abline(model) | |
Error in int_abline(a = a, b = b, h = h, v = v, untf = untf, ...) : | |
plot.new has not been called yet | |
> abline(model) | |
Error in int_abline(a = a, b = b, h = h, v = v, untf = untf, ...) : | |
plot.new has not been called yet | |
> |