in1 <- c(TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE)
x[in1]```
* the single ```[``` operator returns an object of the same class as the original, except when it extracts a single row or column of a matrix, which is dropped to a vector. To keep a matrix, pass the ``` drop=FALSE``` argument
* ```x[1,,drop=FALSE] ```
* ```attributes(y) ``` to list all attributes of a data structure
* ``` attr(y, "class")``` to print out one particular attribute
* lists also work as hashes ```x <- list(foo = 1:4, bar=0.6); x$foo ```
* multiple elements cannot be extracted using ```[[]] ``` or ```$``` referencing; use single ```[``` with a vector of names or indices for that
* to reach an element nested inside a list (or inside a list it contains), pass a vector of indices to ```[[]]```. e.g. with the list above, ``` x[[c(1,2)]]``` gives 2 (the second element of ```foo```)
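
The subsetting rules above can be tried out with a short sketch. The matrix `m` is a hypothetical example that does not appear in the notes; the list `x` is the one from the notes.

```r
# Hypothetical 2 x 3 matrix, filled column by column
m <- matrix(1:6, nrow = 2, ncol = 3)

m[1, ]                 # dropped to a plain integer vector: 1 3 5
m[1, , drop = FALSE]   # stays a 1 x 3 matrix

x <- list(foo = 1:4, bar = 0.6)

x["foo"]               # single bracket: a list of length 1
x[["foo"]]             # double bracket: the integer vector 1:4
x$foo                  # same as x[["foo"]]
x[c("foo", "bar")]     # several elements at once need the single bracket
x[[c(1, 2)]]           # recursive indexing: 2nd element of the 1st component, i.e. 2
```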
* Reading files
* usually, I do ```a <- read.table("specdata/110.csv", comment.char="", nrows=10, header=TRUE,sep=",") ```
* [Help page](http://stat.ethz.ch/R-manual/R-devel/library/utils/html/read.table.html) for ```read.table ```
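
A minimal, self-contained sketch of the ```read.table``` call above. The file contents and column names here are made up for illustration (the real ```specdata/110.csv``` is not reproduced), and reusing the guessed classes through ```colClasses``` is one possible way to speed up a later full read, not something the notes prescribe.

```r
# Write a small hypothetical CSV to a temporary file so the example runs anywhere
path <- tempfile(fileext = ".csv")
writeLines(c("Date,sulfate,nitrate,ID",
             "2001-01-01,3.1,0.5,110",
             "2001-01-02,NA,0.7,110",
             "2001-01-03,2.8,NA,110"), path)

# Same arguments as in the notes; nrows caps how many data rows are read
a <- read.table(path, comment.char = "", nrows = 2, header = TRUE, sep = ",")
dim(a)                     # 2 rows, 4 columns

# Classes guessed from the sample can be passed on to the full read
classes <- sapply(a, class)
full <- read.table(path, comment.char = "", header = TRUE, sep = ",",
                   colClasses = classes)
unlink(path)               # remove the temporary file
```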
* Large datasets
* ```version``` command gives useful output
<pre>
platform x86_64-pc-linux-gnu
arch x86_64
os linux-gnu
system x86_64, linux-gnu
status
major 2
minor 14.1
year 2011
month 12
day 22
svn rev 57956
language R
version.string R version 2.14.1 (2011-12-22)
</pre>
* numeric data are stored in 64 bits
* 1,500,000 rows of 120 columns of numeric data take up **1500000 × 120 × 8 bytes / 2^20 ≈ 1373 MB** (about 1.34 GB)
* dput without file name is a good way of seeing the real (underlying) data structure of R data
e.g. ```y <- data.frame(a=1, b=2, c="a") ``` and then ```dput(y) ```
<pre>
structure(list(a = 1, b = 2, c = structure(1L, .Label = "a", class = "factor")), .Names = c("a",
"b", "c"), row.names = c(NA, -1L), class = "data.frame")
</pre>
* alternatively, you can use ``` str(y)```
* ``` summary(y)``` can also be used
* to read from a url do
* ``` con <- url("http://www.google.com", "r")```
* ``` y <- readLines(con)```
* ``` head(y)``` to see the first few lines of the page
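
The memory estimate from the large-datasets notes above can be reproduced as plain arithmetic. The variable names are chosen for this sketch only, and the ```object.size``` check at the end is an extra sanity test on a small vector rather than part of the original notes.

```r
rows <- 1500000
cols <- 120
bytes_per_value <- 8          # a numeric (double) value takes 8 bytes = 64 bits

bytes <- rows * cols * bytes_per_value
mb <- bytes / 2^20            # about 1373 MB
gb <- bytes / 2^30            # about 1.34 GB

# Sanity check: a numeric vector costs roughly 8 bytes per element plus a small header
small <- numeric(1000)
object.size(small)
```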
* functions
* ```?sd ``` - gives information about function
* ```args(sd)``` gives info about arguments
* ```formals(sd) ``` gives info about formal parameters
* argument evaluation is lazy: an argument is evaluated only when the function body actually uses it
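
A small sketch of lazy evaluation and of inspecting formal arguments. The functions `f` and `g` are hypothetical examples written for this note.

```r
# b is never used in the body, so it is never evaluated
f <- function(a, b) {
  a^2
}
f(2)                               # works although b is missing: 4

# g uses b only after printing a, so the error shows up late
g <- function(a, b) {
  print(a)
  print(b)
}
res <- try(g(45), silent = TRUE)   # prints 45, then fails on the missing b

names(formals(f))                  # "a" "b"
args(f)                            # shows the argument list of f
```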
* namespaces and libraries
* ```search()``` gives the search path: the list of attached packages and environments
* ``` library(lattice)``` attaches the *lattice* package at position 2 of the search path, just after the global environment.
* R uses lexical scoping which is particularly useful for statistical computations
* in lexical scoping, the values of free variables are looked up in the environment where the function was **defined**. dynamic scoping looks them up in the environment from which the function was **called**
* function + its environment is called *closure*. calling ```environment(f)``` returns the environment. ```parent.env(environment(f))``` goes one step above
* if a function is defined inside another function, then the environment will be something funky like ```<environment: 0x24cc520>```
* ```ls(environment(f))``` gives symbols listed inside an environment
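
The scoping notes above can be illustrated with a short sketch. All names here (`make.power`, `cube`, `square`, `f`, `g`, `y`) are hypothetical and chosen only for this example.

```r
# A function that returns another function; n is a free variable inside pow
make.power <- function(n) {
  pow <- function(x) {
    x^n          # n is found in the environment where pow was defined
  }
  pow
}

cube <- make.power(3)
square <- make.power(2)
cube(3)                          # 27
square(3)                        # 9

ls(environment(cube))            # "n" "pow"
get("n", environment(cube))      # 3

# Lexical vs dynamic: y inside g is looked up where g was defined,
# not inside f where g is called
y <- 10
f <- function(x) {
  y <- 2
  y^2 + g(x)
}
g <- function(x) {
  x * y
}
f(3)                             # 2^2 + 3 * 10 = 34 (dynamic scoping would give 4 + 3 * 2 = 10)
```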
#Graphing
```xyplot(weight ~ Time | Diet, data=BodyWeight)``` = A set of 3 panels showing the relationship between weight and time for each diet.
Created
September 26, 2012 10:01
notes for Computing for Data Analysis coursera course