- `rbindlist` from `data.table` is very efficient at binding multiple rows of a data frame.
- According to here, the most efficient way to remove a column is to use `set` from `data.table`:

      library("data.table")
      # set from data.table
      set(my_df, j = "A", value = NULL)
- Semantics:
  - assignment:
    - When assigning a variable to another name, e.g. `a = b`, no new object is created: both names refer to the same object, and no data is copied until one of them is modified (copy-on-modify semantics).
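A minimal sketch of the copy-on-modify behaviour described above, using only base R. The variable names are chosen for illustration.

```r
b <- c(1, 2, 3)
a <- b         # a and b now name the same vector; nothing is copied yet
a[1] <- 100    # the first modification through a makes R copy the vector for a

print(a)       # 100 2 3
print(b)       # 1 2 3, b is unaffected by the change to a
```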
  - To `xor` booleans, use `xor(a, b)`.
  - Remainder and quotient: use `%%` for the remainder and `%/%` for the quotient.
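As a quick illustration of `xor`, `%%` and `%/%`, here is a small sketch in base R. The variable names are only for the example.

```r
x1 <- xor(TRUE, FALSE)   # TRUE: exactly one argument is TRUE
x2 <- xor(TRUE, TRUE)    # FALSE: both arguments are TRUE

r <- 7 %% 3              # remainder: 1
q <- 7 %/% 3             # quotient (integer division): 2

# For any pair of numbers the two operators satisfy q * divisor + r == dividend
q * 3 + r == 7           # TRUE
```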
  - To access a `list` inside a `list`, `[[index]]` must be used.
  - To return a vector from a `data.frame` or `data.table`, `df[[one_list_index]]` must be used.
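The difference between `[[ ]]` and `[ ]` can be sketched as follows, assuming a small hypothetical nested list and data frame:

```r
nested <- list(inner = list(1, "a"), n = 42)

inner_list <- nested[["inner"]]  # the inner list itself (length 2)
wrapped    <- nested["inner"]    # a list of length 1 that contains the inner list

df <- data.frame(x = 1:3, y = c("a", "b", "c"))

col_vec <- df[["x"]]  # an integer vector: 1 2 3
col_df  <- df["x"]    # still a data.frame, with a single column
```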
  - slicing:
    - Slicing happens when you `[]` a container (`vector`, `list`, etc.) using more than one index, generated by `seq`, `:` or `c()`. The indices can be integers or characters.
    - When slicing a list, a shallow copy of the selected subset is created. That is, a new list is created, but its elements are just references to the originals, subject to copy-on-modify semantics. See here for more.
    - Positive integer slicing
      - When slicing with positive integer(s), only the elements specified by the integers will be in the new subset.
    - Negative integer slicing
      - This works the opposite way to positive integer slicing: the elements specified by the integers are excluded from the subset. See here for more.
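The slicing rules above can be tried out with a short sketch. The vector `v` and list `l` are made up for the example.

```r
v <- c(a = 10, b = 20, c = 30, d = 40)

pos   <- v[c(1, 3)]       # positive indices: keeps elements 1 and 3
rng   <- v[2:4]           # a range generated by `:`
named <- v[c("a", "d")]   # character indices select by name
neg   <- v[-1]            # negative index: everything except element 1
neg2  <- v[-c(1, 2)]      # drops elements 1 and 2

l   <- list(x = 1:3, y = "text", z = TRUE)
sub <- l[c("x", "z")]     # slicing a list returns a new, shorter list
```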
  - `subset(x, select)` function

    The `subset` function can be used to remove a column easily:

        subset(df, select = -column_name_to_remove) # "column_name_to_remove" is not a character string, it is just the bare column name
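For instance, with a hypothetical data frame `df` that has the columns `id`, `score` and `note`, dropping columns with `subset` might look like this:

```r
df <- data.frame(id = 1:3, score = c(90, 80, 70), note = c("a", "b", "c"))

trimmed <- subset(df, select = -note)             # drop one column
minimal <- subset(df, select = -c(score, note))   # drop several columns at once

names(trimmed)   # "id" "score"
```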
  - Compare an array/data frame with a single value to generate a logical array/data frame of the same dimensions
    - Each element is compared with the value, and the result can be indexed in the same way as the original array/data frame. E.g. `v == value` or `dataframe$column_name == value`.
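A short sketch of element-wise comparison and of using the result as an index. The data is invented for the example.

```r
v    <- c(3, 7, 3, 9)
mask <- v == 3            # TRUE FALSE TRUE FALSE, same length as v
hits <- v[mask]           # the logical vector can be used as an index

df   <- data.frame(name = c("x", "y", "z"), value = c(1, 5, 1))
rows <- df[df$value == 1, ]   # keep only the rows where value equals 1

m     <- matrix(1:6, nrow = 2)
m_cmp <- m == 4           # a logical matrix with the same dim as m
```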
  - Count `TRUE`s
    - `which(x)`, where `x` is a logical vector/array, returns an integer vector of the indices of the `TRUE` elements, so its `length` equals `sum(x)`, i.e. the number of `TRUE`s.
    - `sum(x)` gives that count directly, without going through `which`.
    - It seems that `sum(bools)` is faster than `length(which(bools))` when `bools` is considerably long.
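Both ways of counting can be compared on a small, made-up logical vector:

```r
bools <- c(TRUE, FALSE, TRUE, TRUE, FALSE)

idx     <- which(bools)           # indices of the TRUE elements: 1 3 4
n_which <- length(which(bools))   # count via which()
n_sum   <- sum(bools)             # count directly: TRUE counts as 1, FALSE as 0
```

For the speed claim, timing both expressions on your own data (for example with `system.time`) is more reliable than a general rule.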
  - Def function:

        name_of_function = function(arg1, arg2 = 1) { # Arguments can have default values
            # expr
            # The return statement is not always necessary: the value of the last
            # expression evaluated in the function is returned automatically by R.
            return (expr) # If expr is omitted, NULL is returned. expr can even be a function
        }

    - To be precise, I will call this the definition of a `lambda` rather than of a normal function.
    - Here, the function is stored in a variable. `function` can also be used inside the body of another `function`.
    - It is also worth noting that a function can access variables defined in the environment where the function is defined.
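The points above (default arguments, implicit return, returning a function, and access to the defining environment) can be sketched together. `make_adder` and the other names are hypothetical.

```r
# A function that returns another function. The inner function is defined
# inside make_adder's body and can read n from the environment it was defined in.
make_adder <- function(n = 1) {        # n has a default value
  function(x) x + n                    # last expression, returned implicitly
}

add_one  <- make_adder()               # uses the default n = 1
add_five <- make_adder(5)

add_one(10)                            # 11
add_five(10)                           # 15

# return() without an expression yields NULL
no_value <- function() return()
is.null(no_value())                    # TRUE
```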
  - `stop`: `stop` is a function that is called with a message (or a condition object). It stops the execution of the current expression and executes an error action.
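A minimal sketch of raising an error with `stop` and catching it with `tryCatch`. `safe_sqrt` is a hypothetical helper written for this example.

```r
safe_sqrt <- function(x) {
  if (x < 0) {
    stop("x must be non-negative")   # aborts the call and signals an error
  }
  sqrt(x)
}

ok <- safe_sqrt(9)                   # 3

# tryCatch intercepts the error; here the handler just returns its message
msg <- tryCatch(
  safe_sqrt(-1),
  error = function(e) conditionMessage(e)
)
```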
  - `for` loop:

        for (each in collections) { # collections can be a vector, list, data frame, matrix, etc.
            expr
        }

    - Speeding up your R code - vectorisation tricks for beginners shows that loops are expensive on large data compared to the `apply` function family written in `R`, and that external calls to `C` functions are even quicker.
    - However, this is not always true, so it is better to benchmark and to understand what happens under the hood in order to use them correctly.
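To make the comparison concrete, here is the same computation written as a loop, with `sapply`, and as a vectorised expression. This is only a sketch on a tiny made-up vector; which variant is fastest on real data should be measured, for example with `system.time`.

```r
values <- c(1, 2, 3, 4)

# Explicit loop: accumulate the sum of squares element by element
total <- 0
for (v in values) {
  total <- total + v^2
}

# apply family: call an R function once per element
squares_apply <- sapply(values, function(v) v^2)

# Vectorised arithmetic: the element-wise work is done in compiled code
squares_vec <- values^2
```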
  - `while`, `if` and `else` work just like in `C`.
  - `switch`: `switch` in `R` is like a function: `switch(VALUE, COND1_ret_value, ...)`.
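A short sketch of `switch` used as an expression that returns a value. `describe` and its labels are hypothetical.

```r
# With a character value, switch returns the argument whose name matches;
# a trailing unnamed argument serves as the default.
describe <- function(kind) {
  switch(kind,
         circle = "round",
         square = "four sides",
         "unknown")
}

describe("circle")    # "round"
describe("hexagon")   # "unknown"

# With a numeric value, switch returns the argument at that position
pick <- switch(2, "first", "second", "third")   # "second"
```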
- Builtin data structures:
  - `vector` and `list`
    - `vector`
      - `vector` is a homogeneous container. Since there is only one type of element, the elements are stored contiguously.
      - `vector` also has lower memory consumption compared to `list` if `length` is not too large.
      - `vector(mode = "logical", length = 0)` is used to construct a `length`-long `vector` storing elements of type `mode`. For how elements are allocated, see `help(vector)`.
      - `c(...)` can be used to initialize a `vector`. It can also be used to combine vectors and new elements of the same type into one `vector` (not a `vector` of `vector`s).
    - `list`
      - `list` is a heterogeneous container, so it stores each element by storing a pointer to it. It is very useful since you can make a `list` of `list`s using `list(...)`.
      - `c(...)` can be used to combine a `list` and any other type of new elements into one `list` (not a `list` of `list`s).
      - To make a `list` of `list`s, you need to use `list(...)` to combine `list`s.
    - To append to a `list` or `vector`, you can use `list.append(.data, ...)` from package `rlist`, where `.data` is the container and `...` is the elements.
    - Insert: using `list.insert(.data, index, ...)` from `rlist`.
    - `push_front`: using `list.prepend(.data, ...)` from `rlist`.
  - `vector` of `logical`
    - To perform `&&`, `||` or `!` action on a `vector` of `logical`: use `&`, `|` or `!`.
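The construction and combination rules above can be sketched in base R. Note that the append, insert and prepend lines use base R's `append()` as a stand-in for the `rlist` functions mentioned above, so that the example runs without extra packages. This is a substitution for illustration, not the `rlist` API.

```r
flags <- vector(mode = "logical", length = 3)   # FALSE FALSE FALSE

nums   <- c(c(1, 2), 3, c(4, 5))                # one flat vector: 1 2 3 4 5
flat   <- c(list(1, "a"), list(TRUE))           # one flat list of length 3
nested <- list(list(1, "a"), list(TRUE))        # a list holding 2 lists

# Base R stand-ins for appending, inserting and prepending
appended  <- append(nums, 6)                    # 1 2 3 4 5 6
inserted  <- append(nums, 99, after = 2)        # 1 2 99 3 4 5
prepended <- append(nums, 0, after = 0)         # 0 1 2 3 4 5

# Element-wise logical operators on logical vectors
a <- c(TRUE, FALSE, TRUE)
b <- c(TRUE, TRUE, FALSE)
both   <- a & b                                 # TRUE FALSE FALSE
either <- a | b                                 # TRUE TRUE TRUE
not_a  <- !a                                    # FALSE TRUE FALSE
```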
- Builtin functions:
  - `help(x)`, `?x`
  - `??x`
    - Provide the manual page about `x` (`??x` searches the help pages for `x`).
  - `object.size(x)`
    - Get the size of an object.
  - `rm(x)`
    - Delete the name `x` and release its memory if no other names use it (due to copy-on-modify semantics).
  - `gc()`
    - Do garbage collection immediately. It can be useful to call after a large object has been removed, to return memory to the operating system. GC happens automatically without any user intervention, so normally a call to `gc()` isn't necessary, and it can hurt performance if called after the removal of every object. For more, see `help(gc)` and `help(gctorture)`.
  - `help(Memory)`:
    - Documents how objects are allocated in `R`.
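A small sketch of `object.size`, `rm` and `gc` working together. The object `big` is made up for the example, and the explicit `gc()` call is shown only for illustration, since R collects garbage on its own.

```r
big <- numeric(1e6)                          # one million doubles

size_bytes <- as.numeric(object.size(big))   # a little over 8e6 bytes
print(object.size(big), units = "MB")        # human-readable size

rm(big)                                      # delete the name big
removed <- !exists("big")                    # TRUE: the name is gone

invisible(gc())                              # request a collection right away
```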
- Making packages
  - Write a `DESCRIPTION` file at the root of the project:

        Package: Helloworld
        Title: What The Package Does (one line, title case required)
        Version: 0.1
        Author: person("First", "Last", email = "[email protected]",
        Maintainer:
        Description: What the package does (one paragraph)
        Depends: R (>= 3.1.0)
        License: What license is it under?
        LazyData: true
        ByteCompile: true
        RoxygenNote: 6.1.1

  - Put code into `root_of_pack/R/*.R`.
  - Then run `roxygenise()` from package `roxygen2` with the current working directory at the root of the project, or `roxygenise(root_of_project)`.

    The info above is from Creating R packages, the byte compiler, and from running `vignette("roxygen2", package = "roxygen2")` (it does not need `library("roxygen2")` to work).
  - Then run `R CMD check --check-subdirs=yes root_of_pack` and fix any errors.
  - Then run `R CMD build root_of_pack` to generate a `*.tar.gz`.
  - Run `R CMD check --check-subdirs=yes *.tar.gz`, where `*.tar.gz` is the file generated by the previous step.
  - Run `R CMD INSTALL *.tar.gz` to install the package.

  For more info on packages, check here.
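As an illustration of what a file under `root_of_pack/R/` might contain, here is a sketch of a documented function. The file name `hello.R` and the function `hello` are hypothetical. The `#'` comments are what `roxygenise()` reads to generate the manual page, and `@export` asks it to list the function in `NAMESPACE`.

```r
# Hypothetical contents of root_of_pack/R/hello.R

#' Greet someone
#'
#' @param name A character string with the name to greet.
#' @return A character string containing the greeting.
#' @export
hello <- function(name = "world") {
  paste0("Hello, ", name, "!")
}
```

Once the package is installed with `R CMD INSTALL`, the function would be called as `Helloworld::hello("R")`, given the package name in the `DESCRIPTION` file above.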