Last active
August 2, 2016 12:17
Basic R
# As you may have noticed earlier, I wrote some text
# in the code that was not evaluated.
# These are comments: you create them by simply
# putting "#" before the text.
###### I can also add more "#" to draw attention
# to a comment.
# Assign names
# We can do math with R as we would with a simple calculator
5 + 6 # Returns 11
2 * 5 # Returns 10
2 ^ 2 # Returns 2 to the power of 2 = 4
10 / 2 # Returns 5
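# A few more operators, as a minimal sketch of what the calculator can do.
# Parentheses change the order of evaluation, %/% is integer division
# and %% gives the remainder.
(2 + 3) * 4 # Returns 20
2 + 3 * 4 # Returns 14, multiplication is evaluated before addition
7 %/% 2 # Returns 3
7 %% 2 # Returns 1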
# But this is an inefficient way to deal with data: think of a dataset with 1000 rows
# whose values you want to sum.
# Summing every value by hand would take a while.
# So we can assign names to values
x <- 5
x # This prints the value of x, so we will see 5
x <- 7
x # As you can see the value of x has changed to 7
x = 5 # Same as above: for assignments like this you can use either "<-" or "="
x
# We can store whatever we want in a name
x <- "hello world"
x
x <- 5 * 5 # R will evaluate the expression "5 * 5" and then store its result in x
x
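# To check what kind of value a name holds we can use class().
# The names "a_number" and "a_text" are made up for this example.
a_number <- 5 * 5
a_text <- "hello world"
class(a_number) # Returns "numeric"
class(a_text) # Returns "character"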
# We can do operations with variables
x <- 5
y <- 2
x * y # Returns 10
x + y # Returns 7
x # The values of x and y are unchanged
y
z <- x + y * x / y
z # Returns 10
x
y
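# A short sketch of how R reads an expression like the one stored in z:
# "*" and "/" are evaluated before "+", so a + b * a / b means a + ((b * a) / b).
# The names a, b, z1 and z2 are used only in this example.
a <- 5
b <- 2
z1 <- a + b * a / b
z1 # Returns 10
z2 <- (a + b) * a / b # Parentheses force the sum to be evaluated first
z2 # Returns 17.5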
# Getting back to our previous example, to manage many elements we need vectors
x <- c(1, 2, 3, 4, 5) # We create vectors with the c() function
x
# Vectors can contain whole numbers (as above), floats or strings
x <- c(1.5, 2.3, 7.4) # Floats
y <- c("hello", "world", "bye", "moon") # Strings of text
# We can combine vectors,
# but remember that a vector can hold only one type: all numbers or all strings
z <- c(x, 3, 5)
z
z <- c(x, y) # Here R converts the floats in x to strings
z
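# A minimal sketch of the conversion above, using made-up names.
# class() shows the type of a vector, as.numeric() converts
# strings that look like numbers back to numbers.
nums <- c(1.5, 2.3)
mixed <- c(nums, "hello")
class(nums) # Returns "numeric"
class(mixed) # Returns "character"
mixed # Returns "1.5" "2.3" "hello"
as.numeric(c("1.5", "2.3")) # Returns 1.5 2.3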
# Operations with vectors
x <- c(1, 2, 3)
y <- c(4, 5, 6)
x + y # R sums every value of x with the corresponding value of y: returns 5 7 9
x * y # Returns 4 10 18
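# The same element-wise rule applies when one side is a single number:
# R reuses that number for every element. "v" is a name made up here.
v <- c(1, 2, 3)
v + 10 # Returns 11 12 13
v * 2 # Returns 2 4 6
v ^ 2 # Returns 1 4 9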
# What if we want to sum only the first value of x with the second value of y?
# We use indexing
x[1] # This prints the first element of x
y[2] # This prints the second element of y
x[1] + y[2] # Returns 6
x[1:2] # Returns the first two elements of x
x[1:2] + y[2:3] # Returns 6, 8
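# Two more ways of indexing, shown as a small sketch with a made-up vector "v":
# a negative index drops an element, a logical condition keeps
# only the elements for which it is TRUE.
v <- c(10, 20, 30, 40)
v[-1] # Returns 20 30 40
v[c(1, 3)] # Returns 10 30
v[v > 25] # Returns 30 40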
# Above we used ":" between two indexes;
# we can also use it to generate sequences of values
x <- 1:10
x # Returns 1 2 3 4 5 6 7 8 9 10
x <- 10:1
x # Returns the same sequence as before, but reversed
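# For sequences with a different step we can use seq(),
# and rev() reverses any vector. A short sketch:
seq(1, 10, by = 2) # Returns 1 3 5 7 9
seq(0, 1, by = 0.25) # Returns 0.00 0.25 0.50 0.75 1.00
rev(1:5) # Returns 5 4 3 2 1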
# Returning to the same example, 1000 values to sum.
# We will use a couple of functions to demonstrate
# the speed and the capability of vectors and R
set.seed(123)
x <- runif(1000)
# runif() is a function: a function takes one or more
# arguments and performs some operation with them.
# When in doubt, open the help with "?functionName".
# In this case we generate 1000 random numbers between 0 and 1.
# set.seed() is another function: it fixes the seed
# of the random number generator. We need it, otherwise you wouldn't get
# the same results as me
# (see previous sections Install R and Install RStudio)
sum(x) # sum() is a function that sums every element in a vector
# We can do the same for larger vectors
x <- runif(100000)
sum(x) # Returns 49931.65
x <- runif(1000000)
sum(x) # Returns 499638.4
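# A minimal sketch of what set.seed() does: with the same seed,
# runif() generates the same numbers again.
# The names "first_draw" and "second_draw" are made up for this example.
set.seed(123)
first_draw <- runif(5)
set.seed(123)
second_draw <- runif(5)
identical(first_draw, second_draw) # Returns TRUE
# Other functions work on whole vectors in the same way as sum()
mean(first_draw) # The average of the 5 values
length(first_draw) # Returns 5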
# Data Frame
# Usually when we deal with data we want them in tabular format:
# organized by columns and rows.
# R has a built-in structure for tabular data: the data frame
x <- 1:10
y <- letters[1:10] # We take the first 10 letters of the alphabet
z <- rep(c("male", "female"), 5) # rep() repeats its first argument the given number of times, here 5
df <- data.frame(id = x, status = y, sex = z)
df
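# A few functions to inspect a data frame, sketched on a small
# made-up example called "people".
people <- data.frame(id = 1:4, sex = rep(c("male", "female"), 2))
nrow(people) # Returns 4
ncol(people) # Returns 2
names(people) # Returns "id" "sex"
str(people) # Prints the type of each column
head(people, 2) # Returns the first 2 rows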
# We created a data frame from 3 vectors.
# As you can see it's organized as a table, and every column is a vector.
# We can select single rows and columns, or groups of them, and do operations on them
df[1,1] # Returns the value in the first row of the first column
df[1, ] # Returns the whole first row
df[ ,1] # Returns the first column
df[ ,"sex"] # Returns the column "sex"
# Another way to select columns is with the "$" operator
df$sex # Returns column "sex" as well
sum(df$id) # Returns 55
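# Indexing with a condition also works on rows. A minimal sketch,
# with a made-up data frame "people" so it runs on its own:
people <- data.frame(id = 1:10, sex = rep(c("male", "female"), 5))
females <- people[people$sex == "female", ] # Keep only the rows where sex is "female"
females$id # Returns 2 4 6 8 10
sum(females$id) # Returns 30
mean(people$id) # Returns 5.5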