@samlexrod
Last active March 14, 2018 23:10
Here is my R Statistics Tutorial for those interested in learning basic statistics with R. I will develop the structure of the lecture as I come up with new ideas and work on the drafts. I will explain the statistical concepts in depth, as well as the R code used.

Understanding the Cumulative Distribution Function (CDF)

Understanding the Normal Distribution

$$\Pr(a<x<b) = \int_a^b \frac{1}{s\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-m}{s}\right)^2}\, dx$$
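
As a quick numerical check of the formula above, the integral can be evaluated in R and compared with the built-in CDF. This is only a minimal sketch: the values of `a`, `b`, `m` and `s` are arbitrary choices for illustration, and `normal_pdf` is a hypothetical helper name, not an R function.

```r
# Illustrative values (assumptions, not taken from the tutorial)
m <- 0   # mean
s <- 1   # standard deviation
a <- -1  # lower bound
b <- 1   # upper bound

# The integrand written out by hand (hypothetical helper)
normal_pdf <- function(x) {
  1 / (s * sqrt(2 * pi)) * exp(-0.5 * ((x - m) / s)^2)
}

# Numerical integration of the density from a to b
by_integral <- integrate(normal_pdf, lower = a, upper = b)$value

# The same probability from the built-in CDF
by_pnorm <- pnorm(b, mean = m, sd = s) - pnorm(a, mean = m, sd = s)

print(c(integral = by_integral, pnorm = by_pnorm))
```

Both numbers should agree to several decimal places, which is why `pnorm()` is normally used instead of integrating by hand.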

Understanding the Empirical Rule

Intro:

Who does not like pizza? Well, maybe some people. But let's start by explaining the proportions of a whole pizza.

Using pnorm() to find proportions in the distribution:
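
One possible way to see the empirical rule with `pnorm()` is sketched below. It assumes a standard normal distribution (mean 0, standard deviation 1), and `within_k_sd` is a hypothetical helper name introduced only for this illustration.

```r
# Proportion of a standard normal distribution that lies
# within k standard deviations of the mean (hypothetical helper)
within_k_sd <- function(k) pnorm(k) - pnorm(-k)

proportions <- sapply(1:3, within_k_sd)
names(proportions) <- c("1 sd", "2 sd", "3 sd")

# Roughly 68%, 95% and 99.7% of the distribution
print(round(proportions, 4))
```

`pnorm(k)` gives the proportion of the distribution below `k`, so subtracting `pnorm(-k)` leaves the slice between `-k` and `k`.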

Creating a Matrix of Normal Distributions

sampling.dist <- matrix(rnorm(2000), nrow = 100)

Explanation:

rnorm(100)
rnorm(100, mean=0, sd=1)

rnorm() takes a count as its first argument and returns a vector of that many random values drawn from a normal distribution. The mean defaults to 0 and the standard deviation to 1, so the two calls above are equivalent: each draws 100 values from the standard normal distribution.
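
A short sketch of this behavior follows. The seed and the non-default `mean` and `sd` values are arbitrary assumptions chosen only to make the example reproducible and to show the parameters in use.

```r
set.seed(42)  # arbitrary seed so the draws are reproducible

x <- rnorm(100)                      # defaults: mean = 0, sd = 1
y <- rnorm(100, mean = 50, sd = 10)  # illustrative non-default parameters

length(x)  # the vector has as many values as requested
mean(x)    # close to 0, but not exactly 0, because the sample is random
sd(x)      # close to 1
mean(y)    # close to 50
```

The sample mean and standard deviation only approximate the requested parameters, and the approximation improves as the sample size grows.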

matrix(rnorm(100), nrow=25)
matrix(rnorm(100), nrow=24)

matrix() arranges a vector into a matrix with the given dimensions. In the first call above, it takes a vector of 100 random values drawn from a normal distribution and fills the matrix column by column: the first 25 values (indexes 1 to 25) go into the first column, the next 25 into the second, and so on. Since there are 100 values, the result has 4 columns of 25 values each (100/25). Because the number of columns (ncol) is not stated, the function uses as many columns as are needed to hold all the values given the nrow parameter. The second call still runs, but it produces a warning, because 100 values cannot be split evenly into columns of 24. It fills 4 columns completely (96 values in total) and puts the remaining 4 values at the top of a 5th column. The other 20 cells of column 5 are filled by recycling the values, starting again from index 1.
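
The fill order and the recycling are easier to see with ordered values than with random ones. The sketch below swaps `rnorm(100)` for `1:100` purely for illustration, and uses `suppressWarnings()` to hide the recycling warning that the second call would otherwise print.

```r
v <- 1:100  # ordered stand-in for rnorm(100), so positions are visible

# 100 values in rows of 25: fills evenly, column by column
m25 <- matrix(v, nrow = 25)
dim(m25)   # 25 rows, 4 columns
m25[, 1]   # the first column holds values 1 to 25

# 100 values in rows of 24: does not divide evenly
m24 <- suppressWarnings(matrix(v, nrow = 24))
dim(m24)     # 24 rows, 5 columns
m24[1:6, 5]  # 97 98 99 100 1 2: the last 4 values, then recycling from index 1
```

Running the second `matrix()` call without `suppressWarnings()` shows the warning about the data length not matching the number of rows.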
