Start RStudio, create a project, then create a data directory and an R script
Explain:
- difference between console and script
- a + prompt means R is waiting for more input
- commenting
- variable assignment
- functions and arguments (sqrt + round)
- using ??
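A minimal sketch of the points above, for typing into the script live. The variable names (weight_kg, root, rounded) are only illustrative and are not taken from the lesson.

```r
# Lines starting with # are comments; R ignores them
weight_kg <- 55                        # assign the value 55 to the name weight_kg

# A function takes arguments inside parentheses
root <- sqrt(weight_kg)                # one argument
rounded <- round(3.14159, digits = 2)  # a second, named argument
print(rounded)

# Help lookups, commented out so the script also runs non-interactively
# ?round       # help page for a known function name
# ??rounding   # search all help pages for a keyword
```

Running a line from the script sends it to the console; leaving a parenthesis open is a quick way to demonstrate the + prompt.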
Rest of lesson goes over how to get help online
Show webpage script example
Explain:
- math with assigned variables (exercise)
- numeric and character vectors
- length, class, str of vectors
- concatenate values to vectors
- subsetting vectors (challenge)
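The vector topics above could be demonstrated with something like the following. The values and names are made up for illustration.

```r
weight_kg <- 55
weight_lb <- 2.2 * weight_kg          # math with an assigned variable

weight_g <- c(50, 60, 65, 82)         # numeric vector
animals <- c("mouse", "rat", "dog")   # character vector

length(weight_g)   # number of elements
class(animals)     # type of the elements
str(weight_g)      # compact summary of the structure

weight_g <- c(weight_g, 90)   # concatenate a value at the end
weight_g <- c(30, weight_g)   # concatenate a value at the beginning

animals[2]                         # one element by position
animals[c(3, 2)]                   # several elements, in the order requested
heavy <- weight_g[weight_g > 60]   # subset with a logical condition
print(heavy)
```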
Download and read the CSV data file
- inspect with head() and str() (exercise)
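A sketch of the read-and-inspect step. Since these notes do not give the download URL, the example writes a tiny stand-in CSV to a temporary file; the rows are invented, and only the column names echo ones used later in these notes.

```r
# Stand-in for the downloaded file (made-up rows, for illustration only).
# In the lesson, download.file(url = ..., destfile = ...) would fetch the real file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "species_id,year,weight,hindfoot_length",
  "DM,1977,40,35",
  "DO,1977,48,36",
  "PF,1978,7,15"
), csv_path)

surveys <- read.csv(csv_path)

head(surveys)   # first rows
str(surveys)    # column types and a preview of the values
```

Note that since R 4.0.0, read.csv() keeps text columns as character by default; passing stringsAsFactors = TRUE converts them to factors.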
Factors are:
- categorical variables
- necessary for many statistical operations
- ordered alphabetically by default
Explain:
- nlevels(), levels()
- try math with factors
- convert to numeric (exercise)
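One possible way to show the factor behaviour described above, using made-up values. The point of the last lines is that as.numeric() on a factor returns the internal level codes, not the original numbers.

```r
sex <- factor(c("male", "female", "female", "male"))
levels(sex)    # "female" "male" (alphabetical by default)
nlevels(sex)   # 2

year_fct <- factor(c(1990, 1983, 1977, 1998, 1990))

codes <- as.numeric(year_fct)                 # level codes: 3 2 1 4 3
years <- as.numeric(as.character(year_fct))   # the actual years
years_alt <- as.numeric(levels(year_fct))[year_fct]   # same result via the levels

print(codes)
print(years)
```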
Create a DF:
- Imported from spreadsheet
- Manually
- (exercise)
Inspect a DF using: dim, nrow, ncol, head, tail, names, rownames, str, summary
- subsetting indices (explain sequences) (exercise)
- subsetting by column name
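A small sketch covering manual creation, inspection, and subsetting of a data frame. The data frame below is invented for illustration and only borrows column names from these notes.

```r
# Create a data frame manually (made-up values)
surveys <- data.frame(
  species_id = c("DM", "DO", "DM", "PF", "DO"),
  year = c(1977, 1977, 1978, 1978, 1979),
  weight = c(40, 48, 44, 7, 52),
  hindfoot_length = c(35, 36, 37, 15, 35)
)

dim(surveys)       # rows, then columns
nrow(surveys)
ncol(surveys)
names(surveys)
rownames(surveys)
summary(surveys)

1:3                          # a sequence: 1 2 3
surveys[1, 2]                # row 1, column 2
first_three <- surveys[1:3, ]   # rows 1 to 3, all columns
surveys[, c(1, 3)]           # all rows, columns 1 and 3

# Subsetting by column name: each of these returns the same vector
surveys$weight
surveys[["weight"]]
surveys[, "weight"]
```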
Explain:
- installing and loading packages
- bracket subsetting can be complicated, so use dplyr::filter, select
- create new columns using mutate
- calculate stats on groups using group_by
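The dplyr verbs above can be sketched on a toy data frame as follows. The data are made up, and pairing group_by with summarize() is an assumption about how the group statistics are computed in the lesson.

```r
# install.packages("dplyr")   # once per machine
library(dplyr)                # once per session

surveys <- data.frame(
  species_id = c("DM", "DO", "DM", "PF", "DO"),
  weight = c(40, 48, 44, 7, 52)
)

result <- surveys %>%
  filter(weight > 10) %>%                  # keep rows matching a condition
  select(species_id, weight) %>%           # keep named columns
  mutate(weight_kg = weight / 1000) %>%    # add a new column
  group_by(species_id) %>%                 # define the groups
  summarize(mean_weight = mean(weight))    # one row per group

print(result)
```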
Use base plotting to plot weight vs hindfoot length
Use ggplot2 to plot weight vs hindfoot length
- add transparency
- add colors
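A hedged sketch of both scatter plots on made-up data. The alpha value and the choice to color by species_id are illustrative, not prescribed by these notes.

```r
library(ggplot2)

surveys <- data.frame(
  species_id = c("DM", "DO", "DM", "PF", "DO"),
  weight = c(40, 48, 44, 7, 52),
  hindfoot_length = c(35, 36, 37, 15, 35)
)

# Base plotting
plot(x = surveys$weight, y = surveys$hindfoot_length)

# ggplot2: data, aesthetic mapping, then a geom layer
p <- ggplot(surveys, aes(x = weight, y = hindfoot_length)) +
  geom_point(alpha = 0.5,                # transparency
             aes(color = species_id))    # one color per species
print(p)
```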
Make ggplot2 boxplot of species id vs weight
- add points with geom_jitter
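The boxplot with jittered points might look like this. The data, the alpha value, and the point color are all assumptions made for illustration.

```r
library(ggplot2)

surveys <- data.frame(
  species_id = rep(c("DM", "DO", "PF"), each = 4),
  weight = c(40, 44, 42, 47, 48, 52, 50, 55, 7, 8, 6, 9)
)

p <- ggplot(surveys, aes(x = species_id, y = weight)) +
  geom_boxplot() +
  geom_jitter(alpha = 0.3, color = "tomato")   # raw points on top of the boxes
print(p)
```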
Make ggplot2 time series plot of species_id counts by year with geom_line
- separate lines (group) by species_id
- color by species_id
Create a separate plot for each species using facet_wrap
Introduce themes and labels (labs)
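A sketch that combines the time series, faceting, labels, and a theme. The counts are invented, the column name n is hypothetical, and theme_bw() is just one of the built-in themes that could be shown.

```r
library(ggplot2)

# Made-up yearly counts per species
yearly_counts <- data.frame(
  year = rep(c(1977, 1978, 1979), times = 2),
  species_id = rep(c("DM", "DO"), each = 3),
  n = c(3, 5, 4, 2, 2, 6)
)

p <- ggplot(yearly_counts,
            aes(x = year, y = n, group = species_id, color = species_id)) +
  geom_line() +                  # one line per species
  facet_wrap(~ species_id) +     # one panel per species
  labs(title = "Observed species over time",
       x = "Year of observation",
       y = "Number of individuals") +
  theme_bw()
print(p)
```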
Recap benefits of databases
- reproducible analysis
- ability to make very large queries
Connect to our SQLite file
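One way the connection step could be sketched, assuming the DBI and RSQLite packages; the lesson itself may use a different interface. A temporary file with a made-up surveys table stands in for the lesson's SQLite file, whose name is not given in these notes.

```r
library(DBI)

# Stand-in database file (hypothetical; the lesson's own file would go here)
db_path <- tempfile(fileext = ".sqlite")
con <- dbConnect(RSQLite::SQLite(), db_path)

# Made-up table so that the query below has something to read
dbWriteTable(con, "surveys",
             data.frame(species_id = c("DM", "DO", "DM"),
                        weight = c(40, 48, 44)))

dbListTables(con)   # tables available in the file

# The query runs inside the database; only the result comes back to R
heavy <- dbGetQuery(con,
                    "SELECT species_id, weight FROM surveys WHERE weight > 42")
print(heavy)

dbDisconnect(con)
```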