Which of the following is a principle of analytic graphics?
- Make judicious use of color in your scatterplots (NO)
- Don't plot more than two variables at a time (NO)
# Getting and Cleaning Data Project, Johns Hopkins Coursera
# Author: Michael Galarnyk
# 1. Merges the training and the test sets to create one data set.
# 2. Extracts only the measurements on the mean and standard deviation for each measurement.
# 3. Uses descriptive activity names to name the activities in the data set.
# 4. Appropriately labels the data set with descriptive variable names.
# 5. From the data set in step 4, creates a second, independent tidy data set with the average of each variable for each activity and each subject.
# Load packages and get the data
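The five steps listed above can be sketched on toy data. This is only an illustration under stated assumptions: the tables `train` and `test`, their column names, and the two activity labels are hypothetical stand-ins, not the project's real files or the author's actual script.

```r
library(data.table)

# Hypothetical toy stand-ins for the training and test sets.
train <- data.table(subject = c(1L, 1L), activity = c(1L, 2L),
                    `tBodyAcc-mean()-X` = c(0.2, 0.4),
                    `tBodyAcc-std()-X`  = c(0.1, 0.3),
                    `tBodyAcc-max()-X`  = c(0.9, 0.8))
test <- data.table(subject = c(2L, 2L), activity = c(1L, 1L),
                   `tBodyAcc-mean()-X` = c(0.6, 0.8),
                   `tBodyAcc-std()-X`  = c(0.5, 0.7),
                   `tBodyAcc-max()-X`  = c(0.7, 0.6))

# 1. Merge the training and test sets into one data set.
combined <- rbind(train, test)

# 2. Keep only the mean() and std() measurements.
keep <- grep("(mean|std)\\(\\)", names(combined), value = TRUE)
combined <- combined[, c("subject", "activity", keep), with = FALSE]

# 3. Replace activity codes with descriptive names (labels are assumed).
activity_labels <- c("WALKING", "SITTING")
combined[, activity := factor(activity, levels = 1:2, labels = activity_labels)]

# 4. Give the measurement columns descriptive names.
setnames(combined, keep, c("timeBodyAccMeanX", "timeBodyAccStdX"))

# 5. Average every variable for each subject and activity.
tidy <- combined[, lapply(.SD, mean), by = .(subject, activity)]
print(tidy)
```

The final table has one row per subject and activity pair, which is the shape step 5 asks for.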
<html>
<head><title>This is title</title></head>
<body>
Hello world
</body>
</html>
This assignment uses data from the UC Irvine Machine Learning Repository, a popular repository for machine learning datasets. In particular, we will be using the “Individual household electric power consumption Data Set” which I have made available on the course web site:
Dataset: Electric power consumption [20Mb]
Description: Measurements of electric power consumption in one household with a one-minute sampling rate over a period of almost 4 years. Different electrical quantities and some sub-metering values are available.
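As a rough sketch of how data like this might be read with `data.table`, the example below parses a tiny inline excerpt rather than the real file. The column names, the `;` separator, the `?` missing-value marker, and the date format are all assumptions made for illustration; check them against the actual download before relying on them.

```r
library(data.table)

# Hypothetical three-row excerpt; column names and separators are
# assumed here, not taken from the description above.
sample_text <- "Date;Time;Global_active_power
16/12/2006;17:24:00;4.216
16/12/2006;17:25:00;?
16/12/2006;17:26:00;5.374"

# Treat "?" as missing so the power column is read as numeric.
power <- fread(text = sample_text, sep = ";", na.strings = "?")

# Combine the date and time columns into a single timestamp.
power[, date_time := as.POSIXct(paste(Date, Time),
                                format = "%d/%m/%Y %H:%M:%S", tz = "UTC")]

# Average power, ignoring the missing reading.
mean_power <- power[, mean(Global_active_power, na.rm = TRUE)]
print(mean_power)
```

Declaring the missing-value marker up front avoids the column being read as text, which would otherwise break any numeric summary or plot.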
library("data.table")
GitHub repo for rest of specialization: Data Science Coursera
R was developed by statisticians working at...
The University of Auckland
Suppose I define the following function in R
Take a look at the 'iris' dataset that comes with R. The data can be loaded with the code:
library(datasets)
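One common way to load and inspect the dataset is sketched below. The `data(iris)` call and the per-species summary are illustrative additions, not necessarily the exact code the question goes on to show.

```r
library(datasets)

# Load the built-in iris data frame into the workspace.
data(iris)

# Inspect the structure: 150 observations of 5 variables.
str(iris)

# Illustrative summary: mean sepal length for each species.
species_means <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
print(species_means)
```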
What is produced at the end of this snippet of R code?
set.seed(1)
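The rest of the snippet is not shown here, so the sketch below does not reproduce it. It only illustrates what `set.seed` does: resetting the seed makes subsequent random draws repeatable. The use of `rnorm` is a hypothetical choice for the demonstration.

```r
# Draw three random numbers after fixing the seed.
set.seed(1)
first_draw <- rnorm(3)

# Reset the same seed and draw again.
set.seed(1)
second_draw <- rnorm(3)

# Both draws are identical because the generator restarted from the same state.
print(identical(first_draw, second_draw))
```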