cwickham · August 29, 2015 14:14
diff --git a/08-ftest-exercises.Rmd b/08-ftest-exercises.Rmd
 ---
 title: "F-test exercises"
 author: "ST552 Winter 2015"
 date: "January 23, 2015"
 output: html_document
 ---


 > In a study of cheddar cheese from the LaTrobe Valley of Victoria, Australia, samples of cheese were analyzed for their chemical composition and were subjected to taste tests. Overall taste scores were obtained by combining the scores from several tasters.
 >
 > `cheddar` is a data frame with 30 observations on the following 4 variables:
 >   
 > taste, a subjective taste score  
 > Acetic, concentration of acetic acid (log scale)  
 > H2S, concentration of hydrogen sulfide (log scale)  
 > Lactic, concentration of lactic acid

 The following model:
 $$
 \textbf{Full Model:} \quad \text{taste}_i = \beta_0 + \beta_1 \text{Acetic}_i + \beta_2 \text{H2S}_i +
 \beta_3 \text{Lactic}_i + \epsilon_i
 $$
 was fit in R and the output is shown below.
 ```{r}
 data(cheddar, package = "faraway")
 # full model, aka Model 6
 fit <- lm(taste ~ . , data = cheddar)
 summary(fit)
 ```

 ```{r, echo=FALSE, results='hide'}
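 # Model 1: intercept only
 fit_mean_only <- lm(taste ~ 1, data = cheddar)
 # Model 2: Acetic only
 fit_acetic <- lm(taste ~ Acetic, data = cheddar)
 # Model 3: Lactic and H2S, no intercept
 fit_LH <- lm(taste ~ Lactic + H2S - 1, data = cheddar)
 # Model 4: Lactic and H2S, with intercept (Acetic dropped)
 fit_notacetic <- lm(taste ~ Lactic + H2S, data = cheddar)
 # Model 5: Acetic and H2S constrained to share one coefficient
 fit_subspace <- lm(taste ~ I(Acetic + H2S) + Lactic, data = cheddar)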
 ```

 These alternative models were also fit:
 $$
 \begin{aligned}
 \textbf{Model 1:} \quad \text{taste}_i &= \beta_0 + \epsilon_i \quad &\text{RSS } = `r round(deviance(fit_mean_only), 1)`\\
 \textbf{Model 2:} \quad \text{taste}_i &= \beta_0 + \beta_1 \text{Acetic}_i + \epsilon_i \quad &\text{RSS } = `r round(deviance(fit_acetic), 1)`\\
 \textbf{Model 3:} \quad \text{taste}_i &= \beta_1 \text{H2S}_i + \beta_2 \text{Lactic}_i +
 \epsilon_i \quad &\text{RSS } = `r round(deviance(fit_LH), 1)`\\
 \textbf{Model 4:} \quad \text{taste}_i &= \beta_0  + \beta_1 \text{H2S}_i 
 + \beta_2 \text{Lactic}_i + \epsilon_i \quad &\text{RSS } = `r round(deviance(fit_notacetic), 1)`\\
 \textbf{Model 5:} \quad \text{taste}_i &= \beta_0 + \beta_1 (\text{Acetic}_i + \text{H2S}_i) +
 \beta_2 \text{Lactic}_i + \epsilon_i \quad &\text{RSS } = `r round(deviance(fit_subspace), 1)`\\
 \textbf{Model 6:} \quad \text{taste}_i &= \beta_0 + \beta_1 \text{Acetic}_i + 
  \beta_2 \text{H2S}_i + \beta_3 \text{Lactic}_i + \epsilon_i \quad &\text{RSS } = `r round(deviance(fit), 1)`
 \end{aligned}
 $$
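
 Each exercise below compares a smaller (null) model nested within a larger one. As a reminder, and assuming the usual normal-error linear model, the F-statistic for such a comparison can be written in terms of the two residual sums of squares. The symbols $\text{RSS}_0$, $\text{RSS}_1$, $p_0$, $p_1$ and $n$ are notation introduced here for illustration: the subscript 0 refers to the smaller model, the subscript 1 to the larger model, $p$ counts the regression parameters in each model, and $n$ is the number of observations (30 for `cheddar`).
 $$
 F = \frac{(\text{RSS}_0 - \text{RSS}_1)/(p_1 - p_0)}{\text{RSS}_1/(n - p_1)},
 \qquad F \sim F_{p_1 - p_0,\; n - p_1} \quad \text{under the null hypothesis}
 $$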

 \begin{enumerate}
 \item{Find the overall regression F-statistic.  Where is this reported in the R output?}
 ```{r, include=FALSE}
 anova(fit_mean_only, fit)
 ```

 \item{Find the F-statistic for testing the null hypothesis that $\beta_1 = 0$ in the full model.  What distribution should this statistic be compared to? Identify the equivalent t-test statistic and p-value in the \verb|lm| output.}
 ```{r, include=FALSE}
 anova(fit_notacetic, fit)
 ```
 \item{Find the F-statistic for testing the null hypothesis that $\beta_1 = 0$ in \textbf{Model 2}. The p-value for this test is $0.0017$. Why is this conclusion different to the one above?}
 ```{r, include=FALSE}
 anova(fit_mean_only, fit_acetic)
 ```

 \item{Find the F-statistic for testing the null hypothesis that $\beta_0 = \beta_1 = 0$ in the full model. Can you predict the conclusion from the R output?}
 ```{r, include=FALSE}
 anova(fit_LH, fit)
 ```

 \item{Find the F-statistic for testing the null hypothesis that $\beta_1 = \beta_2$ in the full model. What distribution should this F-statistic be compared to?}
 ```{r, include=FALSE}
 anova(fit_subspace, fit)
 ```
 \end{enumerate}