Stats Stuff

Analysis of Variance (ANOVA)

The Hypothesis Testing Process:

State hypotheses.
Collect data.
Construct a test statistic.
Compute a p-value.
Draw conclusions (in statistical terms and in context).

In R, there is a dataset titled warpbreaks that contains the number of warp breaks in yarn during weaving. Two different types of wool were weaved at three different levels of tension. Here is the structure of the data:

# no pec


        str(warpbreaks)

The numerical variable breaks is the number of warp breaks.
The categorical (factor) variable wool contains the two types of wool, A and B.
The categorical variable tension contains the three levels of tension, L, M and H.

We are interested in seeing if there is a difference in the number of warp breaks between the three tension levels.

State hypotheses

Suppose that we want to test to see if the mean number of warp breaks is the same at all three tension levels.
$H_0: \mu_L = \mu_M = \mu_H$
$H_A: \mu_i \neq \mu_j \text{ for some }i,j.$

Collect data

We will use the warpbreaks dataset in R.

Construct a test statistic and compute a p-value.

To conduct an Analysis of Variance (ANOVA) in R, we will use the aov() function.

# no pec


          aov(warpbreaks$breaks ~ warpbreaks$tension)

Notice that we are using '~' notation to specify that we comparing the numerical variable, breaks, between the tension groups, tension.

Just using the aov() function by itself is not very useful. To get the ANOVA table (with the test statistic and p-value), nest aov() inside of summary().

# no pec


          summary(aov(warpbreaks$breaks ~ warpbreaks$tension))

Using summary(aov()) is much more insightful for your ANOVA analyses.

Draw conclusions.

Using $\alpha = 0.05$, we can see that our p-value from the ANOVA table above is less than $\alpha$. $$pval = 0.00175 < 0.05 = \alpha$$
Therefore, we can reject our null hypothesis and say that there is evidence that the mean number of warp breaks between the 3 levels of tension are not all equal.

Pairwise Comparisons - (only if we reject the null hypothesis for ANOVA)

To make pairwise comparisons of each pair of populations, we will use the TukeyHSD() function with aov().

# no pec


          TukeyHSD(aov(warpbreaks$breaks ~ warpbreaks$tension))

Looking at the p adj column, we see that two out of the three p-values are less than $\alpha = 0.05$. This tells us that there is a difference in the average number of warp breaks between the 'M' and 'L' tensions and the 'H' and 'L' tensions. The average number of warp breaks between 'H' and 'M' could be equal.

By creating a side-by-side boxplot, you can also see these pairwise differences graphically.

# no pec


          boxplot(warpbreaks$breaks ~ warpbreaks$tension,
                  main = "Boxplot of Yarn Warp Breaks",
                  ylab = "Number of Breaks",
                  xlab = "Tension Level",
                  col = c("#66c2a5", "#fc8d62", "#8da0cb"))

Looking at this comparative boxplot, we can see that this provides confirmatory evidence regarding our conclusions above. There is a difference in the average number of warp breaks between the 'M' and 'L' tensions and the 'H' and 'L' tensions. The average number of warp breaks between 'H' and 'M' are more similar and their difference in the number of breaks could be due to chance.

Video Tutorial:

Another ANOVA example - Chick Weights

In R, there is a dataset titled chickwts that contains the weight of chickens on various feed supplements. There were six different feed types. Here is the structure of the data:

# no pec


        str(chickwts)

The numerical variable weight is the weight of the chicks.
The categorical (factor) variable feed contains the six types of feed.

We are interested in seeing if there is a difference in the average weight of chicks between the six feed supplements.

State hypotheses about the parameter

Suppose that we want to test to see if the mean weight of chicks is the same for all 6 feed supplements.
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 = \mu_6$
$H_A: \mu_i \neq \mu_j \text{ for some }i,j.$

Collect data

We will use the chickwts dataset in R.

Construct a test statistic and compute a p-value.

To conduct an Analysis of Variance (ANOVA) in R, we will use the aov() function along with summary().

# no pec


          summary(aov(chickwts$weight ~ chickwts$feed))

Draw conclusions.

Using $\alpha = 0.05$, we can see that our p-value from the ANOVA table above is very small compared to $\alpha$. $$pval = 5.94 \times 10^{-10} < 0.05 = \alpha$$
Therefore, we can reject our null hypothesis and say that there is evidence that the mean weight of chicks are not all equal between the different feed supplements.

Pairwise Comparisons - (only if we reject the null hypothesis for ANOVA)

To make pairwise comparisons of each pair of populations, we will use the TukeyHSD() function with aov().

# no pec


          TukeyHSD(aov(chickwts$weight ~ chickwts$feed))

Looking at the p adj column, we see that there are quite a few feed supplements that are significantly different from each other and some that are more similar.

By creating a side-by-side boxplot, you can also see these pairwise differences graphically.

# no pec


          boxplot(chickwts$weight ~ chickwts$feed,
                  main = "Boxplot of Chick Weights",
                  ylab = "Weight (grams)",
                  xlab = "Feed Supplement",
                  col = c("#8dd3c7", "#ffffb3", "#bebada", "#fb8072", "#80b1d3",
                          "#fdb462"))

Looking at this comparative boxplot, we can see that this provides confirmatory evidence regarding our conclusions above. For example, since the p-value for the difference between horsebean and casein is less than 0.05, we would expect a difference in the mean weight of chicks on these two feed types.

In contrast, there wouldn't be much of a difference between the mean chick weight for the linseed and soybean feed types because their p-value is greater than 0.05.