Two Sample Hypothesis Tests - Paired Sample

The Hypothesis Testing Process:
  1. State hypotheses about the parameter.
  2. Collect data.
  3. Construct a test statistic.
  4. Compute a p-value.
  5. Draw conclusions (in statistical terms and in context).

R includes a built-in dataset called sleep that contains data from a paired-sample study. Each of 10 subjects was given two different sleep-inducing drugs, and the amount of extra sleep they got with each drug, compared to control, was recorded. Here is what the data look like:

sleep

The variable extra is the amount of extra sleep the subject got compared to control.
The variable group contains the two categories of sleep-inducing drugs.
The variable ID contains each subject's unique ID number.

In addition, we can see that extra is a numerical variable, and group and ID are categorical (factor) variables.
str(sleep)

State hypotheses about the parameter

Suppose we want to test whether the mean amount of extra sleep a participant gets differs between the two drugs. Let $\mu_1$ and $\mu_2$ be the mean extra sleep under drug 1 and drug 2, respectively.
$H_0: \mu_1 - \mu_2 = 0$
$H_A: \mu_1 - \mu_2 \neq 0$

Collect data

We will use the sleep dataset in R.

Construct a test statistic and compute a p-value.

As with one-sample hypothesis tests, we will use the t.test() function. Setting paired = TRUE tells R that both measurements come from the same subjects.

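# Pass each drug's measurements as its own vector; newer R releases
# reject paired = TRUE in the formula interface.
t.test(sleep$extra[sleep$group == 1], sleep$extra[sleep$group == 2], paired = TRUE, mu = 0)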
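
To see what the paired test computes, the sketch below rebuilds the test statistic and p-value by hand from the per-subject differences, then checks them against a one-sample t-test on those differences. The names drug1, drug2, d, t_stat, p_val, and one_sample are chosen here for illustration, and the code assumes the rows of sleep are ordered by subject ID within each group (as they are in the built-in dataset).

```r
# Split the extra-sleep measurements by drug. Rows are ordered by subject ID
# within each group, so position i in both vectors is the same subject.
drug1 <- sleep$extra[sleep$group == 1]
drug2 <- sleep$extra[sleep$group == 2]

d <- drug1 - drug2                  # per-subject differences
n <- length(d)                      # number of pairs

# Test statistic under H0: mean difference = 0
t_stat <- mean(d) / (sd(d) / sqrt(n))

# Two-sided p-value from a t distribution with n - 1 degrees of freedom
p_val <- 2 * pt(-abs(t_stat), df = n - 1)

# The same test expressed as a one-sample t-test on the differences
one_sample <- t.test(d, mu = 0)

t_stat
p_val
```

A paired t-test is a one-sample t-test on the differences, which is why the pairing of observations matters and why the two vectors must line up subject by subject.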

Draw conclusions.

Using $\alpha = 0.05$, we can see that our p-value is less than $\alpha$: $$p\text{-value} = 0.002833 < 0.05 = \alpha$$
Therefore, we reject the null hypothesis: there is evidence of a difference in the average amount of extra sleep subjects get between these two sleep-inducing drugs.

If you wanted to investigate further and see which drug provided more sleep on average, constructing a side-by-side boxplot is a good way to start.

boxplot(sleep$extra ~ sleep$group,
        main = "Boxplot of Extra Sleep",
        ylab = "Hours of Extra Sleep",
        names = c("Drug 1", "Drug 2"),
        ylim = c(-2, 6),
        col = c("steelblue1", "royalblue"))
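
Because the samples are paired, a numeric complement to the boxplot is to look at each subject's own difference between the two drugs. The following is a minimal sketch; drug1, drug2, and diffs are names chosen here, and it again assumes the rows of sleep are ordered by subject ID within each group.

```r
drug1 <- sleep$extra[sleep$group == 1]
drug2 <- sleep$extra[sleep$group == 2]

# Positive values mean the subject got more extra sleep on drug 2
diffs <- drug2 - drug1
names(diffs) <- as.character(sleep$ID[sleep$group == 1])

diffs                 # difference for each subject, labelled by ID
mean(diffs)           # average within-subject difference
table(sign(diffs))    # count of subjects with a negative (-1), zero (0), or positive (1) difference
```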

Two Sample Hypothesis Tests - Independent Samples

State hypotheses

Suppose we want to test whether the mean fuel economy in miles per gallon (mpg) differs between automatic and manual transmission cars. Let $\mu_A$ and $\mu_M$ be the mean mpg of automatic and manual cars, respectively.
$H_0: \mu_A - \mu_M = 0$
$H_A: \mu_A - \mu_M \neq 0$

Collect data

We will use the mtcars dataset in R (first used on the Numerical Data Summary page).

Construct a test statistic and compute a p-value.

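As with one-sample hypothesis tests, we will use the t.test() function. With paired = FALSE and the default var.equal = FALSE, R runs Welch's two-sample t-test, which does not assume equal variances in the two groups.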

# Remember that group '0' is automatic and group '1' is manual.
t.test(mtcars$mpg ~ as.factor(mtcars$am), paired = FALSE, mu = 0)
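
As a rough illustration of what t.test() does for independent samples with its default settings, the sketch below computes Welch's statistic, the Welch-Satterthwaite degrees of freedom, and the two-sided p-value by hand. All variable names (auto, manual, se2_auto, se2_manual, t_stat, df_welch, p_val) are chosen here for illustration.

```r
auto   <- mtcars$mpg[mtcars$am == 0]   # automatic transmission
manual <- mtcars$mpg[mtcars$am == 1]   # manual transmission

# Squared standard error of each group mean
se2_auto   <- var(auto) / length(auto)
se2_manual <- var(manual) / length(manual)

# Welch's t statistic: difference in sample means over its standard error
t_stat <- (mean(auto) - mean(manual)) / sqrt(se2_auto + se2_manual)

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (se2_auto + se2_manual)^2 /
  (se2_auto^2 / (length(auto) - 1) + se2_manual^2 / (length(manual) - 1))

# Two-sided p-value
p_val <- 2 * pt(-abs(t_stat), df = df_welch)

c(t = t_stat, df = df_welch, p = p_val)
```

The degrees of freedom are generally not a whole number here, which is expected for Welch's test.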

Draw conclusions.

Using $\alpha = 0.05$, we can see that our p-value is less than $\alpha$: $$p\text{-value} = 0.001374 < 0.05 = \alpha$$
Therefore, we reject the null hypothesis: there is evidence of a difference in the average miles per gallon (mpg) between automatic and manual transmission cars.
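
The comparison against $\alpha$ can also be done in code rather than by reading the printed output, since t.test() returns an object whose components can be accessed by name. This is a small sketch; alpha, result, and reject_null are names chosen here.

```r
alpha <- 0.05

# Welch two-sample t-test of mpg by transmission type (0 = automatic, 1 = manual)
result <- t.test(mpg ~ as.factor(am), data = mtcars, mu = 0)

result$p.value     # p-value of the two-sided test
result$conf.int    # 95% confidence interval for the difference in means
result$estimate    # sample mean mpg in each group

# Decision rule: reject H0 when the p-value is below alpha
reject_null <- result$p.value < alpha
reject_null
```

A confidence interval that excludes 0 leads to the same decision as the p-value at the matching significance level.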

If you wanted to investigate further and see which transmission type provided better gas mileage on average, constructing a side-by-side boxplot is a good way to start.

boxplot(mtcars$mpg ~ as.factor(mtcars$am),
        main = "Boxplot of Miles per Gallon in Cars",
        ylab = "Miles per gallon (mpg)",
        xlab = "Transmission Type",
        names = c("Automatic", "Manual"),
        ylim = c(10, 40),
        col = c("#003056", "#8A8D8F"))
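
Alongside the boxplot, a quick numeric summary by group shows the direction and size of the difference. The sketch below uses tapply(); transmission, group_means, group_sds, and group_sizes are names chosen here.

```r
# Label the transmission codes so the output is self-describing
transmission <- factor(mtcars$am, levels = c(0, 1),
                       labels = c("Automatic", "Manual"))

group_means <- tapply(mtcars$mpg, transmission, mean)  # mean mpg per group
group_sds   <- tapply(mtcars$mpg, transmission, sd)    # spread within each group
group_sizes <- table(transmission)                     # number of cars per group

group_means
group_sds
group_sizes
```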