Confidence Intervals for Means
A confidence interval is built from three pieces: a point estimate, a critical value, and the standard error of the statistic. The interval is the point estimate plus or minus the critical value times the standard error.
Point Estimates
The point estimate for a population mean is the sample mean, usually denoted $\bar{X}$. We can compute it with the mean() function in R.
xbar <- mean(iris$Petal.Length)
xbar
The mean petal length for the flowers in the iris dataset is 3.758.
$\bar{X} = 3.758$
Critical Values
The critical value for a confidence interval for a mean is taken from a t distribution with $n - 1$ degrees of freedom. It depends on our value of $\alpha$ and the sample size, $n$. We will use the length() function to find the number of observations in the sample, and the qt() function to calculate the critical value.
n <- length(iris$Petal.Length)
n
alpha <- 0.05
t_crit <- qt(alpha / 2, df = n - 1, lower.tail = FALSE)
t_crit
$t_{\alpha/2,\, n-1}=t_{.025,\, 149} = 1.976013$
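The lower.tail = FALSE argument asks qt() for the upper-tail quantile. As a quick check, and assuming the same alpha and sample as above, the same critical value can be obtained from the lower tail at probability 1 - alpha / 2. The names t_upper and t_lower are illustrative only.

```r
# Sketch: two equivalent ways to get the upper-tail t critical value.
n <- length(iris$Petal.Length)
alpha <- 0.05

t_upper <- qt(alpha / 2, df = n - 1, lower.tail = FALSE)  # upper-tail form
t_lower <- qt(1 - alpha / 2, df = n - 1)                  # lower-tail form

# The two values agree up to floating-point tolerance.
all.equal(t_upper, t_lower)
```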
Standard Error
To get the standard error for a mean, we need the sample standard deviation, $s$, and the sample size, $n$. We will use the sd() function to find the sample standard deviation.
n <- length(iris$Petal.Length)
s <- sd(iris$Petal.Length)
s
st_error <- s / sqrt(n)
st_error
$s.e.(\bar{X}) = \frac{s}{\sqrt{n}} = 0.144136$
Creating the Confidence Interval
To create the confidence interval, we just need to combine all the pieces together: $\bar{X} \pm t_{\alpha/2,\, n-1} \times s.e.(\bar{X})$
n <- length(iris$Petal.Length)
alpha <- 0.05
xbar <- mean(iris$Petal.Length)
t_crit <- qt(alpha / 2, df = n - 1, lower.tail = FALSE)
s <- sd(iris$Petal.Length)
st_error <- s / sqrt(n)
# lower bound
xbar - t_crit * st_error
# upper bound
xbar + t_crit * st_error
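We can check this result with the t.test() function, which reports the same confidence interval as part of its output.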
t.test(iris$Petal.Length, conf.level = 0.95)
Don't worry about the rest of the output for right now. It will be discussed in the hypothesis testing section.
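If only the interval is needed, it can be pulled out of the object that t.test() returns. The following is a minimal sketch; the variable names x, tt, and manual_ci are illustrative and not part of the example above.

```r
# Sketch: extract the confidence interval from the t.test() result
# and compare it with the interval built by hand.
x <- iris$Petal.Length
tt <- t.test(x, conf.level = 0.95)
tt$conf.int  # lower and upper bounds, with a "conf.level" attribute

# Manual interval, using the same pieces as above
n <- length(x)
t_crit <- qt(0.05 / 2, df = n - 1, lower.tail = FALSE)
st_error <- sd(x) / sqrt(n)
manual_ci <- mean(x) + c(-1, 1) * t_crit * st_error

# Compare the two sets of bounds, ignoring the attribute on conf.int
all.equal(as.numeric(tt$conf.int), manual_ci)
```

Working with tt$conf.int avoids copying numbers from printed output, which helps when the interval feeds into later calculations.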
Confidence Intervals for Proportions
Point Estimates
The point estimate for a population proportion is the sample proportion, usually denoted $\hat{p}$. We will use the built-in HairEyeColor dataset as an example.
HairEyeColor
Notice that this dataset is cross-classified by Sex, Hair color, and Eye color, with one table per sex. To find the proportion of red-haired people, we sum all of the cells that contain red-haired people (the third row in both tables).
Number of Red-Haired People $= 10+10+7+7+16+7+7+7 = 71$
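The same count can be computed in R rather than by hand. This sketch assumes the first dimension of the HairEyeColor table is Hair with a level named "Red", which is how the built-in dataset is laid out; red_count is an illustrative name.

```r
# Sketch: count red-haired people by indexing the 3-d table.
# Dimensions are Hair x Eye x Sex; leaving Eye and Sex blank keeps all levels.
red_count <- sum(HairEyeColor["Red", , ])
red_count
```

Indexing by the level name rather than by row position keeps the code correct even if the rows were reordered.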
In order to get the value of $\hat{p}$, we need the sample size, $n$. We will use the sum() function, in this instance, to find the total number of people in the sample.
n <- sum(HairEyeColor)
n
phat <- 71 / n # 71 is the number of red-haired people
phat
$\hat{p} = \frac{71}{592} = 0.1199324$
$n = 592$
Critical Values
The critical value for a confidence interval for a proportion is taken from a standard normal distribution, so it depends only on our value of $\alpha$. We will use the qnorm() function to calculate the critical value.
alpha <- 0.05
z_crit <- qnorm(alpha / 2, lower.tail = FALSE)
z_crit
$z_{\alpha/2}=z_{.025} = 1.959964$
Standard Error
To get the standard error for a proportion, we will simply need the values of $\hat{p}$ and $n$.
n <- sum(HairEyeColor)
phat <- 71 / n
st_error <- sqrt(phat * (1 - phat) / n)
st_error
$s.e.(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.01335259$
Creating the Confidence Interval
To create the confidence interval, we just need to combine all the pieces together: $\hat{p} \pm z_{\alpha/2} \times s.e.(\hat{p})$
n <- sum(HairEyeColor)
alpha <- 0.05
phat <- 71 / n
z_crit <- qnorm(alpha / 2, lower.tail = FALSE)
st_error <- sqrt(phat * (1 - phat) / n)
# lower bound
phat - z_crit * st_error
# upper bound
phat + z_crit * st_error
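Because the same steps apply to any count of successes, they can be wrapped in a small helper. The function below is only a sketch: prop_ci and its argument names are hypothetical, not part of base R, and it implements the normal-approximation interval built above.

```r
# Sketch: normal-approximation confidence interval for a proportion.
# prop_ci is a hypothetical helper name, not a base R function.
prop_ci <- function(successes, n, conf_level = 0.95) {
  alpha <- 1 - conf_level
  phat <- successes / n
  z_crit <- qnorm(alpha / 2, lower.tail = FALSE)
  st_error <- sqrt(phat * (1 - phat) / n)
  c(lower = phat - z_crit * st_error,
    upper = phat + z_crit * st_error)
}

# Red-haired people in HairEyeColor: 71 out of 592
ci <- prop_ci(71, sum(HairEyeColor))
ci
```

Passing a different conf_level, for example prop_ci(71, 592, conf_level = 0.90), changes only the critical value, so the interval narrows while the point estimate and standard error stay the same.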