Normal Approximations

In the 2016 presidential election, about 39% of eligible voters aged 18-24 actually voted. If that rate holds for the next presidential election, what is the probability that more than half of the people in a sample of 1000 eligible 18-24 year olds will vote?

Find out how to register to vote in your state.

The Normal Approximation to the Binomial Distribution

If the probability that a randomly selected person will vote in the next election is 0.39, how would we find the probability that more than half of the people in a sample of 1000 will vote? Since the number of trials is fixed (1000), the trials are independent (it's a random sample), each trial results in either a success (vote) or failure, and the probability of success is constant (p=0.39), a random variable that counts the number of successes in the sample has a binomial distribution.

Let X denote the number of people in the sample who will vote, so that $\small{X\sim Binomial(1000,0.39)}$. Then $\small{P(X>500) = P(X=501) + P(X=502) + \cdots + P(X=1000)}$. A computer can evaluate this sum quickly, but by hand it is a daunting calculation. Understanding the normal approximation to the binomial distribution makes it possible to approximate such probabilities very quickly.
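To make the sum above concrete, here is a minimal sketch of how software might evaluate it term by term. The function name `binomial_upper_tail` is hypothetical, and exact rational arithmetic is just one possible choice for avoiding floating-point underflow in the very small terms.

```python
from fractions import Fraction
from math import comb

def binomial_upper_tail(n, p, k):
    """Return P(X > k) for X ~ Binomial(n, p) by summing the pmf exactly."""
    # Convert p to an exact fraction (0.39 becomes 39/100) so that the
    # tiny terms of the sum are not lost to floating-point underflow.
    p = Fraction(str(p))
    q = 1 - p
    total = sum(comb(n, x) * p**x * q**(n - x) for x in range(k + 1, n + 1))
    return float(total)

# P(X > 500) for X ~ Binomial(1000, 0.39): 500 terms, from x = 501 to x = 1000
print(binomial_upper_tail(1000, 0.39, 500))
```

The loop adds exactly the terms $\small{P(X=501)}$ through $\small{P(X=1000)}$, which is the calculation that is impractical by hand.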

What is the normal approximation to the Binomial Distribution?

A binomial random variable is a sum of independent Bernoulli random variables. Thus, the Central Limit Theorem indicates that, under certain conditions, the distribution will be approximately normal.

Use the applet below to investigate the conditions under which the normal curve gives a close approximation to the binomial(n,p) distribution. Use the sliders to change the values of $n$ and $p$. Check 'Show Normal Curve' to plot the normal curve over the graph of the binomial distribution. Move the yellow dot on the x-axis to illustrate and find $\small{P(X \leq x)}$ using both distributions.

The Normal Approximation to the Binomial Distribution

The mean and variance of a binomial random variable, X, are $\small{E(X) = np}$ and $\small{Var(X) = np(1-p)}.$ Thus a $\small{N(np,np(1-p))}$ distribution approximates a Binomial(n,p) distribution. This approximation works best if $\small{np>5}$ and $\small{n(1-p)>5}$.

If $\small{X\sim Binomial(n,p)}$ then $\small{X\stackrel{.}{\sim}N(np, np(1-p))}$ if $\small{np>5}$ and $\small{n(1-p)>5}$.
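The rule above can be sketched as a small function. This is an illustration only: the names `normal_cdf` and `binomial_normal_approx_cdf` are hypothetical, and raising an error when the rule of thumb fails is a design choice made for this sketch, not part of the rule itself.

```python
from math import erfc, sqrt

def normal_cdf(x, mean, var):
    """P(Y <= x) for Y ~ N(mean, var), via the complementary error function."""
    return 0.5 * erfc(-(x - mean) / sqrt(2 * var))

def binomial_normal_approx_cdf(x, n, p):
    """Approximate P(X <= x) for X ~ Binomial(n, p) using N(np, np(1-p))."""
    # Rule of thumb from the text: np > 5 and n(1-p) > 5
    if not (n * p > 5 and n * (1 - p) > 5):
        raise ValueError("rule of thumb np > 5 and n(1-p) > 5 is not met")
    return normal_cdf(x, n * p, n * p * (1 - p))

# Example: approximate P(X <= 60) for X ~ Binomial(100, 0.5)
print(binomial_normal_approx_cdf(60, 100, 0.5))
```

Note that the second parameter of $\small{N(np, np(1-p))}$ is the variance, so the standard deviation used for standardizing is its square root.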

Example: Voter Turnout

In the 2016 presidential election, about 39% of eligible voters aged 18-24 actually voted. If that rate holds for the next presidential election, what is the probability that more than half of the people in a sample of 1000 eligible 18-24 year olds will vote?

Let X denote the number of people in the sample who will vote in the next election. $\small{X\sim Binomial(1000,0.39)}$. $\small{E(X) = np = 1000*0.39 = 390}$ and $\small{Var(X) = np(1-p) = 1000(0.39)(0.61) = 237.9}$. Thus $\small{X\stackrel{.}{\sim}N(390,237.9)}$.

$\small{P(X > 500) \approx P(Z > \frac{500-390}{\sqrt{237.9}}) = 1 - \Phi(7.132) = 4.98 \times 10^{-13}}$.

Using software to get the probability from the binomial distribution, we find $\small{P(X > 500) = 7.37\times 10^{-13}}$. So if the probability of voting remains at 0.39, it is very unlikely that more than half of a sample of 1000 people would vote. Of course, a variety of factors influence whether or not people vote. If the probability of voting increases, the probability that more than half of a sample of 1000 people votes could increase dramatically.
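The standardization in this example can be reproduced in a few lines. This is a sketch using only Python's standard library; the variable names are chosen for illustration, and the upper-tail probability is computed with `math.erfc` rather than a normal table.

```python
from math import erfc, sqrt

n, p = 1000, 0.39
mean = n * p                  # E(X) = np
var = n * p * (1 - p)         # Var(X) = np(1-p)

# Standardize the cutoff 500, then take the upper tail of the standard normal.
z = (500 - mean) / sqrt(var)
upper_tail = 0.5 * erfc(z / sqrt(2))   # equals 1 - Phi(z)

print(f"mean = {mean:.1f}, variance = {var:.1f}, z = {z:.3f}")
print(f"P(X > 500) is approximately {upper_tail:.3g}")
```

Using `erfc` for the tail, instead of computing `1 - Phi(z)` directly, avoids losing precision when the tail probability is this small.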

The Normal Approximation to the Poisson Distribution

What is the normal approximation to the Poisson Distribution?

Sums of i.i.d. Poisson random variables are also Poisson random variables. Thus, the Central Limit Theorem implies that, under certain conditions, the Poisson distribution will closely follow the normal distribution.

Use the applet below to investigate the conditions under which the normal curve gives a close approximation to the $\small{Poisson(\lambda)}$ distribution. Use the slider to change the value of $\small{\lambda}$. Check 'Show Normal Curve' to plot the normal curve over the graph of the Poisson distribution.

The Normal Approximation to the Poisson Distribution

The mean and variance of a $\small{Poisson(\lambda)}$ random variable, X, are $\small{E(X) = \lambda}$ and $\small{Var(X) = \lambda}$. Thus a $\small{N(\lambda,\lambda)}$ distribution approximates a $\small{Poisson(\lambda)}$ distribution. This approximation works best if $\small{\lambda>5}$.

If $\small{X\sim Poisson(\lambda)}$ then $\small{X\stackrel{.}{\sim}N(\lambda, \lambda)}$ if $\small{\lambda>5}$.
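As with the binomial case, the rule can be sketched as a short function. The name `poisson_normal_approx_cdf` is hypothetical, and raising an error when $\small{\lambda \leq 5}$ is an assumption made for this sketch.

```python
from math import erfc, sqrt

def poisson_normal_approx_cdf(x, lam):
    """Approximate P(X <= x) for X ~ Poisson(lam) using N(lam, lam)."""
    # Rule of thumb from the text: lambda > 5
    if not lam > 5:
        raise ValueError("rule of thumb lambda > 5 is not met")
    # Mean and variance are both lam, so the standard deviation is sqrt(lam).
    return 0.5 * erfc(-(x - lam) / sqrt(2 * lam))

# Example: approximate P(X <= 12) for X ~ Poisson(10)
print(poisson_normal_approx_cdf(12, 10))
```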

Example: Bacteria

When a liquid solution with a pathogenic bacterium has an initial cell count of approximately 10⁹ CFU/ml ("Colony-Forming Units" estimate the number of viable bacteria in a sample), and undergoes a 10-fold dilution of bacterial cultures, chemists observe an average of 1 CFU/2 μl (one colony-forming unit per two microliters) left in the solution after a 24-hour incubation period.

What is the probability that, after a pathogenic bacterium solution is put through a 10-fold dilution of bacterial cultures and then left for a 24-hour incubation period, there are 40 or fewer colony-forming units in 100 microliters of solution?

Let X count the number of colony-forming units in 100 microliters of solution. Since there is an average of 1 CFU/2 μl, the mean for 100 μl is 50. Thus $X\sim Poisson(50)$, and so $X\stackrel{\cdot}{\sim}N(50,50)$.

Using the normal approximation, $P(X\leq 40)\approx 0.079$.

Using a calculator or software to find $P(X\leq 40)$ exactly, the result is 0.086.

So, by the normal approximation, there is about a 7.9% chance that there will be 40 or fewer colony-forming units of the bacteria in 100 μl of solution.
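The two probabilities in this example can be compared with a short sketch. It assumes only Python's standard library; the helper name `poisson_cdf` is hypothetical, and it assumes a cutoff `k` of at least 0.

```python
from math import erfc, exp, sqrt

def poisson_cdf(k, lam):
    """Exact P(X <= k) for X ~ Poisson(lam), assuming k >= 0."""
    term = exp(-lam)          # P(X = 0)
    total = term
    for x in range(1, k + 1):
        term *= lam / x       # P(X = x) obtained from P(X = x - 1)
        total += term
    return total

lam = 100 / 2                 # 1 CFU per 2 microliters, over 100 microliters
approx = 0.5 * erfc((lam - 40) / sqrt(2 * lam))   # Phi((40 - 50) / sqrt(50))
exact = poisson_cdf(40, lam)

print(f"normal approximation: {approx:.3f}")
print(f"exact Poisson:        {exact:.3f}")
```

The gap between the two values gives a sense of how accurate the approximation is at $\small{\lambda = 50}$.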