All knowledge degenerates into probability.


The Central Limit Theorem

The Central Limit Theorem (CLT) says that the distribution of a sum of independent random variables drawn from a given population with finite variance is increasingly well approximated by a normal distribution as the sample size increases, regardless of what the population distribution looks like. Since means and proportions are just sums rescaled by a constant, their distributions are also approximately normal for large samples. In practice, this implies that, if the sample size is large enough, the distribution of a sum, mean, or proportion is approximately normal.

We've seen previously that sums of normally distributed random variables are normally distributed. The Central Limit Theorem indicates that sums of independent random variables from other distributions are also approximately normally distributed when the random variables being summed come from the same distribution and there is a large number of them (a common rule of thumb is that 30 is large enough, although strongly skewed populations may need more).
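A small simulation can make this concrete. The sketch below is only an illustration under assumed settings: the exponential population with rate 1, the choice of n = 30, the 20,000 replications, the fixed seed, and the helper name `simulate_sums` are all choices made here, not part of the text. It uses only the Python standard library and compares the share of simulated sums that fall within one standard deviation of the theoretical mean with the corresponding share for a normal distribution.

```python
import math
import random
import statistics

def simulate_sums(n, reps, rate=1.0, seed=0):
    """Return `reps` sums, each of n i.i.d. Exponential(rate) draws."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(rate) for _ in range(n)) for _ in range(reps)]

n, reps = 30, 20_000
sums = simulate_sums(n, reps)

# For Exponential(1): E(X_i) = 1 and Var(X_i) = 1, so the CLT suggests
# that the sum is approximately N(n * 1, n * 1).
mu_sum = n * 1.0
sd_sum = math.sqrt(n * 1.0)

# Share of simulated sums within one standard deviation of the mean,
# next to the same share under an exact normal distribution.
within_one_sd = sum(abs(s - mu_sum) <= sd_sum for s in sums) / reps
std_normal = statistics.NormalDist()
normal_share = std_normal.cdf(1) - std_normal.cdf(-1)

print(f"simulated mean of sums: {statistics.fmean(sums):.3f} (theory {mu_sum:.3f})")
print(f"simulated sd of sums:   {statistics.stdev(sums):.3f} (theory {sd_sum:.3f})")
print(f"share within 1 sd:      {within_one_sd:.3f} (normal {normal_share:.3f})")
```

Even though an exponential population is strongly right-skewed, the simulated share should land close to the normal value. Rerunning with a smaller `n` is a quick way to see the approximation degrade.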

Note: When random variables are independent and from the same distribution, we say that they are independent and identically distributed or i.i.d.

NOTATION: $\stackrel{\cdot}{\sim}$ indicates an approximate distribution, thus $X\stackrel{\cdot}{\sim}N(\mu, \sigma^2)$ reads 'X is approximately $N(\mu, \sigma^2)$ distributed'.




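If $X_1, X_2, \ldots, X_n$ are independent and identically distributed random variables such that $E(X_i) = \mu$ and $Var(X_i) = \sigma^2$, and $n$ is large enough, then

$\sum_{i=1}^{n}X_i \stackrel{\cdot}{\sim}N\left(n\mu, n\sigma^2\right)$ and $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n}X_i \stackrel{\cdot}{\sim}N\left(\mu, \frac{\sigma^2}{n}\right)$.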


"Time headway" in traffic flow is the elapsed time between when one car finishes passing a fixed point and the instant that the next car begins to pass that point. Let $X_i$ denote the time headway, in seconds, for a randomly chosen pair of consecutive cars on a freeway. Then $f(x)=.15e^{-.15(x-.5)}$, for $x \geq 0.5$, and $f(x)=0$ otherwise. $\small{E(X_i) = 7.167}$ and $\small{Var(X_i) = 44.44}$.

Use the CLT to find the approximate distribution of total seconds of time headway for 50 randomly selected pairs of consecutive cars on a freeway.


Since 50 is a "large enough" sample size, the Central Limit Theorem indicates that $\small{\sum_{i=1}^{50}X_i \stackrel{.}{\sim}N\left(50\cdot7.167, 50\cdot 44.44\right)}$. That is $\small{\sum_{i=1}^{50}X_i \stackrel{.}{\sim}N\left(358.35, 2222\right)}$.
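The moments quoted in this example can be checked directly. The sketch below relies on the fact that the given density describes a shifted exponential, that is, 0.5 plus an exponential random variable with rate 0.15; the variable names are choices made here for illustration. It then scales the mean and variance by n = 50 to get the parameters of the approximate normal distribution of the sum.

```python
import math

RATE = 0.15   # rate in f(x) = .15 e^{-.15 (x - .5)}
SHIFT = 0.5   # the density is zero below 0.5 seconds

# X_i = SHIFT + Exponential(RATE), so
#   E(X_i)   = SHIFT + 1 / RATE
#   Var(X_i) = 1 / RATE^2   (a constant shift does not change the variance)
mean_x = SHIFT + 1 / RATE
var_x = 1 / RATE**2

# CLT approximation for the sum of n i.i.d. headways: N(n * mean, n * variance).
n = 50
mean_sum = n * mean_x
var_sum = n * var_x

print(f"E(X_i) = {mean_x:.3f}, Var(X_i) = {var_x:.2f}")
print(f"sum of {n}: approx N({mean_sum:.2f}, {var_sum:.2f}), sd = {math.sqrt(var_sum):.2f}")
```

With unrounded moments the mean of the sum comes out near 358.33 rather than 358.35; the small difference arises only from rounding $E(X_i)$ to 7.167 before multiplying by 50.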


"Time headway" in traffic flow is the elapsed time between when one car finishes passing a fixed point and the instant that the next car begins to pass that point. Let $X_i$ denote the time headway, in seconds, for a randomly chosen pair of consecutive cars on a freeway. Then $f(x)=.15e^{-.15(x-.5)}$, for $x \geq 0.5$, and $f(x)=0$ otherwise.

What is the probability that the mean time headway, in seconds, for 100 randomly selected pairs of consecutive cars on a freeway is greater than 7.5 seconds?


$\small{E(X_i) = 7.167}$ and $\small{Var(X_i) = 44.44}$.

Since 100 is a "large enough" sample size, the Central Limit Theorem indicates that $\small{\bar{X}_{100} =\frac{1}{100}\sum_{i=1}^{100}X_i \stackrel{.}{\sim}N\left(7.167, \frac{44.44}{100}\right)}$. That is $\small{\bar{X}_{100} \stackrel{.}{\sim}N\left(7.167, 0.4444\right)}$.

$\small{P(\bar{X}_{100}>7.5) \approx P\left(Z > \frac{7.5-7.167}{\sqrt{0.4444}}\right) \approx 1 - \Phi(0.5) \approx 0.309}.$

A bell-shaped normal distribution curve with a shaded area under the curve starting at approximately 7.5 and extending to the right tail. The shaded region represents the upper portion of the distribution beyond the threshold at 7.5. The x-axis ranges from about 5.5 to 8.5.
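The same tail probability can be computed numerically instead of from a normal table. This is a minimal sketch using `statistics.NormalDist` from the Python standard library (available from Python 3.8); the variable names are illustrative, and the moments are the rounded values given in the example.

```python
import math
from statistics import NormalDist

mean_x = 7.167   # E(X_i) as given in the example
var_x = 44.44    # Var(X_i) as given in the example
n = 100

# CLT approximation: the sample mean is roughly N(mean_x, var_x / n).
xbar = NormalDist(mu=mean_x, sigma=math.sqrt(var_x / n))

# Standardized threshold and upper-tail probability beyond 7.5 seconds.
z = (7.5 - xbar.mean) / xbar.stdev
p_upper = 1 - xbar.cdf(7.5)

print(f"z = {z:.3f}")
print(f"P(sample mean > 7.5) is approximately {p_upper:.3f}")
```

The result is an approximation in two senses: it uses the normal distribution in place of the exact distribution of the mean, and it uses rounded moments.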