All knowledge degenerates into probability.


Expected Value and Variance

On a certain track team, the runners all take between 4 and 7 minutes to finish a mile. Suppose the probability density function for the length of time it takes a runner to finish a mile is $f(x)=\frac{4}{21}(x-4.5)^2, 4\leq x \leq 7$. Across all runners on the track team, what is the average time it takes to complete the mile?

A red running track with white lane markings, numbered 1, 2, and 3 at the starting line.

An advantage of using random variables to describe experimental outcomes is that they make it possible to talk about the outcomes mathematically and to address questions such as 'What is the average outcome?' and 'How much are the outcomes likely to vary?' The answers to these questions are given by the expected value and the variance of the random variable. The expected value is a measure of the center of a distribution and the variance describes the spread of possible values around the expected value.




The Expected Value of A Random Variable

The expected value of a random variable is also referred to as the 'expectation' or the 'mean' of the variable. It is an average of the possible experimental outcomes weighted by their probabilities.

→ The expected value of random variable $X$ is the mean of the distribution of $X$.

NOTATION: The expected value of random variable X is denoted $E(X)$, $\mu_x$, or just $\mu$.

Expected Value of a Discrete Random Variable

To find the expected value of a discrete random variable, sum the possible values weighted by their probabilities.



$E(X) = \sum_{i=1}^np_ix_i$



At the HuHot Mongolian Grill, there are 1-flame, 2-flame, 3-flame, all the way up to 7-flame sauces to choose from (the number of flames indicates how hot the sauce is). When customers come through the line, 17% of them choose a 1-flame sauce for their dish, 14% choose a 2-flame, 26% choose a 3-flame, 19% choose a 4-flame, 10% choose a 5-flame, 8% choose a 6-flame, and 6% choose a 7-flame. When a randomly selected HuHot customer's dish is selected, what is the expected value for the flame rating of the sauce on that dish?

Let $X$ denote the number of flames in the sauce's rating. The probability mass function for $X$ is given in the table.

$\begin{array}{c|ccccccc} x & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ \hline P(X=x) & 0.17 & 0.14 & 0.26 & 0.19 & 0.10 & 0.08 & 0.06 \end{array}$

The expected value sums the possible outcomes of the random variable weighted by (multiplied by) their probabilities.

$\small{\begin{array}{rcl} E(X) &=& \sum_{i=1}^np_ix_i\\ &=& 0.17(1)+0.14(2)+0.26(3)+0.19(4)+0.1(5)+0.08(6)+0.06(7) \\ &=& 3.39\end{array}}$.

The expected value for the number of flames in the sauce rating of a randomly selected HuHot customer's dish is 3.39.
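The weighted sum above can be checked with a few lines of Python. This is a minimal sketch; the names `pmf` and `expected_value` are illustrative choices, not part of the original example.

```python
# Probability mass function from the table: flame rating -> probability.
pmf = {1: 0.17, 2: 0.14, 3: 0.26, 4: 0.19, 5: 0.10, 6: 0.08, 7: 0.06}

def expected_value(pmf):
    """Sum of each outcome weighted by its probability."""
    return sum(x * p for x, p in pmf.items())

total_probability = sum(pmf.values())  # a valid pmf sums to 1
mean_flames = expected_value(pmf)
print(round(total_probability, 10), round(mean_flames, 2))  # 1.0 3.39
```

Checking that the probabilities sum to 1 first is a cheap guard against a mistyped table.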


A 5-prompt multiple-choice quiz has four possible solutions for each prompt, one of which is correct. If a student randomly selects answers, their probability of choosing correctly is 0.25 for each prompt.

Let $X$ denote the number of prompts a randomly guessing student gets correct out of the 5. The probability distribution of $X$ is given by the formula $\small{P(X = x) = \binom 5x 0.25^x(1-0.25)^{5-x}}$. (This is a binomial(5, 0.25) distribution.)

What is the expected quiz score if a student randomly guesses the answer on every question?

This is a discrete distribution, so to find the expected value, sum the possible outcomes multiplied by their probabilities.

The score can be 0, 1, 2, 3, 4, or 5. The probabilities associated with these can be computed from the above formula.

$E(X) = \sum_{i=1}^np_ix_i$, where $n=6$ and $x_1=0, x_2=1, \ldots, x_6=5$.

$\small{\begin{array}{rcl}E(X) &=& \binom 50 0.25^0(1-0.25)^{(5-0)} \times 0\\ &+& \binom 51 0.25^1(1-0.25)^{(5-1)} \times 1\\ &+& \binom 52 0.25^2(1-0.25)^{(5-2)} \times 2\\ &+& \binom 53 0.25^3(1-0.25)^{(5-3)} \times 3\\ &+& \binom 54 0.25^4(1-0.25)^{(5-4)} \times 4\\ &+& \binom 55 0.25^5(1-0.25)^{(5-5)} \times 5\\ &=& 1.25\end{array}}$

The expected score for a student who guesses on all prompts is 1.25 out of 5 points.
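The six-term sum is tedious by hand but short in code. The sketch below builds the binomial probabilities with `math.comb` and sums them; the function name `binomial_pmf` is an illustrative choice.

```python
from math import comb

n, p = 5, 0.25  # number of prompts, chance of guessing one correctly

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial(n, p) random variable."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Weight each possible score 0..5 by its probability and add them up.
expected_score = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
print(expected_score)
```

The result agrees with the 1.25 found above.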

Notice that the expected values in the examples are not possible outcomes of the random variable. In the first example, flame ratings are whole numbers 1-7 but the expected value is 3.39. In the second example, the random variable counts whole points on the quiz but the expected value is 1.25. The expected value can be thought of as a long-term mean. That is, if the experiment were repeated a very large number of times, the mean of the observed outcomes would approach the expected value.
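The long-term-mean interpretation can be illustrated by simulation. The sketch below draws flame ratings at random with the stated probabilities and averages them. The seed, trial counts, and function name are arbitrary choices for illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable
ratings = [1, 2, 3, 4, 5, 6, 7]
probs = [0.17, 0.14, 0.26, 0.19, 0.10, 0.08, 0.06]

def simulated_mean(num_trials):
    """Average flame rating over num_trials simulated customers."""
    draws = random.choices(ratings, weights=probs, k=num_trials)
    return sum(draws) / num_trials

for num_trials in (10, 1_000, 100_000):
    print(num_trials, simulated_mean(num_trials))

long_run = simulated_mean(100_000)
```

With only 10 customers the average can land well away from 3.39. With 100,000 it typically settles within a few hundredths of it, even though no single customer ever has a rating of 3.39.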


Expected Value of a Continuous Random Variable

It is not possible to sum the probability-weighted values of a continuous random variable since there are infinitely many of them. The analogous procedure is to integrate across the support of the continuous random variable, using the probability density function to weight the outcomes.



$E(X) = \int_l^uxf(x)dx$




"Time headway" in traffic flow is the elapsed time between when one car finishes passing a fixed point and the instant that the next car begins to pass that point. Let $X$ denote the time headway, in seconds, for a randomly chosen pair of consecutive cars on a freeway. $f(x)=.15e^{-.15(x-.5)}$, for $x \geq 0.5$, and $f(x)=0$ otherwise. What is the expected value for seconds of time headway between any two consecutive cars on a freeway?


Time headway is a continuous random variable.

Use the formula $E(X) = \int_l^uxf(x)dx$ with $l=0.5$ and $u=\infty$ to find the expected value.

$\small{\begin{array}{lcl}E(X) &=& \int_{.5}^{\infty}x(.15e^{-.15(x-.5)})dx \\ &=& 0.15e^{0.075}\int_{0.5}^{\infty}xe^{-.15x}dx \\ &=& 0.15e^{0.075}\left[e^{-0.15x}\left(-6\frac 23x-44\frac 49\right)\right]_{.5}^{\infty}\\ &=& 0.15e^{0.075}\left[\lim_{x\to\infty}e^{-0.15x}\left(-6\frac 23x-44\frac 49\right)-e^{-.15(.5)}\left(-6\frac 23(.5)-44\frac 49\right)\right]\\ &=& 0.1617\left[0-0.9277\left(-3\frac 13-44\frac 49\right)\right] = 0.1617(-0.9277(-47.7778))\\ &=& 7.1671\end{array}}$.

The expected seconds of time headway between any two consecutive cars on a freeway is 7.1671 seconds.
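The integral can also be approximated numerically as a sanity check. The sketch below uses a simple midpoint rule. The helper `integrate` and the cutoff `UPPER`, which stands in for the infinite upper limit, are assumptions made for this illustration.

```python
from math import exp

def f(x):
    """Density of the time headway; zero below 0.5 seconds."""
    return 0.15 * exp(-0.15 * (x - 0.5)) if x >= 0.5 else 0.0

def integrate(g, lower, upper, steps=200_000):
    """Midpoint-rule approximation of the integral of g on [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(g(lower + (i + 0.5) * width) for i in range(steps))

UPPER = 200.5  # assumed cutoff; the density beyond it is negligibly small
mean_headway = integrate(lambda x: x * f(x), 0.5, UPPER)
print(round(mean_headway, 4))  # 7.1667
```

The numerical value is about 7.1667 seconds. The small gap from the 7.1671 in the worked solution comes from rounding the intermediate constants to four decimal places.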


Toucans are most likely to be found near the equator, but can be found in other locations across the world. Let $X$ be a random variable indicating the latitude at which a toucan is spotted. Latitude $x=0$, the equator, is the most likely. A latitude of $x=-90$ denotes a toucan at the South Pole, and $x=90$ denotes a toucan at the North Pole.

Suppose the probability density function for $X$ is \[\small{ f(x) = \begin{cases} \frac{1}{75}\left(\frac{x}{90}+1\right) & \text{for } -90\leq x\leq 0\\ \frac{2}{225}\left(1-\frac{x}{90}\right) & \text{for } 0< x\leq 90\\ 0 & \text{otherwise } \end{cases}} \] What is the expected latitude of a random toucan spotting?


$X$, the latitude of a random toucan spotting, is a continuous random variable. The probability density function is a piecewise function. To find the expected value, treat each of the function pieces separately and sum the results.

$\begin{array}{lcl} E(X) &=&\int_{-90}^{0}x\frac{1}{75}(\frac{x}{90}+1)dx + \int_{0}^{90}x(\frac{2}{225})(1-\frac{x}{90})dx\\ &=&\int_{-90}^{0}\frac{x^2}{6750}+\frac{x}{75}dx + \int_{0}^{90}\frac{2x}{225}-\frac{2x^2}{20250}dx \\ &=&[\frac{x^3}{20250}+\frac{x^2}{150}]_{-90}^{0}+[\frac{x^2}{225}-\frac{x^3}{30375}]_{0}^{90} \\ &=&[(\frac{0^3}{20250}+\frac{0^2}{150})-(\frac{(-90)^3}{20250}+\frac{(-90)^2}{150})] +[(\frac{90^2}{225}-\frac{90^3}{30375})-(\frac{0^2}{225}-\frac{0^3}{30375})] \\ &=&[(0+0)-(-36+54)]+[(36-24)-(0-0)]\\ &=& -6\end{array}$

Then the expected latitude of a random toucan spotting is $-6$ degrees, that is, 6 degrees south of the equator.
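The "integrate each piece, then add" approach looks like this in code. It is a sketch using a midpoint rule; the helper name `integrate` and the step count are illustrative assumptions.

```python
def f(x):
    """Piecewise density for the latitude of a toucan sighting."""
    if -90 <= x <= 0:
        return (1 / 75) * (x / 90 + 1)
    if 0 < x <= 90:
        return (2 / 225) * (1 - x / 90)
    return 0.0

def integrate(g, lower, upper, steps=90_000):
    """Midpoint-rule approximation of the integral of g on [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(g(lower + (i + 0.5) * width) for i in range(steps))

south = integrate(lambda x: x * f(x), -90, 0)  # southern-hemisphere piece
north = integrate(lambda x: x * f(x), 0, 90)   # northern-hemisphere piece
mean_latitude = south + north
print(round(south, 4), round(north, 4), round(mean_latitude, 4))
```

The two pieces contribute about -18 and 12, matching the bracketed terms in the worked solution, and their sum is -6.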

The Variance of A Random Variable

As shown above, the expected score on the 5-prompt multiple-choice quiz is 1.25, but a student won't score 1.25 every time (or ever), and many toucans will be spotted at latitudes other than -6 degrees. The variance of a random variable describes the spread of the outcomes of the random variable.

→ The variance of a random variable $X$ indicates how much the possible outcomes of the random variable vary around the expected value (mean).


NOTATION: The variance of random variable $X$ is denoted $Var(X)$, $V(X)$, $\sigma^2_X$ or just $\sigma^2$.

The formula for computing variance of a random variable is the same whether the random variable is discrete or continuous. However, as with the expected value, the details of the process depend on the variable type.

Variance of a Random Variable
\(Var(X) = E[(X - E(X))^2] \)


In practice, we generally use the equivalent formula $Var(X) = E(X^2) - [E(X)]^2$, which is simpler to work with.

Variance of a Random Variable
\(Var(X) = E(X^2) - [E(X)]^2 \)



The two formulas are equivalent:

$\small{\begin{array}{lcl} E[(X-E(X))^2] &=& E[X^2-2XE(X) + [E(X)]^2] \\ &=& E(X^2)-2E(X)E(X) + [E(X)]^2 \\ &=& E(X^2)-2[E(X)]^2 + [E(X)]^2 \\ &=& E(X^2)-[E(X)]^2 \end{array}}$
  • Expand the squared binomial.
  • Take the expected value of each term. $E(X)$ is a constant, so $E(2XE(X)) = 2E(X)E(X)$ and $E([E(X)]^2) = [E(X)]^2$.
  • Write $E(X)E(X)$ as $[E(X)]^2$.
  • Simplify to obtain the result.
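The equivalence can also be confirmed on a concrete distribution. The sketch below uses the quiz example's binomial(5, 0.25) distribution with exact fractions, so the two formulas can be compared without floating-point rounding. The variable names are illustrative.

```python
from fractions import Fraction
from math import comb

n, p = 5, Fraction(1, 4)
# Exact probabilities P(X = x) for x = 0, ..., 5.
pmf = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

mean = sum(x * prob for x, prob in pmf.items())

# Definition: mean squared deviation from the expected value.
var_definition = sum((x - mean)**2 * prob for x, prob in pmf.items())

# Shortcut: E(X^2) minus the square of E(X).
var_shortcut = sum(x**2 * prob for x, prob in pmf.items()) - mean**2

print(mean, var_definition, var_shortcut)  # 5/4 15/16 15/16
```

Both routes give exactly 15/16 = 0.9375.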


Variance of a Discrete Random Variable

To find the variance of a random variable, compute the terms $E(X^2)$ and $[E(X)]^2$. $E(X^2)$ is a weighted mean of the squared values; it is computed like the expected value but the values are squared before being weighted and aggregated. The second term is the expected value squared.



$E(X^2) = \sum_{i=1}^np_ix_i^2$


The units of the variance are the square units of the context. This can be difficult to interpret (e.g. squared minutes or squared dollars). Thus, while many formulas work with variance, it is common to use the square root of the variance, the standard deviation, for interpretation. The units of the standard deviation are the same as the units of the context.

→ The standard deviation, \(\sigma\), of a random variable $X$ is the positive square root of the variance.



At the HuHot Mongolian Grill, there are 1-flame, 2-flame, 3-flame, all the way up to 7-flame sauces to choose from (the number of flames indicates how hot the sauce is). Suppose that when customers come through the line, 17% of them choose a 1-flame sauce for their dish, 14% choose a 2-flame, 26% choose a 3-flame, 19% choose a 4-flame, 10% choose a 5-flame, 8% choose a 6-flame, and 6% choose a 7-flame. What is the variance of the flame rating of the sauce on a randomly selected HuHot customer's dish?


Recall the probability mass function for the random variable describing flame rating:

$\begin{array}{c|ccccccc} x & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ \hline P(X=x) & 0.17 & 0.14 & 0.26 & 0.19 & 0.10 & 0.08 & 0.06 \end{array}$

Since $Var(X)=E(X^2)-[E(X)]^2$, finding the variance requires both $E(X)$ and $E(X^2)$.

From a previous example, $E(X)=3.39$.

$\begin{array}{lcl}E(X^2) &=& \sum_{i=1}^np_ix_i^2 \\ &=& 0.17(1)^2+0.14(2)^2+0.26(3)^2+0.19(4)^2+0.1(5)^2+0.08(6)^2+0.06(7)^2\\ &=& 14.43\end{array}$

$\begin{array}{lcl}Var(X)&=&E(X^2)-[E(X)]^2\\ &=& 14.43-(3.39)^2 \\ &=& 2.938\end{array}$

The variance for the flame rating $(\sigma^2)$ of a randomly selected HuHot dish is 2.938 "square flames".

The standard deviation is $\sigma=\sqrt{2.938}=1.714$.

The expected flame rating of a HuHot dish is 3.39, give or take 1.714 flames.
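The same three quantities can be reproduced in a short script. This is a minimal sketch with illustrative variable names.

```python
from math import sqrt

# Flame rating -> probability, copied from the table.
pmf = {1: 0.17, 2: 0.14, 3: 0.26, 4: 0.19, 5: 0.10, 6: 0.08, 7: 0.06}

mean = sum(x * p for x, p in pmf.items())
mean_of_squares = sum(x**2 * p for x, p in pmf.items())  # E(X^2)
variance = mean_of_squares - mean**2
std_dev = sqrt(variance)

print(round(mean_of_squares, 2), round(variance, 3), round(std_dev, 3))
# 14.43 2.938 1.714
```

Note that the values are squared before they are weighted. Squaring the mean instead gives the second term of the formula, not the first.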


A 5-prompt multiple-choice quiz has four possible solutions for each prompt, one of which is correct. If a student randomly selects answers, their probability of choosing correctly is 0.25 for each prompt. The probability that a student gets all 5 prompts correct is 0.0977%. Let $X$ denote the number of prompts a randomly guessing student gets correct out of the 5. The probability distribution of $X$ is given by the formula $\small{P(X = x) = \binom 5x 0.25^x(1-0.25)^{5-x}}$. What are the variance and standard deviation for the score?


Since $Var(X)=E(X^2)-[E(X)]^2$, finding the variance requires both $E(X)$ and $E(X^2)$.

From a previous example, $E(X)=1.25$.

$\begin{array}{lcl}E(X^2) &=& \sum_{i=1}^np_ix_i^2 \\ &=& \binom 50 0.25^0(1-0.25)^{(5-0)} \times 0^2\\ &+& \binom 51 0.25^1(1-0.25)^{(5-1)} \times 1^2\\ &+& \binom 52 0.25^2(1-0.25)^{(5-2)} \times 2^2\\ &+& \binom 53 0.25^3(1-0.25)^{(5-3)} \times 3^2\\ &+& \binom 54 0.25^4(1-0.25)^{(5-4)} \times 4^2\\ &+& \binom 55 0.25^5(1-0.25)^{(5-5)} \times 5^2\\ &=& 2.5\end{array}$.

$Var(X) = 2.5-(1.25)^2=0.9375$ points$^2$.

The standard deviation of $X$ is $\sigma=\sqrt{0.9375}=0.9682$ points.

The expected score is 1.25 points, give or take 0.9682 points.
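As a cross-check, a binomial($n$, $p$) random variable has the standard closed forms $E(X)=np$ and $Var(X)=np(1-p)$. These formulas are not derived in this section, so they are used here only as a known result for verification:

\[ E(X) = np = 5(0.25) = 1.25, \qquad Var(X) = np(1-p) = 5(0.25)(0.75) = 0.9375 \]

Both values match the sums computed term by term.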

Variance of a Continuous Random Variable

As in the discrete case, the variance of a continuous random variable is computed with this formula: $Var(X) = E(X^2) - [E(X)]^2$. The second term is the square of the expected value. The computations for the first term, $E(X^2)$, are similar to those for computing the expected value, but the $x$ term is squared before being multiplied by the density function.

For a continuous random variable, X, with pdf $f(x)$ on $l\leq x \leq u$
\(E(X^2) = \int_l^u x^2f(x)dx \).


In general, \(E(g(X)) = \int_l^ug(x)f(x)dx\).
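The general formula translates directly into a reusable helper. The sketch below approximates the integral with a midpoint rule. The function name `expectation` and the uniform density on $[0, 1]$ used to exercise it are hypothetical choices for illustration, not taken from the examples in this section.

```python
def expectation(g, f, lower, upper, steps=100_000):
    """Midpoint-rule approximation of the integral of g(x) * f(x)."""
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width
        total += g(x) * f(x)
    return total * width

# Hypothetical density for illustration: uniform on [0, 1].
def uniform(x):
    return 1.0

mean = expectation(lambda x: x, uniform, 0.0, 1.0)        # E(X)
mean_sq = expectation(lambda x: x**2, uniform, 0.0, 1.0)  # E(X^2)
variance = mean_sq - mean**2
print(round(mean, 4), round(mean_sq, 4), round(variance, 4))
```

For this uniform density the exact values are $E(X)=1/2$, $E(X^2)=1/3$ and $Var(X)=1/12$. Passing a different `g` is all it takes to get any other expectation of the form $E(g(X))$.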


"Time headway" in traffic flow is the elapsed time between when one car finishes passing a fixed point and the instant that the next car begins to pass that point. Let $X$ denote the time headway, in seconds, for a randomly chosen pair of consecutive cars on a freeway during a period of heavy flow. Then $f(x)=.15e^{-.15(x-.5)}$, for $x \geq 0.5$, and $f(x)=0$ otherwise. What are the variance and standard deviation of the time headway between any two consecutive cars on a freeway with heavy flow?


$Var(X)=E(X^2)-[E(X)]^2$.

From a previous example, $E(X)=7.1671$ seconds.

$\begin{array}{lcl}E(X^2) &=& \int_l^ux^2f(x)dx\\ &=& \int_{.5}^{\infty}x^2(.15e^{-.15(x-.5)})dx\\ &=& .15e^{.075}\int_{.5}^{\infty}x^2e^{-.15x}dx\\ &=& .15e^{.075}\left(\left[\frac{-20}{3}x^2e^{-0.15x}\right]_{0.5}^{\infty}+\frac{40}{3}\int_{0.5}^{\infty}xe^{-0.15x}dx\right)\\ &=& .15e^{.075}\left(\left[\frac{-20}{3}x^2e^{-0.15x}\right]_{0.5}^{\infty}+\frac{40}{3}\left(\left[\frac{-20}{3}xe^{-0.15x}\right]_{0.5}^{\infty}+\frac{20}{3}\int_{0.5}^{\infty}e^{-0.15x}dx\right)\right)\\ &=& .1617\left[\frac{-20}{3}x^2e^{-0.15x}-\frac{800}{9}xe^{-0.15x}-\frac{16000}{27}e^{-0.15x}\right]_{0.5}^{\infty}\\ &=& .1617\left(\frac{-20}{3}\right)\left[e^{-0.15x}\left(x^2+\frac{40}{3}x+\frac{800}{9}\right)\right]_{0.5}^{\infty}\\ &=& -1.0779\left[\lim_{x\to\infty}e^{-0.15x}\left(x^2+\frac{40}{3}x+\frac{800}{9}\right)-e^{-0.15(.5)}\left(.5^2+\frac{40(.5)}{3}+\frac{800}{9}\right)\right]\\ &=& -1.0779\left[0-0.9277\left(\frac 14+\frac{20}{3}+\frac{800}{9}\right)\right]\\ &=& -1.0779(-0.9277(95.8056))\\ &\approx& 95.8056\end{array}$ (since $1.0779 \times 0.9277 \approx 1$).

$Var(X)=\sigma^2=95.8056-(7.1671)^2=44.44$ seconds$^2$.

The standard deviation is $\sigma=\sqrt{44.44}=6.6663$ seconds.

The expected time headway between any two consecutive cars on a freeway with heavy flow is 7.1671 seconds, give or take 6.6663 seconds.
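Two rounds of integration by parts leave plenty of room for slips, so a numerical check is worthwhile. The sketch below approximates $E(X)$ and $E(X^2)$ with a midpoint rule. The helper `moment` and its cutoff of 200.5 seconds, which stands in for the infinite upper limit, are assumptions made for this illustration.

```python
from math import exp, sqrt

def f(x):
    """Time-headway density; zero below 0.5 seconds."""
    return 0.15 * exp(-0.15 * (x - 0.5)) if x >= 0.5 else 0.0

def moment(k, upper=200.5, steps=200_000):
    """Midpoint-rule approximation of E(X**k) on [0.5, upper]."""
    width = (upper - 0.5) / steps
    total = 0.0
    for i in range(steps):
        x = 0.5 + (i + 0.5) * width
        total += x**k * f(x)
    return total * width

mean = moment(1)
second_moment = moment(2)  # E(X^2)
variance = second_moment - mean**2
std_dev = sqrt(variance)
print(round(second_moment, 2), round(variance, 2), round(std_dev, 2))
# 95.81 44.44 6.67
```

The results agree with the worked values to about three significant figures. The remaining differences come from rounding intermediate constants in the hand calculation.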


Toucans are most likely to be found near the equator, but can be found in other locations across the world. Let $X$ be a random variable indicating the latitude at which a toucan is spotted. Latitude $x=0$, the equator, is the most likely. A latitude of $x=-90$ denotes a toucan at the South Pole, and $x=90$ denotes a toucan at the North Pole.

Suppose the probability density function for $X$ is \[ f(x) = \begin{cases} \frac{1}{75}\left(\frac{x}{90}+1\right) & \text{for } -90\leq x\leq 0\\ \frac{2}{225}\left(1-\frac{x}{90}\right) & \text{for } 0< x\leq 90\\ 0 & \text{otherwise } \end{cases} \] What are the variance and standard deviation of the latitude of a random toucan spotting?


$X$, the latitude of a random toucan spotting, is a continuous random variable. The probability density function is a piecewise function. To find $E(X^2)$, treat each of the function pieces separately and sum the results.

$Var(X)=E(X^2)-[E(X)]^2$.

From a previous example, $E(X)=-6$ degrees.

$\begin{array}{lcl}E(X^2) &=& \int_l^ux^2f(x)dx\\ &=&\int_{-90}^{0}x^2\left(\frac{1}{75}\right)\left(\frac{x}{90}+1\right)dx + \int_{0}^{90}x^2\left(\frac{2}{225}\right)\left(1-\frac{x}{90}\right)dx\\ &=& \int_{-90}^{0}\frac{x^3}{6750}+\frac{x^2}{75}dx + \int_{0}^{90}\frac{2x^2}{225}-\frac{2x^3}{20250}dx\\ &=& \left[\frac{x^4}{27000}+\frac{x^3}{225}\right]_{-90}^{0}+\left[\frac{2x^3}{675}-\frac{x^4}{40500}\right]_{0}^{90}\\ &=& \left[\left(\frac{0^4}{27000}+\frac{0^3}{225}\right)-\left(\frac{(-90)^4}{27000}+\frac{(-90)^3}{225}\right)\right] +\left[\left(\frac{2(90)^3}{675}-\frac{(90)^4}{40500}\right)-\left(\frac{2(0)^3}{675}-\frac{0^4}{40500}\right)\right]\\ &=& (0+0)-(2430-3240)+(2160-1620)-(0-0)\\ &=& 1350\end{array}$

$Var(X)=\sigma^2=1350-(-6)^2=1314$ degrees$^2$.

The standard deviation of $X$ is $\sigma=\sqrt{1314}=36.25$ degrees.

The expected latitude of a random toucan spotting is at -6 degrees, give or take 36.25 degrees.
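Because each piece of this density is a polynomial, the integrals can be evaluated exactly rather than approximated. The sketch below stores each piece as its coefficients and integrates $x^k f(x)$ term by term with exact fractions. The `pieces` layout and the helper `moment` are illustrative assumptions.

```python
from fractions import Fraction
from math import sqrt

# Each piece: (lower, upper, [c0, c1]) meaning f(x) = c0 + c1*x on that interval.
pieces = [
    (-90, 0, [Fraction(1, 75), Fraction(1, 75 * 90)]),
    (0, 90, [Fraction(2, 225), Fraction(-2, 225 * 90)]),
]

def moment(k):
    """Exact integral of x**k * f(x) over all pieces."""
    total = Fraction(0)
    for lower, upper, coeffs in pieces:
        for power, c in enumerate(coeffs):
            e = power + k + 1  # exponent after integrating x**(power + k)
            total += c * (Fraction(upper)**e - Fraction(lower)**e) / e
    return total

mean = moment(1)
variance = moment(2) - mean**2
std_dev = sqrt(variance)
print(moment(0), mean, moment(2), variance, round(std_dev, 2))
# 1 -6 1350 1314 36.25
```

Here `moment(0)` confirms that the density integrates to 1, and the remaining values reproduce the hand calculation exactly.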

Interpretation of the Variance

The variance is the mean of the squared distances of the possible values of a random variable from their mean.
  • If one observation is sampled at random from a distribution with high variance, there is a high probability that the observation will be far from the mean.
  • If one observation is sampled at random from a distribution with low variance, there is a small probability that the observation will be far from the mean.


Consider two gambling games. For both, the expected value of winnings (expected winnings) is -10 cents. If a gambler played either game many times, they would lose about 10 cents per game, on average.

The variance of the winnings for Game 1 is 5,500 dollars$^2$, and the variance of the winnings for Game 2 is 700 dollars$^2$.

Which game should you choose to play if your paramount strategy is...
  • to try to win the largest amount possible?
  • to risk losing as little as possible?
If your strategy is to win the largest amount possible, you should play Game 1. Game 1 has a much higher variance, so winnings far above the expected value are much more probable. It also means risking bigger losses, but it maximizes your chance of a big payout.

If your strategy is to risk losing as little as possible, you should play Game 2. Game 2 has a much lower variance, so when your winnings fall below the expected value they are more likely to stay close to it. You probably can't expect as big a payout from Game 2, but your losses will stay much closer to 10 cents per attempt.


On a certain track team, the runners all take between 4 and 7 minutes to finish a mile. The probability density function for the length of time it takes a runner to finish a mile is $f(x)=\frac{4}{21}(x-4.5)^2, 4\leq x \leq 7$.

Find the expected value and variance of the time it takes a randomly selected member of the team to run a mile.


$\begin{array}{lcl} E(X)&=& \int_4^7 \frac{4}{21}x(x-4.5)^2 dx\\ &=& 6.357 \end{array}$ minutes.

$\begin{array}{lcl} E(X^2)&=& \int_4^7 \frac{4}{21}x^2(x-4.5)^2 dx\\ &=& 40.686 \end{array}$.

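$\begin{array}{lcl} Var(X)&=& E(X^2) - (E(X))^2\\ &=& 40.6857 - 6.35714^2 \\ &\approx& 0.272\end{array}$ minutes$^2$.

Carrying extra decimal places matters here: subtracting the three-decimal values, $40.686 - 6.357^2$, gives 0.275 instead.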
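The integrals for this problem are stated without intermediate steps, so a numerical check is a useful safeguard. The sketch below uses a midpoint rule on $[4, 7]$; the helper `integrate` and the step count are illustrative assumptions.

```python
def f(x):
    """Density of mile times (minutes) for the track team."""
    return (4 / 21) * (x - 4.5) ** 2 if 4 <= x <= 7 else 0.0

def integrate(g, lower, upper, steps=30_000):
    """Midpoint-rule approximation of the integral of g on [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(g(lower + (i + 0.5) * width) for i in range(steps))

total = integrate(f, 4, 7)                         # should be 1
mean = integrate(lambda x: x * f(x), 4, 7)         # E(X)
mean_sq = integrate(lambda x: x**2 * f(x), 4, 7)   # E(X^2)
variance = mean_sq - mean**2
print(round(mean, 3), round(mean_sq, 3), round(variance, 4))
# 6.357 40.686 0.2724
```

Because the variance here is a small difference between two much larger numbers, its third decimal place is sensitive to how $E(X)$ and $E(X^2)$ are rounded before subtracting.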