All knowledge degenerates into probability.


Discrete Random Variables

Colorblindness is caused by a recessive gene on the X chromosome. Since men have only one X chromosome, if a man carries the colorblindness allele (gene form), he will have the trait. Women have two X chromosomes so, for a woman to be colorblind, she must inherit the allele from both parents. Suppose both parents carry one colorblindness allele. Let X (pun intended!) denote the number of colorblindness alleles their daughter inherits. X is a discrete random variable.


A color vision test image consisting of many small colored circles. The background circles are shades of orange, while green and pink circles form the number “3” in the center.

A discrete random variable maps the outcomes of a chance process to a countable set of numbers. Variables corresponding to outcomes that are counted are discrete. The values might be fractional, and there might be infinitely many of them, but between any two observable values there is a gap.



A woman can inherit 0, 1, or 2 of the alleles for colorblindness from her parents. Let X be a random variable that indicates the number of alleles a randomly chosen woman inherits from parents who are both carriers.

X is always 0, 1, or 2. There cannot be 0.5 or 1.7 inherited alleles.

X is a discrete random variable.

The Distribution of a Discrete Random Variable

Tables, graphs, and formulas can describe the distribution of a discrete random variable. These indicate the possible values of the random variable along with the associated probabilities. It is sometimes useful to state specific probabilities, and sometimes cumulative probabilities.



The tables and graphs below all contain the same information, the distribution of X, the number of alleles for colorblindness carried by a daughter of two carriers of the colorblindness allele.

In Table 1 and Graph 1, the possible values are indicated along with the probabilities of observing those values exactly. This is called the Probability Mass Function (pmf). Notice that probability is associated with the values 0, 1, and 2 only.

In Table 2 and Graph 2, the possible values of the random variable are indicated along with the probabilities of observing a value no greater than the indicated value. This is the Cumulative Distribution Function (cdf).

Two tables comparing probability functions. Table 1: The probability mass function shows values of x = 0, 1, 2 with corresponding probabilities P(X = x) of ¼, ½, and ¼. Table 2: The cumulative distribution function shows values of x = 0, 1, 2 with corresponding probabilities P(X ≤ x) of ¼, ¾, and 1.

Two side-by-side graphs. Graph 1: The Probability Mass Function shows three vertical bars at x = 0, 1, and 2 with heights of 0.25, 0.5, and 0.25, respectively. Graph 2: The Cumulative Distribution Function shows a step graph starting at 0, increasing to 0.25 at x = 0, 0.75 at x = 1, and 1 at x = 2. Both graphs display probability on the y-axis from 0 to 1.

A function that indicates the observable values of a discrete random variable and their associated probabilities is called a probability mass function. A function that indicates the observable values of a discrete random variable and associated cumulative probabilities is called the cumulative distribution function.
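As a rough check, the probabilities in Table 1 can be reproduced by enumerating what each parent passes on. The sketch below assumes (this is not stated explicitly above) that each carrier parent passes the colorblindness allele with probability 1/2, independently of the other parent. The names `pass_prob` and `pmf` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Assumption: each carrier parent passes the allele (1) or not (0),
# each with probability 1/2, independently of the other parent.
pass_prob = {0: Fraction(1, 2), 1: Fraction(1, 2)}

# Accumulate P(X = x) over all four (mother, father) combinations.
pmf = {0: Fraction(0), 1: Fraction(0), 2: Fraction(0)}
for from_mother, from_father in product(pass_prob, repeat=2):
    x = from_mother + from_father  # number of alleles the daughter inherits
    pmf[x] += pass_prob[from_mother] * pass_prob[from_father]

for x, p in pmf.items():
    print(f"P(X = {x}) = {p}")
```

Under that assumption the enumeration gives ¼, ½, and ¼ for x = 0, 1, and 2, which agrees with Table 1.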


The Probability Mass Function

The probability mass function (pmf) describes the distribution of a discrete random variable in terms of the possible outcomes and the specific associated probabilities. The pmf can be given as a table, a graph, or a formula.



Consider a raffle of 20 tickets in which 6 tickets are drawn for prizes. The 1 first-prize winner gets $\$20$, the 2 second-prize winners get $\$10$ each, and the 3 third-prize winners get $\$5$ each. Let X denote the amount of money a person could win with a raffle ticket. The table and the graph display the probability mass function of X.

Table showing the probability mass function for a discrete random variable X. The values of X are 0, 5, 10, and 20, with corresponding probabilities P(X = x) of 0.7, 0.15, 0.1, and 0.05. Bar graph showing the probability mass function for a discrete variable X with values 0, 5, 10, and 20. The vertical bars indicate probabilities of 0.7 at X = 0, 0.15 at X = 5, 0.1 at X = 10, and 0.05 at X = 20. The probabilities decrease as X increases.

$\rightarrow$ The probability mass function of a discrete random variable indicates the possible values of the random variable and the probabilities of observing those values.
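The raffle probabilities follow directly from the ticket counts. The following is a minimal sketch, assuming every ticket is equally likely to be drawn and that the person holds a single ticket. The variable names (`winners`, `raffle_pmf`) are hypothetical.

```python
from fractions import Fraction

total_tickets = 20
# Prize amount in dollars -> number of tickets that win that amount.
winners = {20: 1, 10: 2, 5: 3}

# The remaining tickets win nothing.
losing = total_tickets - sum(winners.values())
counts = {0: losing, **winners}

# P(X = x) = (tickets paying x) / (all tickets), assuming equally likely tickets.
raffle_pmf = {amount: Fraction(n, total_tickets)
              for amount, n in sorted(counts.items())}

for amount, p in raffle_pmf.items():
    print(f"P(X = {amount}) = {p}")
```

The 14 losing tickets account for the probability of 0.7 at X = 0 in the table.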


NOTATION
The probability mass function is denoted as 'P(X = x)' or just 'P(x)'.



The probability mass function for X, the number of alleles for colorblindness carried by a daughter of two carriers of the colorblindness allele, is shown in the table and the graph.

Table labeled “Table 1: The probability mass function.” It shows x values 0, 1, and 2, with corresponding probabilities P(X = x) of ¼, ½, and ¼. Bar graph labeled “Graph 1: The Probability Mass Function.” It shows three vertical bars representing probabilities for x = 0, 1, and 2. The bar heights are 0.25 at x = 0, 0.5 at x = 1, and 0.25 at x = 2, showing that the probability is highest when x = 1.

All the values that the random variable can assume are identified in the probability mass function. The probability associated with any other value is 0. The sum of the probabilities is 1.


For a discrete random variable X with possible values $\small{x_1, x_2, \ldots, x_n}$:
  1. All probabilities are between 0 and 1: $\small{0 \leq P(X=x_i) \leq 1}$.
  2. The sum of the probabilities of all outcomes is 1: $\small{\sum_{i=1}^nP(X=x_i) = 1}$.
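The two conditions above can be checked mechanically for a table of probabilities. The helper below, `is_valid_pmf`, is a hypothetical name introduced only for illustration. It uses exact fractions so that the sum can be compared with 1 without rounding error.

```python
from fractions import Fraction

def is_valid_pmf(pmf):
    """Return True if every probability lies in [0, 1] and they sum to 1."""
    probs = list(pmf.values())
    in_range = all(0 <= p <= 1 for p in probs)
    return in_range and sum(probs) == 1

# The allele example from the text.
alleles = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
# A made-up counterexample: the probabilities only sum to 3/4.
not_a_pmf = {0: Fraction(1, 2), 1: Fraction(1, 4)}

print(is_valid_pmf(alleles))
print(is_valid_pmf(not_a_pmf))
```

With floating-point probabilities, exact equality with 1 can fail because of rounding, so a tolerance check such as `math.isclose` would be a safer choice there.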


The Cumulative Distribution Function

The cumulative distribution function (cdf) describes the distribution of a discrete random variable in terms of the possible outcomes and the associated cumulative probabilities. A cumulative probability is the probability that the random variable takes on a value no larger than the corresponding value. The cdf can be given as a table, a graph, or a formula.



Consider a raffle of 20 tickets in which 6 tickets are drawn for prizes. The 1 first-prize winner gets $\$20$, the 2 second-prize winners get $\$10$ each, and the 3 third-prize winners get $\$5$ each. Let X denote the amount of money a person could win with a raffle ticket. The table and graph display the cumulative distribution function of X.

Table showing the cumulative distribution function for a random variable X. The x values are 0, 5, 10, and 20, with corresponding cumulative probabilities P(X ≤ x) of 0.7, 0.85, 0.95, and 1. Step graph showing the cumulative distribution function (CDF) of a discrete variable X. The curve starts at 0, jumps to 0.7 at X = 0, rises to 0.85 at X = 5, to 0.95 at X = 10, and reaches 1 at X = 20. The line remains flat between these points, with arrows extending left and right to indicate the function continues beyond the plotted range.

$\rightarrow$ The cumulative distribution function of a discrete random variable indicates each of the possible values of the random variable, and the probability of observing anything equal to or less than each of the values.
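A cdf table can be built from a pmf table by keeping a running total of the probabilities in increasing order of x. Here is a short sketch using the raffle probabilities. `itertools.accumulate` does the running sum, and the dictionary names are illustrative.

```python
from fractions import Fraction
from itertools import accumulate

# pmf of the raffle winnings, taken from the table in the text.
raffle_pmf = {0: Fraction(14, 20), 5: Fraction(3, 20),
              10: Fraction(2, 20), 20: Fraction(1, 20)}

# Sort the observable values, then accumulate their probabilities.
values = sorted(raffle_pmf)
raffle_cdf = dict(zip(values, accumulate(raffle_pmf[v] for v in values)))

for x, p in raffle_cdf.items():
    print(f"P(X <= {x}) = {p}")
```

The last cumulative probability is always 1, because it adds up the entire pmf.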


NOTATION
The cumulative distribution function for a discrete random variable is denoted as 'P(X ≤ x)'.



The cumulative distribution function for X, the number of alleles for colorblindness carried by a daughter of two carriers of the colorblindness allele, is shown in the table and the graph.

Table labeled “Table 2: The cumulative distribution function.” It shows values of x as 0, 1, and 2, with corresponding cumulative probabilities P(X ≤ x) of ¼, ¾, and 1. Step graph labeled “Graph 2: The Cumulative Distribution Function.” The green curve shows a step pattern: it rises to 0.25 at x = 0, to 0.75 at x = 1, and to 1 at x = 2. The graph starts near zero below x = 0 and remains flat between steps.

The cumulative distribution function is 0 for any value less than the smallest observable value (0 in this case). In other words, there is no probability that a value less than the smallest possible value would be observed.

The cdf increases only at the locations of the possible values and does not decrease. Between observable values, the cdf is constant. The random variable in this example can only assume the values 0, 1, and 2. Thus P(X ≤ 1) = P(X = 0) + P(X = 1). P(X ≤ 1.5) = P(X = 0) + P(X = 1) as well since 0 and 1 are the only observable values that are less than 1.5.

The probability that the random variable is less than or equal to 2 is 1 since 0, 1, and 2 are all less than or equal to 2. For any value x that is greater than 2, P(X ≤ x) = 1 since 0, 1, and 2 are all less than that value.

Properties of the cumulative distribution function of a discrete random variable with minimum value L and maximum value U:
  1. If x is less than L, the cdf evaluated at x is 0.
  2. If $\small{a}$ and $\small{b}$ are two values such that $\small{a < b}$ then $\small{P(X \leq a) \leq P(X \leq b)}$. (The cdf is non-decreasing.)
  3. If x is greater than or equal to U, the cdf evaluated at x is 1.
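Because the cdf is a step function, it can be evaluated at any real number, not just at the observable values. The sketch below does this for the allele example by counting how many observable values are at most x. The function name `cdf` and the use of `bisect_right` are choices made here for illustration, not something prescribed by the text.

```python
from bisect import bisect_right
from fractions import Fraction

# Observable values of X and the cumulative probabilities from Table 2.
values = [0, 1, 2]
cum_probs = [Fraction(1, 4), Fraction(3, 4), Fraction(1)]

def cdf(x):
    """Return P(X <= x) for any real number x."""
    k = bisect_right(values, x)  # how many observable values are <= x
    return cum_probs[k - 1] if k > 0 else Fraction(0)

for x in [-1, 0, 1, 1.5, 2, 3.7]:
    print(f"P(X <= {x}) = {cdf(x)}")
```

Here L = 0 and U = 2: the function returns 0 below 0, returns 1 from 2 onward, and gives the same value at 1 and at 1.5 because no observable value lies between them.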

  1. When drawing 2 balls from the jar, there can be 0 red balls, 1 red ball, or 2 red balls. The sample space for X is S = {0,1,2}. Use the tree diagram to find the probability of each value:

    Probability tree diagram showing outcomes for drawing red (R) or blue (B) balls. The first split shows R with probability 3/5 and B with probability 2/5. From R, the branches split again into R (2/4) and B (2/4), leading to final probabilities of R = 3/10 and B = 3/10. From B, the branches split into R (3/4) and B (1/4), giving final probabilities of R = 3/10 and B = 1/10. Red labels indicate R outcomes, and blue labels indicate B outcomes.

    The probability of drawing 2 red balls is 3/10, the probability of drawing 1 red ball is 3/10+3/10 = 3/5, and the probability of drawing 0 red balls is 1/10. This produces the probability mass function shown.

    Table showing the probability mass function for a discrete random variable X. The x values are 0, 1, and 2, with corresponding probabilities P(X = x) of 1/10, 6/10, and 3/10.



  2. P(X ≤ 1) = P(X = 0) + P(X = 1) = 1/10 + 3/5 = 7/10.
    Or: P(X ≤ 1) = 1 - P(X = 2) = 1 - 3/10 = 7/10.


  3. P(X ≤ 0) = P(X = 0) = 1/10.
    P(X ≤ 1) = P(X = 0) + P(X = 1) = 1/10 + 3/5 = 7/10.
    P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 1/10 + 3/5 + 3/10 = 1.

    Table showing the cumulative distribution function for a discrete random variable X. The x values are 0, 1, and 2, with corresponding cumulative probabilities P(X ≤ x) of 1/10, 7/10, and 1.

  4. From the cdf table, P(X ≤ 1) = 7/10.
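The tree-diagram calculation in these solutions can be reproduced by walking both levels of the tree in code. The sketch assumes a jar of 3 red and 2 blue balls drawn without replacement. That composition is inferred from the branch probabilities 3/5 and 2/5 in the diagram, not stated directly. The function `draw_two` is a hypothetical helper.

```python
from fractions import Fraction

# Assumed jar contents, inferred from the 3/5 and 2/5 first-draw branches.
red, blue = 3, 2

def draw_two(red, blue):
    """pmf of the number of red balls in two draws without replacement."""
    pmf = {0: Fraction(0), 1: Fraction(0), 2: Fraction(0)}
    jar = {"R": red, "B": blue}
    total = red + blue
    for first in "RB":
        p_first = Fraction(jar[first], total)
        remaining = dict(jar)
        remaining[first] -= 1  # the first ball is not put back
        for second in "RB":
            p_second = Fraction(remaining[second], total - 1)
            n_red = (first == "R") + (second == "R")
            pmf[n_red] += p_first * p_second  # multiply along the branch
    return pmf

jar_pmf = draw_two(red, blue)
p_at_most_one = jar_pmf[0] + jar_pmf[1]

print(jar_pmf)
print(p_at_most_one)
```

Each pass through the inner loop corresponds to one path through the tree, and paths with the same number of red balls are added together, which is how the two 3/10 branches combine into P(X = 1) = 3/5.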