Discrete Random Variable Distribution Families
In the board game "Chick-a-Pig," a cow sits in the center of the board and creates an obstacle for players
attempting to move across the board. On each turn, a player rolls a die to determine how many moves to make. The cow
can only be moved when someone rolls a 1. Three players recently played an entire game without ever
being able to move the cow. To win the game, a player must move 6 pigs across the board. So a conservative estimate of the number of
times the die was rolled is 18 (6 moves per person). What is the
probability of rolling a fair die 18 times without ever observing a 1?
A random variable that counts the number of 1's in 18 dice rolls is similar to one that counts the number
of heads in 100 coin tosses or the number of females in 50 births. The probability functions for these are very similar and are, in fact,
in the same family.
A family of distributions is a collection of probability mass or probability density functions that differ only with respect to
one or more parameters, or numeric quantities that describe the distribution. There are both discrete and continuous families of distributions.
Identifying the family that a distribution belongs to is helpful for understanding characteristics of the probability function.
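A family of distributions is a collection of probability mass or probability density functions that differ only in the values of one or more parameters.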
A parameter of a distribution is a numeric quantity that describes a defining aspect of the distribution.
The binomial family of distributions has two parameters: $n$, a fixed number of trials or repetitions of a random event, and $p$, the probability of observing a desired outcome on each trial. This family of distributions describes discrete random variables that count the number of successes in $n$ independent trials:
- The number of 1's in 18 rolls of a fair die ($n=18$, $p=\frac{1}{6}$)
- The number of heads in 100 coin tosses ($n=100$, $p=0.5$)
- The number of females in 50 births ($n=50$, $p\approx 0.5$)
The normal family of distributions has two parameters: $\mu$, the mean of the distribution, and $\sigma^2$, the variance. This family of distributions describes continuous random variables that have bell-shaped distributions: values near the mean are most likely, while values far from the mean are possible but less likely.
- The IQ of a randomly chosen person, $\mu=100$, $\sigma^2=225.$
- The length of human pregnancy, $\mu=266$, $\sigma^2=256.$
- The height in inches of adult women in the US, $\mu = 65$, $\sigma^2 = 9.$
If the distribution of a random variable belongs to the binomial family, the random variable has a binomial distribution. If the
distribution belongs to the normal family, the random variable has a normal distribution.
The notation 'X~' is read "X is distributed as". This is typically followed by the name of a distribution with the values of the parameters
indicated. For example, 'X ~ binomial(n, p)' is read "X has a binomial distribution with parameters n and p."
There are many other families of distributions, both discrete and continuous. Several of the more common discrete families are
introduced on this page, followed by continuous distributions on the next.
NOTATION: 'X ~ ' reads "X is distributed as"
The Bernoulli Distribution
The Bernoulli Distribution arises in a simple experimental situation
in which a single trial results in either a success (the outcome of interest is observed) or a failure (the outcome of interest is not observed).
The value of the random variable is 1 if a success is observed and 0 otherwise.
The parameter for a Bernoulli distribution is $p$, which denotes the probability of success.
Success: the outcome of interest is observed.
Failure: the outcome of interest is not observed.
Bernoulli Trial: A random event that results in a success or a failure.
A student tosses a coin. Let X=1 if the result is heads and 0 if tails. $X\sim Bernoulli\left(\frac{1}{2}\right)$.
$P(X=x)=p^x(1-p)^{(1-x)}$, for $x=0,1$
A child playing a board game rolls a die. Let X=1 if the result is a 6 and 0 if not. $X\sim Bernoulli\left(\frac{1}{6}\right)$.
- Give the probability mass function of X.
- Use the pmf to find the probability that X = 0.
- \(P(X=x) = \left(\frac{1}{6}\right)^x\left(1-\frac{1}{6}\right)^{1-x}\) for x = 0, 1.
- \(P(X=0) = \left(\frac{1}{6}\right)^0\left(1-\frac{1}{6}\right)^{1-0} = 1\left(\frac{5}{6}\right)^1 = \frac{5}{6}\)
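The pmf in this example can also be checked numerically. The following is a minimal Python sketch; the function name `bernoulli_pmf` is chosen here for illustration, and exact fractions are used so the results match the hand calculation.

```python
from fractions import Fraction

def bernoulli_pmf(x, p):
    """Return P(X = x) for X ~ Bernoulli(p); x must be 0 or 1."""
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    return p**x * (1 - p)**(1 - x)

p = Fraction(1, 6)           # probability of rolling a 6
print(bernoulli_pmf(0, p))   # 5/6
print(bernoulli_pmf(1, p))   # 1/6
```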
The probability mass function of a Bernoulli random variable is simple, but it is a building block of more complex distributions.
- Number of Trials: 1
- Minimum Value: 0
- Maximum Value: 1
- E(X) = p
- Var(X) = p(1-p)
- X = 0 for failure, X = 1 for success.
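The mean and variance in this summary can be obtained directly from the pmf, since X takes only the values 0 and 1. One way to work it out:

\[ E(X) = 0\cdot(1-p) + 1\cdot p = p \]

\[ Var(X) = E(X^2) - [E(X)]^2 = \left(0^2\cdot(1-p) + 1^2\cdot p\right) - p^2 = p - p^2 = p(1-p) \]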
The Binomial Distribution
The Binomial Distribution arises when a random process is repeated independently a fixed number of times ($n$), each trial
results in a success (1) or a failure (0), and the probability of success ($p$) is constant. A Binomial random variable counts the number
of successes in the $n$ trials and is a sum of $n$ Bernoulli Random Variables.
The parameters for a Binomial distribution are the number of trials ($n$) and the probability of success ($p$).
A trial is a single occurrence of a random process.
A Bernoulli trial is a single occurrence of a random process that results in either success or failure.
A woman tosses a coin 10 times. Let X be the number of heads in the 10 tosses. \(X\sim Binomial\left(10,\frac{1}{2}\right)\).
$P(X=x)=\binom nx p^x(1-p)^{(n-x)}$, for $x = 0, 1, 2, \ldots, n$
A child rolls a six-sided die 20 times, hoping for 6's. What is the probability that she observes ten 6's in the 20 rolls? Let X indicate the number of 6's in the 20 rolls. \(X\sim Binomial\left(20, \frac{1}{6}\right)\).
- State the probability mass function of X.
- Use the pmf to find the probability that X = 10.
- \(P(X=x) = \binom{20}{x} \left(\frac{1}{6}\right)^x\left(1-\frac{1}{6}\right)^{20-x}\) for x = 0, 1, 2, ..., 20.
- \(P(X=10) = \binom{20}{10} \left(\frac{1}{6}\right)^{10}\left(1-\frac{1}{6}\right)^{20-10} = 0.000493\)
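The same calculation can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library; `binomial_pmf` is an illustrative helper name, not a built-in function.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Return P(X = x) for X ~ Binomial(n, p), x = 0, 1, ..., n."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Probability of exactly ten 6's in 20 rolls of a fair die
prob = binomial_pmf(10, 20, 1/6)
print(round(prob, 6))   # 0.000493
```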
- Number of Trials: $n$
- Minimum Value: 0
- Maximum Value: $n$
- $E(X) = np$
- $Var(X) = np(1-p)$
- X counts successes in $n$ independent trials
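The mean and variance of a binomial random variable follow from viewing it as a sum of Bernoulli random variables. Writing $X = X_1 + X_2 + \cdots + X_n$, where the $X_i$ are independent and each $X_i \sim Bernoulli(p)$, a short derivation is:

\[ E(X) = \sum_{i=1}^{n} E(X_i) = \sum_{i=1}^{n} p = np \]

\[ Var(X) = \sum_{i=1}^{n} Var(X_i) = \sum_{i=1}^{n} p(1-p) = np(1-p) \]

The variance step relies on the independence of the trials; the mean step does not.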
In 2013, a kindergarten in Pennsylvania, USA included 10 sets of twins. The CDC reports that about 3.2% of births result in twins. Suppose the students at the kindergarten come from 90 separate births, and let X denote the number of those births that resulted in twins. ('Births' is used here rather than 'children' because the children in a twin pair are not independent.)
- Explain why X has a Binomial distribution.
- Find the probability that X = 10.
- Find the probability that X ≥ 10.
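One possible way to compute the two probabilities is sketched below, treating each of the 90 births as an independent trial with success probability 0.032, so that \(X\sim Binomial(90, 0.032)\). The helper `binomial_pmf` and the variable names are illustrative choices.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Return P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 90, 0.032   # 90 births, 3.2% chance of twins per birth

p_exactly_10 = binomial_pmf(10, n, p)
# P(X >= 10): add up the pmf over x = 10, 11, ..., 90
p_at_least_10 = sum(binomial_pmf(k, n, p) for k in range(10, n + 1))

print(f"P(X = 10)  = {p_exactly_10:.6f}")
print(f"P(X >= 10) = {p_at_least_10:.6f}")
```

Both probabilities come out well below 1%, so ten sets of twins in one kindergarten is a very unusual outcome under this model.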
In the board game "Chick-a-Pig," a cow sits in the center of the board and creates an obstacle for players attempting to move across the board. On each turn, a player rolls a die to determine how many moves to make. The cow can only be moved when someone rolls a 1. Three players recently played an entire game without ever being able to move the cow. To win the game, a player must move 6 pigs across the board. So a conservative estimate of the number of times the die was rolled is 18 (6 moves per person).
What is the probability of rolling a fair die 18 times without ever observing a 1?
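If X counts the number of 1's in the 18 rolls, then \(X\sim Binomial\left(18, \frac{1}{6}\right)\) and the question asks for \(P(X=0)\). A minimal sketch of the calculation in Python, with illustrative variable names:

```python
from math import comb

n, p = 18, 1/6   # 18 rolls, probability 1/6 of rolling a 1

# Binomial pmf at x = 0; this reduces to (5/6)**18
prob_no_ones = comb(n, 0) * p**0 * (1 - p)**n
print(round(prob_no_ones, 4))   # 0.0376
```

Under the estimate of 18 rolls, a game with no 1's happens a little less than 4% of the time.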
The Geometric Distribution
The Geometric Distribution counts the number of independent Bernoulli trials needed to obtain the first success, including the trial on which the success occurs. It has one parameter, $p$, the probability of success on each trial.
A man tosses a coin until a head occurs. Let X be the number of times the man tosses the coin. \(X\sim Geometric(0.5)\)
$P(X=x)=p(1-p)^{(x-1)}$, for $x = 1, 2, \ldots$
A teacher has a class of 10 children. She has written each of their names on a popsicle stick that she keeps in a jar. When she wants to call on someone, she randomly draws a stick from the jar and returns it to the jar when she is done. Let X indicate the number of times the teacher calls on a student until Zoe is chosen.
- Explain why X has a geometric distribution.
- Find the probability that Zoe isn't called on until the 15th draw.
- Find the probability that Zoe is called on before the 15th draw.
\(P(X\leq x ) = 1- (1-p)^{x}\) for $x = 1, 2, \ldots$
A teacher has a class of 10 children. She has written each of their names on a popsicle stick that she keeps in a jar. When she wants to call on someone, she randomly draws a stick from the jar and returns it to the jar when she is done. Let X indicate the number of times the teacher calls on a student until Zoe is chosen. Use the CDF to find the probability that Zoe is called on before the 15th draw.
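Because each draw picks Zoe with probability \(\frac{1}{10}\) and the stick is returned to the jar, \(X\sim Geometric(0.1)\). The sketch below evaluates the pmf and CDF for this problem; the function names are illustrative.

```python
def geometric_pmf(x, p):
    """P(X = x): the first success occurs on trial x, for x = 1, 2, ..."""
    return p * (1 - p)**(x - 1)

def geometric_cdf(x, p):
    """P(X <= x): the first success occurs within the first x trials."""
    return 1 - (1 - p)**x

p = 1 / 10   # one stick out of ten belongs to Zoe

p_on_15th = geometric_pmf(15, p)       # Zoe is first called on the 15th draw
p_before_15th = geometric_cdf(14, p)   # "before the 15th draw" means X <= 14

print(round(p_on_15th, 4))       # 0.0229
print(round(p_before_15th, 4))   # 0.7712
```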
- Number of Trials: variable, until the first success occurs
- Minimum Value: 1
- Maximum Value: no upper limit
- $E(X) = \frac{1}{p}$
- $Var(X)=\frac{1-p}{p^2}$
- X counts the number of trials until the first success.
The geometric distribution possesses the memoryless property. That is, the distribution of the number of additional trials needed remains the same regardless of how many failures have already occurred. For example, the probability that the first success occurs on the 3rd trial is the same as the probability that it occurs on the 8th trial given that the first 5 trials were all failures. It doesn't matter how many trials have already occurred without a success; the probability that the first success will come in exactly 3 more trials is always the same.
A random variable, X, taking positive integer values is memoryless if $P(X > a+b \mid X>a) = P(X>b)$ for all nonnegative integers $a$ and $b$.
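For a geometric random variable, this property can be verified from the CDF. Since \(P(X\leq k) = 1-(1-p)^k\), the probability of no success in the first $k$ trials is \(P(X>k) = (1-p)^k\). Because the event $X>a+b$ is contained in the event $X>a$, one way to write the check is:

\[ P(X>a+b \mid X>a) = \frac{P(X>a+b)}{P(X>a)} = \frac{(1-p)^{a+b}}{(1-p)^{a}} = (1-p)^{b} = P(X>b) \]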
The Poisson Distribution
The Poisson Distribution counts the number of times a rare event occurs in a given unit of distance, volume, or time. It has one parameter, $\lambda$, the mean number of occurrences per specified unit.
Janice has a bird feeder in her backyard. Many types of birds come at various times to eat from the feeder. A bird lands on the feeder, on average, 6 times in a given hour. Let X denote the number of times a bird lands on the feeder in an hour. $X\sim Poisson(6)$.
$P(X=x)=\frac{\lambda^x e^{-\lambda}}{x!}$, for $x = 0, 1, 2, \ldots$
Janice has a bird feeder in her backyard. Many types of birds come at various times to eat from the feeder. A bird lands on the feeder, on average, 6 times in a given hour. Let X denote the number of times a bird lands on the feeder in an hour.
- Explain why the Poisson distribution is reasonable for X.
- Find the probability that 3 birds visit the birdfeeder in an hour.
- Find the probability that fewer than 3 birds visit the birdfeeder in an hour.
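The two probabilities can be computed from the pmf with $\lambda = 6$. The following is a minimal sketch using the standard library; `poisson_pmf` is an illustrative helper name.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Return P(X = x) for X ~ Poisson(lam), x = 0, 1, 2, ..."""
    return lam**x * exp(-lam) / factorial(x)

lam = 6   # average number of bird landings per hour

p_exactly_3 = poisson_pmf(3, lam)
# "Fewer than 3" means X = 0, 1, or 2
p_fewer_than_3 = sum(poisson_pmf(k, lam) for k in range(3))

print(round(p_exactly_3, 4))      # 0.0892
print(round(p_fewer_than_3, 4))   # 0.062
```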
The Poisson distribution is easily scalable: if the mean of the distribution is λ per specified unit, then the mean is 2λ per 2 units and 0.5λ per half unit. For example, if 6 birds are expected to visit the bird feeder in 1 hour, then 12 are expected in 2 hours.
Janice has a bird feeder in her backyard. Many types of birds come at various times to eat from the feeder. A bird lands on the feeder, on average, 6 times in a given hour. Find the probability that 10 birds visit the feeder in 3 hours.
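For this problem the unit changes from 1 hour to 3 hours, so the mean is rescaled before the pmf is applied. A short sketch of that step, with illustrative variable names:

```python
from math import exp, factorial

rate_per_hour = 6
hours = 3
lam = rate_per_hour * hours   # mean number of visits in 3 hours: 18

# Poisson pmf at x = 10 with the rescaled mean
prob_10_in_3_hours = lam**10 * exp(-lam) / factorial(10)
print(round(prob_10_in_3_hours, 4))   # 0.015
```

Ten visits is well below the 18 expected in 3 hours, which is why the probability is small.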
When a liquid solution with a pathogenic bacterium has an initial cell count of approximately $10^9$ CFU/ml ("Colony-Forming Units" estimate the number of viable bacteria in a sample) and undergoes a 10-fold dilution of bacterial cultures, chemists observe an average of 1 CFU/2 μl (one colony-forming unit per two microliters) left in the solution after a 24-hour incubation period.
What is the probability that, after a pathogenic bacterium solution is put through a 10-fold dilution of bacterial cultures and then left for a 24-hour incubation period, there are 4 colony-forming units in 6 microliters of solution?
- Minimum Value: 0
- Maximum Value: no upper limit
- $E(X) = \lambda$
- $Var(X)=\lambda$
- X counts occurrences of a rare event per unit of time or space.