In God we trust, all others bring data.

Measure of Center

According to DQYDJ, the mean net worth (assets - liabilities) of an American household in 2020 was $746,821. However, the median net worth of US households was only $121,411.

The mean and the median are both 'measures of center'; that is, they are indicators of the middle of a dataset. Why is there such a big difference between them?

A scattered pile of one dollar bills.

A measure of center indicates the location of the 'middle' of a dataset. Common measures of center are the mean, the median, and the mode. The term 'average' was historically used to indicate any measure of the center of the dataset. In other words, 'average' could refer to the mean, the median, or the mode of the data or even to something less specific. Currently, 'average' is commonly used interchangeably with 'mean'.

A measure of center describes the middle of a dataset.

The mean of a list is found by adding the numbers in the list and dividing by the size of the list. Equivalently, the mean can be found by summing the distinct values in the list, each weighted by its proportion of occurrences. Extreme values in the list (very big or very small relative to the rest of the data) pull the mean toward them.
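
As a quick check that the two descriptions of the mean agree, the sketch below computes it both ways in Python. The list `values` is made-up illustration data, not taken from the net worth dataset.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical data chosen only for illustration.
values = [2, 2, 3, 5, 8]

# Method 1: add every number and divide by the size of the list.
mean_direct = Fraction(sum(values), len(values))

# Method 2: sum each distinct value weighted by its proportion of occurrences.
counts = Counter(values)
mean_weighted = sum(
    value * Fraction(count, len(values)) for value, count in counts.items()
)

print(mean_direct, mean_weighted)  # both equal 4
```

Exact fractions are used here only so that the two results compare equal without floating-point rounding.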

The median of a list is the middle number, that is, half of the values in the list are smaller than the median and half are larger. Extreme values in the list do not affect the median.

If the distribution of the data is symmetric, the mean and the median are the same.

The mean and median are equal when the data distribution is symmetric.

The histogram, based on percentiles obtained from DQYDJ, shows the approximate distribution of the net worth of US households.

A histogram titled “US Household Net Worth.” The x-axis is labeled “Net Worth, in Millions” and ranges from 0 to 12 million dollars. The y-axis shows the frequency of households. Most of the distribution is concentrated at very low net worth values near zero, represented by a tall turquoise bar on the left side of the graph, with the bars rapidly declining as net worth increases. The mean net worth (star) in 2020 was $\small{\$}$746,821 but the median (triangle) was much lower, only $\small{\$}$121,411 (though the distance may look small on the graph, keep in mind that it is more than half a million dollars!)

Most households have a net worth relatively close to zero (many are even negative, indicating that the households owe more than they own). However, some Americans have so much money that the weight their net worths contribute to the mean pulls it up dramatically. In fact, to include the richest man in the world in 2020, Jeff Bezos, on the graph with his net worth of around $\small{\$}$172 billion, the graph would need to be more than 14 THOUSAND times as long as the one shown! If the current axis is about 4 inches long, it would need to be extended nearly a mile to include the net worth of the richest Americans.
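
The arithmetic behind that comparison can be sketched as follows. The 12 million dollar axis limit and the 172 billion dollar figure come from the text above; the 4-inch axis length is the same rough assumption the text makes.

```python
axis_max_dollars = 12_000_000        # right end of the histogram's x-axis
bezos_net_worth = 172_000_000_000    # approximate 2020 figure quoted above
axis_length_inches = 4               # assumed length of the axis as displayed

# How many times longer the axis must be to reach Bezos' net worth.
scale_factor = bezos_net_worth / axis_max_dollars

# Convert the stretched axis from inches to feet to miles.
extended_length_miles = scale_factor * axis_length_inches / 12 / 5280

print(round(scale_factor))              # 14333
print(round(extended_length_miles, 2))  # 0.9
```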


The applet has two tabs for learning about the mean and the median.

Mean

The mean is the most commonly used measure of center and is generally what is meant when people use the term 'average'. The mean is found by summing the values in a list and dividing by the number of values. In other words, if a list contains $n$ numbers $X_1, X_2, \ldots, X_n$, the mean $\bar{X}$ is their sum divided by $n$.


$\bar{X} = \frac{X_1+X_2+\cdots + X_n}{n}$


A mean is sometimes interpreted in terms of a histogram. That is, the mean of a list is the 'balance point' of a histogram of those same data. To see this, use the applet above to create a histogram and then choose to show the mean. The histogram would appear to balance at the mean.

The mean can also be thought of as the "fair share" value. That is, if the net worth of all US households were aggregated and then divided equally among all US households, each household would end up with a net worth equal to the mean, about $746,821 in 2020. That would be great news for the vast majority of Americans but a drastic comedown for some of the richest.


Median

The median is the middle point in the dataset. That is, half the values are greater than the median and half are less. If the net worth of every household in the US except Jeff Bezos' stayed the same and his increased to $\small{\$}$300 billion or $\small{\$}$500 billion, the median would be unchanged. Neither the weight of the extreme value nor its distance from the median has any effect on the median itself.


The median is the middle value of an ordered list of numbers. If there is an even number of values in the list, the median is the mean of the middle two.
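
The definition above translates directly into a short function. This is a minimal sketch; the function name `median` and the sample lists are illustrative choices, not part of the original page.

```python
def median(values):
    """Return the middle value of the sorted list, or the mean of the middle two."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd number of values: a single middle value exists.
        return ordered[mid]
    # Even number of values: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 4]))      # 4
print(median([7, 1, 4, 10]))  # 5.5
```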


Example: The numbers of pages in 3 textbooks are 578, 720, and 290. Find the mean and the median number of pages. A fourth textbook has 397 pages. Find the mean and median number of pages for all four textbooks.
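
One way to check an answer to this example is with Python's standard `statistics` module. The variable names below are illustrative.

```python
import statistics

three_books = [578, 720, 290]
four_books = three_books + [397]

# Three textbooks: sum is 1588, and the sorted list is 290, 578, 720.
mean_three = statistics.mean(three_books)      # 1588 / 3, about 529.33
median_three = statistics.median(three_books)  # middle value, 578

# Four textbooks: sum is 1985, and the sorted list is 290, 397, 578, 720.
mean_four = statistics.mean(four_books)        # 1985 / 4 = 496.25
median_four = statistics.median(four_books)    # (397 + 578) / 2 = 487.5

print(round(mean_three, 2), median_three)
print(mean_four, median_four)
```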

Mean, Median, and Outliers

Looking at the net worth data, it is clear that the mean is sensitive to outliers, or extreme values. That is, very large or very small values have a strong impact on the mean. In contrast, the median is not impacted by extreme values. The mean net worth of US households in 2020 was about $\$$746,821 and the median was $\$$121,411. If Jeff Bezos' net worth increased by $\$$100 billion (or Elon Musk passed him as the richest man, as has now happened) but all else stayed the same, the mean household net worth would increase but the median household net worth would be unchanged. Thus the mean is best used as a measure of center when the distribution of the data is approximately symmetric.

Example: Suppose the heights of three boys are 48 inches, 65 inches, and 67 inches. What are the mean and median heights? If the tallest boy grows a foot, what happens to the mean and median? (The boys' heights are now 48 inches, 65 inches, and 79 inches.) Note that the median could change if it were the shortest boy who grew, provided he grew enough to pass the middle boy, but it would still be the middle value regardless of how much distance there is between the boys' heights.
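
The heights example can be checked with Python's standard `statistics` module. The third scenario below, in which the shortest boy grows to 66 inches, is a hypothetical added here to show a case where the median does move.

```python
import statistics

before = [48, 65, 67]
tallest_grows = [48, 65, 79]    # tallest boy gains 12 inches
shortest_grows = [66, 65, 67]   # hypothetical: shortest boy passes the middle boy

for label, heights in [
    ("before", before),
    ("tallest grows", tallest_grows),
    ("shortest grows", shortest_grows),
]:
    print(label, statistics.mean(heights), statistics.median(heights))

# The mean rises from 60 to 64 when the tallest boy grows,
# while the median stays at 65.
# In the hypothetical third case the median becomes 66.
```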