Stat 1040
Chapter 20 Solutions
-
- 50,000.
- Each ticket in the box shows a zero (for each form with gross income
less than or equal to $50,000) or a one (for those forms with gross income
over $50,000).
- False. The SD is actually (1-0)*sqrt(0.20 * 0.80) = 0.4.
- True.
- The number of audited forms in the sample with income over $50,000 is
like the sum of 900 draws from the box described in 3.b. The EV for this
sum is 900 * 0.20 = 180, and the SE is sqrt(900) * {SD of the box} =
30 * 0.4 = 12. As a percentage of the number of draws, the EV is 20% and
the SE is about 1.3%. This means that we expect the percentage of audited
forms in the sample with income over $50,000 to be 20%, give or take 1.3%
or so. The chance that the percentage of the sample falls between 19% and
21% is approximately the area under the normal curve from -0.75 to +0.75,
which is about 55%.
- We have no way of calculating this chance. We need to know the
percentage of forms that have gross income over $75,000 in order to find an
expected value and standard error. Besides all that, the data will probably
not be normal, and our methods won't be valid anyway.
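The arithmetic for the 0-1 box can be checked with a short Python sketch.
The variable names and the normal_cdf helper are illustrative and not part
of the original solution. The sketch treats the draws as if made with
replacement, so it ignores the small correction for sampling 900 forms out
of 50,000 without replacement.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p = 0.20        # fraction of tickets marked 1 (income over $50,000)
n_draws = 900   # number of forms chosen for audit

# SD of a 0-1 box: (big - small) * sqrt(fraction of 1's * fraction of 0's)
sd_box = (1 - 0) * math.sqrt(p * (1 - p))

# EV and SE for the number of 1's among the draws
ev_sum = n_draws * p
se_sum = math.sqrt(n_draws) * sd_box

# Convert to percentages of the number of draws
ev_pct = 100 * ev_sum / n_draws
se_pct = 100 * se_sum / n_draws

# Standard units for 21% (19% is the same distance below the EV)
z = (21 - ev_pct) / se_pct
chance = normal_cdf(z) - normal_cdf(-z)

print(f"SD of box = {sd_box:.2f}")
print(f"EV = {ev_sum:.0f}, SE = {se_sum:.0f}")
print(f"EV = {ev_pct:.0f}%, SE = {se_pct:.2f}%")
print(f"z = {z:.2f}, chance = {100 * chance:.0f}%")
```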
-
The box we'll use in this question does not have zeroes and ones in it.
The total gross income of the 900 audited forms is like the sum of 900
draws from a new box which has a gross income on each ticket -- one ticket
for each tax form -- making 50,000 tickets in all. We know from the
information given in the problem that the average of this box is $37,000
and the SD is $20,000.
- 50,000 again.
- a gross income.
- True.
- True.
- The chance that the total gross income of the audited forms is over
$33,000,000 is the same as the chance that our sample sum is over
$33,000,000. We expect our sample sum to be (900)*($37,000) = $33,300,000.
The SE of our sample sum is sqrt(900)*($20,000) = $600,000. So the chance
of our sample sum being over $33,000,000 is approximately the area under
the normal curve to the right of
($33,000,000 - $33,300,000)/$600,000 = -0.5, which is about 70%.
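The same calculation for the sum of the draws can be sketched in Python.
The names below are illustrative. The threshold is set to $33,000,000, an
assumption chosen because it is the value 0.5 SE below the expected sum and
so matches the z-score of -0.5 used in the solution. As before, the draws
are treated as if made with replacement.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

avg_box = 37_000         # average gross income on the tickets
sd_box = 20_000          # SD of the gross incomes
n_draws = 900            # number of forms chosen for audit
threshold = 33_000_000   # assumed cutoff, consistent with z = -0.5

# EV and SE for the sum of the draws
ev_sum = n_draws * avg_box
se_sum = math.sqrt(n_draws) * sd_box

# Convert the threshold to standard units, then take the area to the right
z = (threshold - ev_sum) / se_sum
chance = 1 - normal_cdf(z)

print(f"EV of sum = ${ev_sum:,.0f}")
print(f"SE of sum = ${se_sum:,.0f}")
print(f"z = {z:.2f}, chance = {100 * chance:.0f}%")
```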
-
Statement (ii) is correct. The accuracy of California's sample will be
higher because California's sample will be larger.
-
Statements (a), (c) and (e) are true, while (b), (d) and (f) are false.
For (b) and (f), we know exactly the expected value for the percentage of
1's among the draws, and we know exactly the percentage of 1's in the
population. We only attach measures of chance error to quantities we don't
know, such as the percentage of 1's that come up in a sample we happen to
take. That percentage will be different for each sample, whereas the values
mentioned in (b) and (f) don't change from sample to sample.