Homework 10 (20 points) - due 05/31/00
Note that this was previously posted as Homework 9, Part III. However, more details are given here, the due date has been extended by one day, and the number of points is now 20.
Problem 1:
Recall the exercise we tried
in class in Lecture 10 (05/23). We used the student
data available on
http://www.math.usu.edu/~vukasino/teaching/spring2000/complab/student_data1.prn
and used the JavaScript version of Rweb to
obtain summary statistics for the variables
Age and Siblings.
Now try to calculate a few more summary statistics
yourself, such as the median and the variance of
Height and Weight. First make sure to identify the
appropriate columns of the matrix X that represent
these two variables. Can you also calculate the correlation
between these two variables and draw a scatterplot of
Height (horizontal) vs. Weight (vertical)?
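The commands involved can be sketched as follows. This is only a minimal sketch: the small matrix X below is made-up illustrative data, not the class data set, and the column positions (Height in column 2, Weight in column 3) are assumptions that you must check against your own X.

```r
# Illustrative stand-in for the matrix X; the real X comes from student_data1.prn.
# The column order (Age, Height, Weight) is an assumption, so check your own X.
X <- cbind(Age    = c(19, 20, 21, 22, 23),
           Height = c(60, 65, 70, 72, 68),
           Weight = c(110, 140, 170, 190, 150))

height <- X[, 2]  # assumed position of Height
weight <- X[, 3]  # assumed position of Weight

height_median <- median(height)
height_var    <- var(height)          # sample variance (divides by n - 1)
weight_median <- median(weight)
weight_var    <- var(weight)
hw_cor        <- cor(height, weight)  # Pearson correlation

print(c(height_median, height_var, weight_median, weight_var, hw_cor))

# Scatterplot with Height on the horizontal and Weight on the vertical axis
plot(height, weight, xlab = "Height", ylab = "Weight")
```

Note that var() returns the sample variance, so results may differ slightly from a program that divides by n instead of n - 1.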
Problem 2:
Also, recall how we calculated
a simple linear regression of Siblings on
Age. Now try to calculate a simple linear
regression yourself, where Weight is the response and Age
is the predictor variable. The required syntax is
result <- lm(response ~ predictor). Here,
result <- means that we assign the outcome of the
calculation to the right of <- to a new variable called
result. lm is a function that fits
a linear model. response ~ predictor is the model
formula that specifies what should be fitted. You have to replace
response and predictor with the appropriate
columns of the matrix X. Finally, you have to produce
some visible output using the command summary(result).
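A minimal sketch of this syntax is shown below. The matrix X is again made-up illustrative data, and the column positions (Age in column 1, Weight in column 3) are assumptions; substitute the columns you identified in your own X.

```r
# Illustrative stand-in for the matrix X; not the class data set.
# Column positions (Age = 1, Weight = 3) are assumptions.
X <- cbind(Age    = c(19, 20, 21, 22, 23),
           Height = c(60, 65, 70, 72, 68),
           Weight = c(110, 140, 170, 190, 150))

# Weight is the response (left of ~), Age is the predictor (right of ~)
result <- lm(X[, 3] ~ X[, 1])

# Produce visible output: coefficient table, R-squared, residual summary
print(summary(result))

# The fitted intercept and slope can also be extracted directly
coefs <- coef(result)
```

Here coefs[1] is the estimated intercept and coefs[2] is the estimated slope, i.e. the estimated change in Weight per one-unit increase in Age.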
Homework problems
Using Rice's Data Analysis Lab Package, solve the problems listed below. Describe exactly all steps needed to obtain the required results (including data editing/recoding). If you encounter any problems (e.g., if a certain applet is not working or if you can clearly see that the results are wrong), mention this explicitly in your written report.
Using the data set on Labor force available as a sample data set in the site:
Homework Problem: Open Statlets and load the data. After speaking to the person who put down that they were 8'7", we found out that they actually wrote 5'7", but the 5 looked like an 8. Find this outlier in the Statlets data screen and change 103.0" to 67.0". What is the 95% confidence interval for the mean height? Then do a t-test of H0: μ = 68". Do we reject at the 5% level? Give another reason why you know this to be the case. For what values of μ would we not reject at the 5% level?
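The problem is meant to be solved in Statlets, but the same steps can be cross-checked in R. The sketch below is an assumption-laden illustration: the vector heights holds made-up values (in inches) rather than the Statlets data, and it uses the base R function t.test for both the confidence interval and the test.

```r
# Made-up heights in inches, including the mistyped value 103 (8'7");
# these are NOT the values from the Statlets data set.
heights <- c(66, 70, 103, 64, 69, 72, 68, 65)

# Correct the outlier: 103 inches (8'7") should have been 67 inches (5'7")
heights[heights == 103] <- 67

# One-sample t-test of H0: mu = 68 with a 95% confidence interval
tt <- t.test(heights, mu = 68, conf.level = 0.95)

print(tt$conf.int)  # lower and upper limits of the 95% interval for the mean
print(tt$p.value)   # compare with 0.05 to decide whether to reject H0
```

Comparing the interval limits with the hypothesized value 68 is a useful second look at the test decision.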
Homework: