Stat 5810, Applied Spatial Statistics
Project 2 (11/27/00)
30% of your course grade - Due Mon 12/11/00 5pm
In this project, you have to show that you can handle a full spatial
data set. You should include
a visualization, an exploration, and a modeling component, using
ArcView/XGobi and S-Plus/SpatialStats.
You have to write a full report on your analysis and your results.
Describe the techniques you are using, summarize meaningful
results, and include useful graphics (please no more than
6 graphics on one printed page). Overall,
the main part of your report
must not exceed 20 pages. However, you should include printouts
of S-Plus sessions, S-Plus code you have developed, intermediate
graphics that led you to a particular assumption, etc. in
an appendix.
You should work in the same 2 groups as for Project 1.
Group A: Analyze the "South American climate" data
presented in Bailey/Gatrell. Some particular questions are:
Do you end up with the
same clusters as presented in the book
when using the grand tour in XGobi
(do NOT include the PCA results in your exploration)?
Are there any unusual sites, i.e., spatial outliers,
for some of the variables? Then concentrate on
the average annual temperature (and average annual precipitation)
and use at least 3 different approaches to predict temperature (and
precipitation) at any site in South America that is located
less than 200 metres above sea level
(co-kriging would be an option but is not necessarily required).
Test your 3 approaches in the following way:
Predict the value for each known location, calculate the
residuals, and plot residual maps. Also, calculate
the mean squared error and the maximum absolute error.
Does one approach
work better for temperature and another approach work
better for precipitation?
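
As a rough illustration only (the project itself is to be done in
S-Plus), the test procedure above could be sketched in Python as
follows. The sketch makes two assumptions that are not part of the
assignment: it uses inverse distance weighting as a stand-in for one
of the 3 approaches, and it holds each site out before predicting it
(leave-one-out), since an exact interpolator would otherwise return
residuals of zero at the known locations. All function names and the
toy data are made up for the example.

```python
import math


def idw_predict(x, y, sites, power=2.0):
    """Inverse-distance-weighted prediction at (x, y).

    sites is a list of (xi, yi, zi) tuples. IDW is only a stand-in
    predictor here; any other approach with the same signature works.
    """
    num = 0.0
    den = 0.0
    for xi, yi, zi in sites:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            return zi  # prediction point coincides with a data site
        w = 1.0 / d ** power
        num += w * zi
        den += w
    return num / den


def leave_one_out_residuals(sites, predict=idw_predict):
    """Observed minus predicted value at each site, with that site held out."""
    residuals = []
    for i, (x, y, z) in enumerate(sites):
        others = sites[:i] + sites[i + 1:]
        residuals.append(z - predict(x, y, others))
    return residuals


def error_summary(residuals):
    """Return (mean squared error, maximum absolute error)."""
    mse = sum(r * r for r in residuals) / len(residuals)
    max_abs = max(abs(r) for r in residuals)
    return mse, max_abs


# Made-up toy data: (x, y, value) on the corners of a unit square.
toy_sites = [(0.0, 0.0, 10.0), (1.0, 0.0, 12.0),
             (0.0, 1.0, 14.0), (1.0, 1.0, 16.0)]
toy_residuals = leave_one_out_residuals(toy_sites)
toy_mse, toy_max_abs = error_summary(toy_residuals)
```

The residuals keep the order of the input sites, so they can be
joined back to the coordinates for a residual map. Passing a
different predict function lets the same two error measures be
compared across approaches and across variables.
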
Group B: Use the precision agriculture "PrecAg1"
data set from Gotway/Hartford that is already
available in ArcView/XGobi. Note that this data set
contains many missing observations and could be
treated as spatially continuous data or as
area data. Some particular questions are:
Are there any unusual sites, i.e., spatial outliers,
for yield or nitrate? Then use at least 3 different approaches
to predict yield (and
nitrate) at any location in the field
(co-kriging would be an option but is not necessarily required).
Test your 3 approaches in the following way:
Predict the value for each known location, calculate the
residuals, and plot residual maps. Also, calculate
the mean squared error and the maximum absolute error.
Does one approach
work better for yield and another approach work
better for nitrate?
Given the lattice-like structure of the data, does
median polish or the removal of any obvious outlier
have any effect on the errors?
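
For orientation, median polish on a gridded data set could be sketched
in Python roughly as below (again, the project itself is to be done in
S-Plus). The sketch assumes the data have been arranged as a
rectangular table of rows and columns with None marking a missing
observation; the function name, the stopping rule, and the toy grid
are assumptions made for the example, not taken from the PrecAg1 data.

```python
from statistics import median


def median_polish(table, max_iter=10, tol=1e-9):
    """Median polish of a list-of-rows table; None marks a missing cell.

    Returns (overall, row_effects, col_effects, residuals) with
    value = overall + row_effect + col_effect + residual
    for every observed cell. A row or column with no observed cells
    is never swept, so its effect is only shifted when the row or
    column effects are re-centred.
    """
    n_rows, n_cols = len(table), len(table[0])
    resid = [list(row) for row in table]
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols

    for _ in range(max_iter):
        change = 0.0
        # Row sweep: move each row median from the residuals to the row effect.
        for i in range(n_rows):
            vals = [v for v in resid[i] if v is not None]
            if not vals:
                continue
            m = median(vals)
            row_eff[i] += m
            resid[i] = [None if v is None else v - m for v in resid[i]]
            change += abs(m)
        # Re-centre the column effects; their median goes to the overall term.
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        # Column sweep: move each column median to the column effect.
        for j in range(n_cols):
            vals = [resid[i][j] for i in range(n_rows) if resid[i][j] is not None]
            if not vals:
                continue
            m = median(vals)
            col_eff[j] += m
            for i in range(n_rows):
                if resid[i][j] is not None:
                    resid[i][j] -= m
            change += abs(m)
        # Re-centre the row effects in the same way.
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
        if change <= tol:
            break
    return overall, row_eff, col_eff, resid


# Made-up toy grid: one missing cell and one inflated value (63.0).
toy_grid = [[1.0, 11.0, 21.0],
            [2.0, None, 22.0],
            [3.0, 13.0, 63.0]]
overall, row_eff, col_eff, polish_resid = median_polish(toy_grid)
```

The residual table has the same layout as the input, so an unusual
cell should tend to stand out as a large residual, and the
predictions can be rerun on the residuals to see whether the errors
change.
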
P.S.: If you find any additional literature on either of the
two data sets, please provide me with a copy. Thanks.