GEO 866 (Fall 2016) - Spatial Data Analysis

Website: http://www.msu.edu/~ashton/classes/866
Printable Course Syllabus (pdf)

I. Instructors
Dr. Ashton Shortridge
235 Geography Building
phone: 432-3561
email: ashton@msu.edu
office hours: 3-4 pm Monday and 10-11:30 am Thursday

Ms. Jovanka Nikolic, Teaching Assistant
1A Geography Building (basement)
email: nikolicj@msu.edu
office hours: Monday 10 am-12 pm, or by appointment

II. Time and Place
Tuesday / Thursday 8:30-9:50 am, 126 Geography Building (Lecture)
Tuesday / Thursday 10:20 am-12:10 pm, 201 Geography Building (Computer Lab)
Final Exam Slot: Monday, December 12, 7:45-9:45 am

III. Course Objectives
Spatial is special, and special forms of analysis are required for handling spatial data. Spatial statistics is a catch-all term for a diverse set of methods that describe and model characteristics of spatial data. In some cases spatial location is the only factor being analyzed (e.g., disease point patterns). In other cases the primary interest concerns an attribute present everywhere but sampled only at a subset of locations (e.g., elevation, as in digital elevation models). A third set of cases involves the analysis of data collected and stored in spatial zones (e.g., U.S. Census data). While these three cases are by no means exhaustive, they represent the wide range of applications that we will deal with in this course.

Our objective is to learn and employ basic statistical techniques for describing, modeling, and analyzing these three basic types of data. These techniques include:

• point pattern analysis
(kernels, nearest neighbor and K statistics, identifying hot spots and clusters)
• methods for continuous data
(interpolation, trend surface analysis, geostatistics - kriging, variography)
• spatial regression
(OLS regression assumptions, autocorrelation measures, spatial regression models)

This course will cover theory and application of these techniques. You will develop a diverse and powerful set of analytical techniques for gaining insight into geographical processes and patterns.
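As a small taste of the "autocorrelation measures" listed above, here is a minimal sketch of global Moran's I, a common measure of spatial autocorrelation. The course labs use R; this sketch is written in plain Python purely as an illustration, and the function name, the four-zone layout, and the attribute values are all made up for the example. It assumes simple binary weights (1 for neighboring zones, 0 otherwise).

```python
# Global Moran's I with binary (0/1) neighbor weights, standard library only.
# The zone layout and attribute values below are hypothetical.

def morans_i(values, neighbors):
    """Return global Moran's I.

    values    -- list of attribute values, one per zone
    neighbors -- dict mapping zone index -> list of neighboring zone indices
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    # Cross-products of deviations over all neighboring pairs (w_ij = 1).
    cross = sum(dev[i] * dev[j] for i, js in neighbors.items() for j in js)
    # Sum of all weights (each neighbor link counts once per direction).
    w_sum = sum(len(js) for js in neighbors.values())
    # Sum of squared deviations from the mean.
    ss = sum(d * d for d in dev)

    return (n / w_sum) * (cross / ss)


# Four zones in a row: 0 - 1 - 2 - 3 (hypothetical layout).
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

clustered = morans_i([1.0, 1.0, 5.0, 5.0], chain)    # similar values adjacent
alternating = morans_i([1.0, 5.0, 1.0, 5.0], chain)  # dissimilar values adjacent
print(clustered, alternating)
```

When similar values sit next to each other the statistic comes out positive, and when neighboring values alternate it comes out negative. Judging whether a given value is statistically significant is a separate step that this sketch does not attempt.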

IV. Grading
10% Class Participation: Contribute to discussion on readings, ask and answer questions, and don't skip or come late to class.

25% Final Project: Proposal, Presentation, and Paper. Choose one of the following:

1. Perform statistical analysis of your own data
2. Analyze a method used in the class
More on projects in Section VII, below.

20% Exams: Two exams, 10% each. Cover material from lecture and lab.

45% Homework: Lab-based exercises

1. You obtain data + R script & questions in lab.
2. You run the script, analyze results, and write short report.
3. Reports typically due by the following week's lab section.
Writeups are expected to answer every question in a brief report-style format. Grading is based on the quality of your responses to the questions, measured as a function of their correctness and clarity.

Barring very special circumstances, late homework, project preproposals, etc. will not be graded.

Plagiarism is the use of others' ideas without identifying the source. It is one of the most serious academic offenses. Carefully read and understand the full MSU policy on this matter. If I find evidence of plagiarism, improperly attributed group work, or cheating on an exam, I will issue a failing grade on the assignment or for the course, at my discretion, and report the conduct to University authorities.

V. The Book (not required)

Applied Spatial Data Analysis with R. (2008) R. S. Bivand, E. J. Pebesma, & V. Gomez-Rubio. Springer. Available for free electronic download from MSU IP addresses. This book covers all the basics of using R for spatial statistics. It has a lot on data structures and the interface between GIS and spatial analysis, as well as how to conduct specific analyses. Springer publishes a lot of good applied "Use R" texts, and there may be one in your subfield.

The course is based in part on Bailey & Gatrell (1995) (see below), but this text is not required. There will be occasional assigned readings.

VI. The Software
We will be using the R statistical package. R is really a specialized programming language that has two major advantages over other options:

• It is open-source, free software, so anyone with a computer can install and use it without worrying about licenses, fees, etc.
• Hundreds of specialized extensions have been developed to make R even more versatile. We will be using some of these add-on libraries to do powerful spatial statistics.

R definitely has a learning curve associated with it. It has been described as a programming language that happens to have a lot of stats functionality. A primary objective of this course is to gain experience in using R to explore and analyze data. More details about R, including very brief installation instructions, are available at this 866-specific page, or you can visit the official R page with lots of information.

VII. The Project
The project involves original work using some spatial data and several of the techniques discussed in class. I can envision two separate types of projects:

• You may have a specific problem in mind, and some data you want to analyze. For example, if your current research interest is studying yield variation across a corn field, and you intend to use this class to help you develop techniques to tackle this problem, then it would make sense to use some existing yield data for your project.
• You may be interested in exploring a statistical method more deeply. For example, you might investigate the sensitivity of, say, simple kriging to variability in the spatial covariance model (this particular example will make more sense later).

In either case, limiting the size of the problem is a good idea. Some people discover they have serious data collection or input problems rather late in the semester, and end up with limited time to perform the analysis, do the writeup, and develop the presentation. Most importantly, the project must employ methods covered in class.

Group projects (two people in a group) are fine; each member will receive the same grade.

1. Friday, November 18, 5 pm: Project Preproposals Due
2. Tuesday, November 29, 5 pm: Project Proposals Due
3. Monday, December 12, 7:45 am: Presentations
4. Tuesday, December 13, 5 pm: Paper Due

The proposal should be a 1-2 page typed document that accomplishes four things:

1. identifies the research problem and the research team.
2. indicates the data required to work on the problem, whether you have it or not, etc.
3. outlines techniques and methods you will use on the data. If you will not be using R, explain why and list the package(s) you will use.
4. indicates a time line for completing the project (data input, analysis, writeup, presentation design).

Take the proposal very seriously; large project deviations from what you said you'd do must be explained in the final report.

Presentations will follow a conference-style format, with a 15-minute time limit. A laptop and LCD projector will be provided. I expect a professional, polished, and rehearsed performance. Students from previous GEO 866 courses have gone on to publish and present their work at major academic conferences, and that is the standard I am shooting for this semester.

The paper should be 8-12 pages (double-spaced) in length. It is a research report; write in an appropriate style, include relevant references, and express yourself clearly and succinctly. The paper should indicate your research problem and its significance, describe your methodology, and report on your findings. It is particularly important that you describe the statistical techniques you employ and explain why you are using them. Figures and tables should be included if they are helpful. If your report includes ideas from other people or works, you need to cite them. Failure to indicate sources is plagiarism. If you are unsure, cite the source or talk to me about it.

VIII. Other resources

Interactive Spatial Data Analysis. (1995) T. C. Bailey & A. C. Gatrell. Addison-Wesley. A classic textbook used in this course for years.

Use R publications from Springer, including Applied Spatial Data Analysis with R, are available online for free from MSU machines.

R-sig-Geo - a mailing list and archive for help with spatial R issues

A decent statistics text (e.g. Ott, An Introduction to Statistical Methods & Data Analysis)

Online R tutorials and Manuals:
Official and Contributed
Thoughts about R for programmers coming from another language.

Data:
World boundary data in RData format at GADM.
World boundaries and other themes at Natural Earth.

IX. Weekly Outline
A weekly outline of lecture notes will be available. This is not a substitute for attending lecture! Students who do not attend regularly fail this course!