7 August 2021
An Introduction to Visualisation and Modelling of Spatial data (in R)
Suppose you are a public health consultant and you have recently implemented a national policy that aims to reduce health inequalities across your country. You are, of course, also interested in knowing whether the policy has been effective or whether the resources would be better spent elsewhere.
As an environmental researcher, you are interested in determining the spatial trend in water ecology across a river network. You are also asked to assess whether this trend has changed over time.
Following the introduction of new legislation, you are asked to estimate and present the spatial pattern of air pollutants across your city. You are also interested in determining whether any other attributes can help explain the pattern.
Spatial statistics can help answer these questions, and by the end of this course you will be equipped with the appropriate tools to do so.
What is spatial statistics?
Spatial statistics involves the analysis, modelling and visualisation of spatial data. It can be viewed as an extension of time series analysis in which observations are indexed by location in 2D space rather than by time. The key difference between spatial and non-spatial data is that you cannot assume observations are independent, so common statistical methods that rely on independence are not suitable.
Tobler's first law of geography encapsulates this dependence: "Everything is related to everything else, but near things are more related than distant things."
Thus, points closer together in space are likely to be more similar, a property known as spatial autocorrelation. The same feature arises in time series, where observations close in time are likely to be similar, whilst those further apart are more likely to be independent. Spatial statistics, like time series analysis, therefore requires us to model autocorrelation in the data.
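As a rough illustration of spatial autocorrelation, the base R sketch below computes Moran's I, one common summary of it, on simulated data. The 5 x 5 grid, the simulated values, the edge-sharing neighbour definition and the helper name morans_i are all assumptions made for this example; they are not taken from the course materials.

```r
# Toy data (assumed for illustration): a 5 x 5 grid of cells whose values
# follow a west-to-east trend plus random noise.
set.seed(1)
grid <- expand.grid(x = 1:5, y = 1:5)
z <- grid$x + rnorm(nrow(grid), sd = 0.5)

# Binary neighbour weights: 1 if two cells share an edge, 0 otherwise.
# On a unit grid, edge-sharing cells are exactly distance 1 apart.
d <- as.matrix(dist(grid))
w <- (d == 1) * 1

# Moran's I: a weighted cross-product of deviations from the mean,
# scaled by the overall variance and the sum of the weights.
morans_i <- function(z, w) {
  zc <- z - mean(z)
  n <- length(z)
  (n / sum(w)) * sum(w * outer(zc, zc)) / sum(zc^2)
}

morans_i(z, w)
```

Values well above zero suggest that neighbouring cells tend to have similar values, while values near zero suggest little spatial structure.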
Geostatistical data are data which can, in principle, be recorded at any location across a region. They commonly arise in environmental settings; however, they are also observed in other fields such as economics. In practice, data of this kind are normally recorded at predefined locations such as weather monitoring stations or coordinate positions on a body of water. A primary goal when analysing geostatistical data is to understand and visualise what is happening in areas between observed locations, i.e. prediction.
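To make the idea of prediction between observed locations concrete, here is a minimal base R sketch using inverse distance weighting. This is a deliberately simple interpolator chosen for illustration, not necessarily the method taught in the course. The station coordinates, the readings and the helper name idw_predict are all made up for this example.

```r
# Hypothetical monitoring stations (coordinates and readings are invented).
stations <- data.frame(
  x = c(0, 1, 0, 1),
  y = c(0, 0, 1, 1),
  value = c(10, 12, 14, 20)
)

# Inverse distance weighting: predict at (x0, y0) as a weighted average of
# the observed values, with weights that shrink as distance grows.
idw_predict <- function(x0, y0, obs, power = 2) {
  d <- sqrt((obs$x - x0)^2 + (obs$y - y0)^2)
  # If the target coincides with a station, return that station's reading.
  if (any(d == 0)) return(obs$value[which(d == 0)[1]])
  w <- 1 / d^power
  sum(w * obs$value) / sum(w)
}

# The centre of the square is equidistant from all four stations,
# so the prediction there is simply the mean of the four readings.
idw_predict(0.5, 0.5, stations)
```

Because nearby stations receive larger weights, the prediction surface follows the intuition that near things are more related than distant things. Unlike a statistical model, this sketch gives no measure of uncertainty for its predictions.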
Areal processes can be applied to data collected on non-overlapping areal units, such as census tracts. In general, areal unit data are data collected over a study region which is partitioned into n contiguous small areas. For each of these areas, a response is observed. Areal unit modelling is particularly popular in disease mapping, where the risk of disease is estimated in order to assess the extent and pattern of differences in disease risk over space.
In this course we will focus on modelling and predicting complex trends, and on estimating the effect of other explanatory variables on the response variable, using geostatistical and areal data.
Dr Eilidh Jack, Lecturer
School of Mathematics and Statistics
University of Glasgow
This course will be useful for those who work with spatial data across a variety of fields. This course will suit anyone who is looking to learn about modelling and visualising their spatial data as well as those who would like to refresh what they have learned before. No prior knowledge of spatial statistics is required.
After this course you will be able to:
1. Load spatial datasets into R and produce exploratory visualisations and summaries.
2. Identify spatial trends and autocorrelation.
3. Distinguish between areal unit and geostatistical data and apply corresponding methodology appropriately.
4. Interpret R output and produce high quality visualisations of results.
EUR 0: Fee and date will be announced in the fall. Registration opens December 1st 2020. The fee includes the registration fees, course materials, access to library and IT facilities, coffee/tea, lunch, and a number of social activities.