Spatiotemporal Regression Modelling
1 Spatially Structured Timeseries Vs Spatiotemporal Modelling
In my last post about spatiotemporal regression modelling I mentioned that I am mostly interested in "spatially structured time-series models" rather than spatial models at a single point in time. By this I mean models for several neighbouring areal units observed over a period of time. In this framework the general methods of time-series modelling are used to control for temporal autocorrelation. However, this makes spatial error and spatial lag models tricky to apply, because the spatial autocorrelation needs to be assessed at many points in time rather than once.
I want to expand on this topic to make clear that the material I am bringing to this notebook is not aimed at purely spatial regression models (there is already a lot of material and tooling out there for those). With these notes I am trying to document my learning steps toward integrating spatial methods with time-series methods, so that I can practise (and understand) spatiotemporal regression modelling.
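To make the "many points in time" problem concrete, here is a minimal sketch that computes global Moran's I separately for each time slice of a small panel. Everything in it is hypothetical and only for illustration: the four zones A to D arranged in a line, the binary contiguity weights, and the made-up values. It is not the method used in my earlier analysis.

```python
def morans_i(values, neighbours):
    """Global Moran's I for one time slice, using binary contiguity weights."""
    zones = list(values)
    n = len(zones)
    mean = sum(values.values()) / n
    dev = {z: values[z] - mean for z in zones}
    # S0 is the sum of all weights; every neighbour link has weight 1.
    s0 = sum(len(neighbours[z]) for z in zones)
    # Cross-products of deviations over all neighbouring pairs (i, j).
    cross = sum(dev[i] * dev[j] for i in zones for j in neighbours[i])
    total_sq = sum(d * d for d in dev.values())
    return (n / s0) * (cross / total_sq)


# Hypothetical layout: four zones in a row, A-B-C-D.
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

# Hypothetical panel: one dictionary of zone values per time point.
panel = {
    "t1": {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0},  # smooth gradient
    "t2": {"A": 1.0, "B": 4.0, "C": 1.0, "D": 4.0},  # alternating pattern
}

# One statistic per time slice: this is the extra bookkeeping that a
# single cross-sectional spatial model never has to deal with.
morans_by_time = {t: morans_i(v, neighbours) for t, v in panel.items()}
```

In this toy panel the statistic is positive at `t1` and negative at `t2`, so a single summary of "the" spatial autocorrelation would hide the fact that it changes over time.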
1.1 Spatially Structured Time Series
In my most successful previous attempt at a spatiotemporal analysis (of suicide and droughts) I built on my knowledge of time-series regression models from single-city air pollution studies, where the whole city is the unit of analysis and the temporal variation is modelled with techniques that control for temporal autocorrelation. These techniques are also valid for multi-city studies because it is pretty safe to assume the cities are all independent at each time point. I structured my study by eleven large zones (Census Statistical Divisions) of NSW, assumed each of these would vary over time independently of the others, and fitted a zone-specific time trend and cycle. This is what I call "spatially structured time-series" modelling.
I justify using this model in this case for two reasons: aggregating up to these very large regions diminishes the possibility of spatial autocorrelation, and droughts also vary over large spatial zones, so we will not suffer from exposure misclassification bias.
So this model is a simple time-series regression (with trend and seasonality) plus an additional term for spatial zone.
\begin{eqnarray*} log({\color{red} O_{ijk}}) & = & s({\color{red}ExposureVariable}) + {\color{blue} OtherExplanators} \\ & & + AgeGroup_{i} + Sex_{j} \\ & & + {\color{blue} SpatialZone_{k}} \\ & & + sin(Time \times 2 \times \pi) + cos(Time \times 2 \times \pi) \\ & & + Trend \\ & & + offset({\color{blue} log(Pop_{ijk})})\\ \end{eqnarray*}
Where:
- \({\color{red}O_{ijk}}\) = Outcome (counts) by Age\(_{i}\), Sex\(_{j}\) and SpatialZone\(_{k}\)
- \({\color{red}ExposureVariable}\) = Data with \({\color{red}\text{Restrictive Intellectual Property (IP)}}\)
- \({\color{blue}OtherExplanators}\) = Other \({\color{blue}\text{Less Restricted}}\) explanatory variables
- s( ) = penalized regression splines
- \({\color{blue} SpatialZone_{k}}\) = \({\color{blue}\text{Less Restricted}}\) data representing the \(SpatialZone_{k}\)
- Trend = Long-term smooth trend(s)
- \({\color{blue}Pop_{ijk}}\) = interpolated Census populations, by time in each group
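Two pieces of this model are easy to misread, so here is a small sketch of how I understand them. It assumes that Time is measured in years (so the sine and cosine pair has a period of one year) and that the regression is a Poisson-type model with a log link, which is my reading of the equation. The function names are hypothetical helpers, not part of any package.

```python
import math


def harmonic_terms(time_years):
    """Return the (sin, cos) pair for an annual cycle; Time assumed in years."""
    angle = 2 * math.pi * time_years
    return math.sin(angle), math.cos(angle)


def amplitude_and_peak(b_sin, b_cos):
    """Convert fitted sin/cos coefficients into an amplitude and a peak time.

    Uses b_sin*sin(x) + b_cos*cos(x) = A*cos(x - phase), where
    A = sqrt(b_sin^2 + b_cos^2) and phase = atan2(b_sin, b_cos).
    The peak time is returned as a fraction of the year in [0, 1).
    """
    amplitude = math.hypot(b_sin, b_cos)
    phase = math.atan2(b_sin, b_cos)
    peak_time = (phase / (2 * math.pi)) % 1.0
    return amplitude, peak_time


def expected_count(linear_predictor, population):
    """With a log link and offset log(Pop), the expected count is Pop * exp(eta)."""
    return population * math.exp(linear_predictor)


# A quarter of the way through the year the sine term is at its maximum.
sin_q, cos_q = harmonic_terms(0.25)

# Hypothetical coefficients, chosen only to make the arithmetic easy to check.
amp, peak = amplitude_and_peak(3.0, 4.0)
```

The point of the pair of harmonic terms is that the model does not need to know in advance when the seasonal peak occurs: the two coefficients together determine both the size and the timing of the cycle. The offset means the rest of the linear predictor describes a rate per person rather than a raw count.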
1.2 TODO Spatiotemporal modelling
In contrast to the above model, when modelling exposures that have fine-resolution spatial variation (such as air pollution), the exposure misclassification caused by aggregating up to very large spatial zones will counteract the benefits of avoiding spatially autocorrelated errors, and this might be unacceptable for certain research questions. Therefore it is important to move toward a spatiotemporal regression model that replaces the \(SpatialZone_{k}\) term with a more explicitly spatial approach, such as a spatial error or spatial lag term.
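As a first step in that direction, the sketch below builds the raw ingredient of a spatial lag approach: for each zone at each time point, the average value of its neighbours (a row-standardised spatial lag). The zones, the neighbour list and the numbers are all hypothetical, and this only constructs the lagged variable; it does not fit a spatial lag model.

```python
def spatial_lag(values, neighbours):
    """Row-standardised spatial lag: the mean of each zone's neighbours.

    Assumes every zone has at least one neighbour.
    """
    return {
        zone: sum(values[n] for n in nbrs) / len(nbrs)
        for zone, nbrs in neighbours.items()
    }


# Hypothetical layout: four zones in a row, A-B-C-D.
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

# Hypothetical panel of zone-level values at two time points.
panel = {
    "t1": {"A": 10.0, "B": 20.0, "C": 30.0, "D": 40.0},
    "t2": {"A": 12.0, "B": 18.0, "C": 33.0, "D": 41.0},
}

# The lag is recomputed within each time slice, so neighbours only
# ever borrow information from the same time point.
lagged = {t: spatial_lag(v, neighbours) for t, v in panel.items()}
```

The lagged values could then stand in for the zone term as a covariate, although whether that is appropriate depends on the estimation method, which is a question for later notes.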