Welcome to my Open Notebook

This is an Open Notebook with Selected Content - Delayed. All content is licenced with CC-BY. Find out more Here.

ONS-SCD.png

Suicide And Drought Evidence From Literature

This essay evolved as I was asked to write something for the High-level Meeting on National Drought Policy (HMNDP) in Geneva with Mark Howden, Steven Crimp and Bryson Bates. It is a summary of material Colin Butler and I wrote for our paper with Phil Kokic and Mike Hutchinson on Suicide and Drought published in PNAS last year.

Suicide and Drought

By Ivan Hanigan and Colin Butler

There has been substantial public interest within Australia in recent decades of the putative relationship between drought and rural mental health, including suicide. The topic has frequently been raised by the media, by rural politicians and by mental health support groups [1]. There have also been recent media reports in India indicating substantial concerns about drought and rural suicide in that country, including [2] from May 2012.

The number of studies that have examined the relationship between suicide and drought is limited. However, many papers explore links between suicide and climate variables other than drought (such as temperature) and there are two major reviews papers available of the literature on climatic influences on suicide [3, 4]. Some climatic variables related to dryness have been studied for example Preti (1998) [5] found higher suicide rates in drier towns in Italy. But we make the point that very few studies have investigated the “dryer than average conditions” that is drought specifically.

There are several mechanisms through which unusually low rainfall, especially if exacerbated by increased soil dryness due to higher temperatures may increase the suicide rate. First, droughts increase the financial stress on farmers and farming communities (even if partially compensated by drought relief welfare payments). Such difficulty may occur in conjunction with other economic stresses, such as rising interest rates, falling commodity prices, or an unfavourable foreign exchange rate. In the broader economic system, reduced rainfall can depress economic activity in rural towns. In some regions the entire economy may be affected. Rural downturns can accelerate migration to metropolitan areas; weakening and stressing social support systems and lessening social interaction. In some cases rural depopulation may pass a tipping point, leading to an ongoing loss of critical services, such as hospitals, schools and doctors. Second, there can be a great psychological toll following environmental degradation [6] and this may be acute during droughts linked with decisions and actions to sell or kill starving animals or to destroy orchards and vineyards, which in some cases were painstakingly accumulated over generations. Such loss, and even the apprehension of loss, undoubtedly places a burden on the mental health of farmers and their families. This mourning may not be confined to farmers but extend to other sections of the community likely to be impoverished by long-term environmental degradation. The experience of seeing suffering wild plants and animals, or parched urban parks and gardens, and contemplation of their loss is likely to be extremely painful for some individuals.

Evidence from literature

As mentioned the number of studies that have examined the relationship between suicide and drought is limited. One analysis of annual suicide rates in NSW found an association between suicide and year-to-year decline in annual rainfall between 1964 and 2001 [7]. In that study a decrease of 300mm of rain was associated with an increase in suicide rate of about 8% above the mean annual rate. Another study of NSW, for the period 1901-1998, found an association between suicide and drought. That study focused on the association of conservative government and suicide [8]. The authors argued conservative government programmes (or perceived prospects under a government) might influence suicide directly, or that a correlated increase in anomie (decreased inclusiveness of society) and lowering social capital enhances risk of suicide in vulnerable individuals. The authors controlled for drought (among other things), and found that drought years were associated with an increased suicide risk of about 7% for men and 15% for women, across the whole population. A third study [9] found no association between drought and suicide in Victoria but was based on only 7 years data (2001-2007) and did not stratify the population of the state into regions which may introduce bias in the exposure estimates of drought affected people.

In contrast a longer running study did find an association using 38 years of data (1970-2007) to explore potential drought effects, especially on farmers and farm workers [10]. The drought exposures were calculated from climatic data for 11 subregions of New South Wales, and stratified by rural/urban region, age and sex. A strong association was observed in rural males aged 10-49. Surprisingly the suicide risk decreased in rural females aged over 30.

This study provides clear evidence to support the hypothesis that male farmers, farm workers and farming families are at risk of depression and suicide due to droughts. The resulting statistical model estimated that around 9 % of rural suicides in males aged 30-49 were due to drought over the entire study period. This estimate is an average over the course of the 38 years of the study, as the majority of years are not droughts - the percentage is much greater than 9 % in the actual drought years, since these are episodic and confined to a distinct minority of years. The statistical model also controlled for other well-known trends in suicide data, including that times of unusually high maximum temperatures increased suicide risk, that there was a increased risk in spring and early summer, and that there was a marked drop in suicide rates over the last decade.

Discussion

These studies from Australia offer some lessons for policy makers. The results identify a suite of contributing factors that influence suicide drawn from the environmental, social and political context of life in Australia, which drought is a part of. In particular these results help isolate the most critical times of risk, so that the best use of resources might be made. This includes provision of counselling services to target vulnerable people and get them help, both during droughts, at times with hotter than average maximum temperatures and during the dangerous spring period. Other policy implications from this finding support broadening investment in research into gender specific drought effects rather than purely climate and economic focused research into drought impacts.

References

  • [1] Australian Broadcasting Commision (ABC) News. Drought lifts suicide rates: Kennett, 2006. http://www.abc.net.au/news/2006-10-13/drought-lifts-suicide-rates-kennett/1285734

  • [2] P Sarathi Biswas. Alcohol, drought lead to farmer’s suicide. Daily News and Analysis, 2012. http://www.dnaindia.com/pune/report_alcohol-drought-lead-to-farmers-suicide_1688976

  • [3] PG Dixon and AJ Kalkstein. Climate-suicide relationships: A research problem in need of geographic methods and cross-disciplinary perspectives. Geography Compass , 3(6):1{14, 2009.

  • [4] E A Deisenhammer. Weather and suicide: the present state of knowledge on the association of meteorological factors with suicidal behaviour. Acta Psychiatrica Scandinavica , 108(6):402{409, 2003.

  • [5] Antonio Preti. The infleuence of clima te on suicidal behaviour in Italy. Psychiatry Res , 78(1-2):9{19, 1998.

  • [6] PC Speldewinde, A Cook, P Davies, and P Weinstein. A relationship between environmental degradation and mental health in rural Western Australia. Health and Place, 15, 2009.

  • [7] N Nicholls, CD Butler, and IC Hanigan. Inter-annual rainfall variations and suicide in New South Wales, Australia, 1964-2001. International Journal of Biometeorology, 50(3), 2006.

  • [8] A Page, S Morrell, and R Taylor. Suicide and political regime in New South Wales and Australia during the 20th century. Journal of Epidemiology and Community Health, 56, 2002.

  • [9] Robyn Guiney. Farming suicides during the Victorian drought: 2001-2007. The Australian Journal of Rural Health, 20(1):11–5, February 2012.

  • [10] I. C. Hanigan, C. D. Butler, P. N. Kokic, and M. F. Hutchinson. Suicide and drought in New South Wales, Australia, 1970-2007. Proceedings of the National Academy of Sciences, pages 1112965109–, August 2012.

Posted in  extreme weather events


Timeseries with Spatial Lag

adjacency example

adjacency example


1 Introduction

I've got a timeseries model I am fitting to a city dataset with about 45 zones. The data are daily, stratified by Zone, Age and Sex. Following on from learning about spatiallly correlated errors I want to see if the Standard Error on the estimated \(\beta_{1}\) from the timeseries model is affected.

I think the simplest option is to use the spatial lag model, which can be fitted with just adding a term that is the average of the set of each Zone's neighbours outcome level on each day. For this I need to find the list of each region's neighbours. Then I'll use this to assign each zone/day/age group their neighbours values and then collapse that to get their daily means.

2 Load some test data

# we have access to a classic dataset for studying spatial dependence
# in the spdep package
if(!require(spdep))    install.packages(spdep); require(spdep)     
if(!require(rgdal))    install.packages(rgdal); require(rgdal) 
if(!require(maptools)) install.packages(maptools); require(maptools) 
if(!require(maps))     install.packages(maps); require(maps) 
fn <- system.file("etc/shapes/eire.shp", package="spdep")[1]
prj <- CRS("+proj=utm +zone=30 +units=km")
eire <- readShapeSpatial(fn, ID="names", proj4string=prj)
str(eire)
# reproject into a better coordinate system
eire <- spTransform(eire, CRS("+proj=longlat +datum=WGS84"))
# check out the attributes
head(eire@data)

AtownspalesizeROADACCOWNCONSPOPCHGRETSALEINCOMEnames
34.20.121108736648.69729627185Carlow
29.680.01021335000156944529459Cavan
26.540.01053543211978346012435Clare
23.920.030147641189902840265901Cork
27.910.03098975002775747817626Donegal
32.790.6111810530789.414289424164631Dublin

3 spdep calculates neighbours

nb <- poly2nb(eire)
str(nb)
#List of 26
nb[[1]]
#[1]  9 10 11 25 26
# So this returns the set of index values for each area's neighbours
# I'd prefer to read their names
eire[['names']][1]
# > [1] Carlow
# so therefore the neighbours of area 1 "Carlow" are in the first
# element of the list
eire[['names']][nb[[1]]]
# > [1] Kildare  Kilkenny Laoghis  Wexford  Wicklow

4 plot these

################################################################
# name:plot these
png("images/Fig1.png")
plot(eire)
plot(nb, coordinates(eire), add=TRUE, pch=".", lwd=2)
map.scale(ratio = F)
box()
dev.off()

images/Fig1.png

5 function to return adjacency list as a dataframe

I THINK I actually want this as a dataframe so I can merge it with the master table of outcome data.

################################################################
# name:adjacency_df
adjacency_df <- function(NB, shp, zone_id)
  {
    adjacencydf <- as.data.frame(matrix(NA, nrow = 0, ncol = 2))
    for(i in 1:length(NB))
    {
      if(length(shp[[zone_id]][NB[[i]]]) == 0) next
      adjacencydf <- rbind(
                           adjacencydf,
                           cbind(
                                 as.character(shp[[zone_id]][i]),
                                 as.character(shp[[zone_id]][NB[[i]]])
                                 )
                           )
    }
    return(adjacencydf)
  }

6 test-adjacency df

################################################################
# name:adjacency_df
adj <- adjacency_df(NB = nb, shp = eire, zone_id = 'names')
adj  
CarlowKildare
CarlowKilkenny
CarlowLaoghis
CarlowWexford
CarlowWicklow
CavanLeitrim
CavanLongford
CavanMeath
CavanMonaghan
CavanWestmeath
ClareGalway
ClareLimerick
ClareTipperary
CorkKerry
CorkLimerick
CorkTipperary
CorkWaterford
DonegalLeitrim
DublinKildare
DublinMeath
DublinWicklow
GalwayClare
GalwayMayo
GalwayOffaly
GalwayRoscommon
GalwayTipperary
KerryCork
KerryLimerick
KildareCarlow
KildareDublin
KildareLaoghis
KildareMeath
KildareOffaly
KildareWicklow
KilkennyCarlow
KilkennyLaoghis
KilkennyTipperary
KilkennyWaterford
KilkennyWexford
LaoghisCarlow
LaoghisKildare
LaoghisKilkenny
LaoghisOffaly
LaoghisTipperary
LeitrimCavan
LeitrimDonegal
LeitrimLongford
LeitrimRoscommon
LeitrimSligo
LimerickClare
LimerickCork
LimerickKerry
LimerickTipperary
LongfordCavan
LongfordLeitrim
LongfordRoscommon
LongfordWestmeath
LouthMeath
LouthMonaghan
MayoGalway
MayoRoscommon
MayoSligo
MeathCavan
MeathDublin
MeathKildare
MeathLouth
MeathMonaghan
MeathOffaly
MeathWestmeath
MonaghanCavan
MonaghanLouth
MonaghanMeath
OffalyGalway
OffalyKildare
OffalyLaoghis
OffalyMeath
OffalyRoscommon
OffalyTipperary
OffalyWestmeath
RoscommonGalway
RoscommonLeitrim
RoscommonLongford
RoscommonMayo
RoscommonOffaly
RoscommonSligo
RoscommonWestmeath
SligoLeitrim
SligoMayo
SligoRoscommon
TipperaryClare
TipperaryCork
TipperaryGalway
TipperaryKilkenny
TipperaryLaoghis
TipperaryLimerick
TipperaryOffaly
TipperaryWaterford
WaterfordCork
WaterfordKilkenny
WaterfordTipperary
WaterfordWexford
WestmeathCavan
WestmeathLongford
WestmeathMeath
WestmeathOffaly
WestmeathRoscommon
WexfordCarlow
WexfordKilkenny
WexfordWaterford
WexfordWicklow
WicklowCarlow
WicklowDublin
WicklowKildare
WicklowWexford

</html>

Posted in  spatial dependence


Reflections on Spatial Regression Class with Prof Bob Haining

Reflections on a class with Prof Bob Haining

1 Introduction

I recently attended a class on spatial regression with Prof Bob Haining. He described the issue of spatially correlated errors and the problems this poses in spatial regression.

The key issue is that spatial data often violates the assumption in regression models that the errors are independent.

A simple regression model applied to spatial data based on ZONES:

\(Y_{i} = \beta_{0} + ZONE_{i} + \beta_{1} X_{1i} + e_{i}\)

But with spatial data it is likely that the errors are spatially correlated. This is likely to mean the point estimate of beta 1 is OK but the Standard Error is wrong.

This might be due to the scale of the study units, which may not capture the variation of exposure and outcome adequately. Or there might be unmeasured explanatory variables that have not been accounted for.

I am mostly concerned with EXPLANATORY modelling in which a particular exposure of interest is to be assessed. Examples include a weather variable (temperature), an air pollutant (PM10) or some measure of socio-economic deprivation in an area (SEIFA scores in Australian census data). In these models I tend to include a number of 'nuisance' parameters to control for confounding; or interaction terms to account for effect modification. In this type of model the performance of the model over-all is not that important, I just want to control for the most important confounders so that my estimate of the exposure of interest is as rigorous as possible.

Therefore the problem that spatially correlated errors pose for these models is slightly different to that which affects models aimed at PREDICTION: I am not concerned so much with the model's fit to the data, rather the confidence around the point-estimate of the parameter for the exposure of interest.

Simplistically I took away the following messages:

2 The Spatial Error Model

So we could model allowing for correlated errors:

\(Y_{i} = \beta_{0} + ZONE_{i} + \beta_{1} X_{1i} + \eta_{i}\)

Where:

\(\eta_{i}\) = Spatially autocorrelated errors.

3 The Spatial Lag Model

Or we could include a term for the neighbours, thus absorbing the correlated errors:

\(Y_{i} = \beta_{0} + ZONE_{i} + \beta_{1} X_{1i} + \rho(Neighbours Y_{ij}) + e_{i}\)

Where:

\(\rho_(Neighbours Y_{ij})\) = is an additional explanatory variable which is the value of the dependent variable in neighbouring areas.

4 Spatially Lagged Independent Variable(s)

This is almost a variation of the spatial lag model, except that we include a term for the exposure variable in the neighbours, and therefore 'smooth' the effect of the exposure from what was observed in any area to make it relevant to it's neighbours as well:

\(Y_{i} = \beta_{0} + ZONE_{i} + \beta_{1} X_{1i} + \beta_{2L} X_{2ij} + e_{i}\)

Where:

\(\beta_{2L} X_{2ij}\) = is the independent variable X2 that is spatially lagged.

5 Discussion

5.1 How to decide which model to fit?

So the burning question is how to choose between the various spatial models? Prof Haining had some suggestions, but he noted that sometimes two could be equally appropriate. He suggested that the spatial lag model makes the strong assumption that there is a relationship between the outcome in a neighbouring area with the index zone. This suggests some kind of contagion or dispersion effect. He was not keen to fit this model in circumstances where the causal mechanism did not support such a relationship, suggesting the spatially weighted error model was more suited, but that "in practice they often give the same result".

In my situation where I am not concerned with the actual autocorrelation but with tightening up the standard error on my exposure of interest, I think I might plead forgiveness and try fitting the spatial lag model as it seems easier.

6 Conclusion

Stay tuned.

</html>

Posted in  spatial dependence


software-ism

I am a huge fan of the R language for statistics and graphics.

I sometimes hear people say they don’t like R but then admit that they have never tried to use it, or if they have it was close to ten years ago (and a lot has changed).

In recent discussions at work I got the impression some people have got a bit predjudiced against R and other software that they don’t actually use, primarily because of the added difficulty of software that requires a bit of programming.

I think that multi-disciplinary work will inevitably mean we find a mix of software in use, and they’ll all have strengths and weaknesses. A major strength of R is that one can weave together a report that includes the data, code, graphs and interpretations for an analysis, rather than copy-and-pasting these elements together as is required with other software toolboxes.

For example a simple analysis in Rstudio using the ‘R Markdown document’ is below.

You can load and explore data in the document by placing ‘Code Chunks’ in the document, then when you click the Knit HTML button a web page will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:


summary(cars) --- 

| speed | dist | |————–|—————- | Min. : 4.0 | Min. : 2.00
| 1st Qu.:12.0 | 1st Qu.: 26.00
| Median :15.0 | Median : 36.00
| Mean :15.4 | Mean : 42.98
| 3rd Qu.:19.0 | 3rd Qu.: 56.00
| Max. :25.0 | Max. :120.00

You can also embed plots, for example:


plot(cars)

plot of chunk unnamed-chunk-2

I hope we can work toward a kind of ‘tower of babel’.

Posted in  research methods


The Ecological Fallacy is Itself a Fallacy

The Ecological Fallacy

The term ‘Ecological fallacy’ is used in Epidemiology and some other disciplines (such as Sociology) to refer to improperly inferring a causal association (or lack of association) at an individual-level based on a group-level relationship. This use of the word ecological is at odds with the alternative use of the word in the discipline of Ecology. Ecological methods from Ecology are inherently multi-scaled and address precisely the issue of this cross-scale inferential fallacy.

I argue that a broader understanding of ecological methods by non-Ecologists would be a start towards better understanding between the disciplines. The inclusion of real ecological methods in the other disciplines will also assist research to better understand the causes and effects of climate and climate change which are important gaps in knowledge needed to enable mitigation and adaptation of our society to future climate change.

Sociology

To clear up this confusion it is necessary to go back to the source of the use of the term ‘ecological’ by the urban sociologist Amos Hawley who used the term ‘human ecology’ in the 1950s to build on the theoretical traditions of Robert Parke and E. W. Burgess at the University of Chicago in the 1920s on the structure and development of cities. His influence in the Social Sciences led directly to the development a tradition of social human ecology, one largely without a bio-physical environment. The usage of ‘ecological’ to mean multivariate studies of complex systems stems from the sociological tradition.

Ecology

The term ‘ecology’ used in the Ecology discipline dates back to the 1870s, coined by German zoologist Ernst Haeckel (1834-1919) as Oekologie from Greek ‘oikos’ for house, dwelling place, habitation and ‘logia’ study of, then at the turn of the century the term became spelt ‘oecology’, which grew into ‘ecology’ in the early part of the 1900s.

Epidemiology

In epidemiology the term ‘ecological study’ is used to refer to studies where observations are taken at the level of a group (such as a country, school, or hospital) rather than at the individual (such as patient) level. It is well known though that when risk factors and outcomes are measured at an aggregate level, the relationship between the group-level variables may be different than the relationship between variables measured at the individual level. An often cited example used to illustrate the issue involved a 19th century study which found higher suicide rates within Prussian provinces that had higher proportions of Protestant residents. The conclusion that Protestant individuals (rather than Catholic individuals) were more likely to commit suicide cannot be inferred based on the observed association among the provinces. One possible scenario is that Catholic residents within the largely Protestant provinces had the high suicide rates, resulting in a positive association between percent Protestant and suicide rate 8. Extrapolation of aggregate results to individuals is a mistake in logic 9 which can lead to a potentially misleading conclusion 10.

Because of this limitation ‘ecologic studies’ are often scorned in epidemiology as inferior and only useful for exploratory or hypothesis-generating studies rather than as confirmatory. I argue to the contrary that there is the potential for a revolution in our ability to understand causal influences operating at multiple scales of space and time if we were to conduct truly ‘ecological studies’.

Quantitative Geography

There is also a closely related concept that should be noted. That of the Modifiable Areal Unit Problem (MAUP) as discussed in quantitative geography. There is a large amount of geographical literature on the MAUP. In this problem domain areal units (also called zones, regions, areas or polygons) inherently pose three problems to a researcher attempting to infer causal associations: scale, zonal and temporal:

  • Scale; this issues is evident in the example above where phenomena investigated using data viewed at one scale may appear quite different (even opposite) using data aggregated at a different scale.
  • Zonal; the zonal problem appears where phenomena investigated using data viewed using two sets of differing areas at a single scale can differ.
  • Temporal; a third problem arises when analyzing data on modifiable areas when people keep modifying them by redrawing the boundaries over time.

In Conclusion

The term ‘Ecological Fallacy’ is itself a fallacy and non-Ecologists should be made aware of the existence of alternative ecologic methods from Ecology. This would be a start towards better understanding between the disciplines and enhance our abilities to mitigate and adapt to climate change.

Posted in  research methods