Welcome to my Open Notebook

This is an Open Notebook with Selected Content - Delayed. All content is licenced with CC-BY. Find out more Here.


Time series regression in R course codes for Chicago Heatwave

I am giving a talk and R demonstration tomorrow for the Epidemiology Special Interest Group (EPISIG) on Time series regression in R for heat and heatwave analysis.

I hope some of the audience will bring their laptop and use install.packages('dlnm') to follow along.

Below are links to the training resources I will use:

Distributed Lag Non-Linear Models (DLNM) package + recent “TSCOURSE”:

Codes for today’s talk

Posted in  training

Climate Change policy inaction threatens lives

Today we are launching a new Australian climate change and health paper that I'm involved with.

Zhang Y, Beggs PJ, Bambrick H, Berry HL, Linnenluecke MK, Trueck S,
Alders R, Bi P, Boylan SM, Green D, Guo Y, Hanigan IC, Hanna EG, Malik
A, Morgan GG, Stevenson M, Tong S, Watts N, Capon AG. The MJA–Lancet
Countdown on health and climate change: Australian policy inaction
threatens lives. The Medical Journal of Australia
2018;209(11):1.e1- 1.e21. doi: 10.5694/mja18.00789.

Also released is the related global edition, the Lancet Countdown 2018 Report.

I suspect that the "Australian policy inaction threatens lives" theme may reappear under a number of guises. Still, at least we are on the bandwagon!

Our team amassed data on a large range of indicators that we put together to show the current impacts of climate change, from deaths due to human-made air pollution, lethal extreme weather events and infectious mosquito-borne diseases to public awareness of climate change issues and policies. We intend to revisit these indicators each year and track the ongoing developments of these emerging public health issues.

Launch events are being held today at Sydney University and tomorrow (Fri 30th, 3pm) in Canberra; I'm going to the Canberra launch. Our Ministers Fitzharris and Rattenbury will be there.

It is too late to RSVP for the Sydney event but RSVP by 29 November to Christian Dent at rattenbury@act.gov.au for the Canberra event.

Time: 3pm, 30 November 2018, followed by a reception
Venue: Exhibition Room, Legislative Assembly Building, Canberra

Posted in  climate change air pollution policy relevant research

IUCN Red List Ecosystems Framework Applicability To Health Impact Assessments

A couple of years ago I was fortunate to work on an IUCN Red List of Ecosystems project.

  • Guru, S., Hanigan, I. C., Nguyen, H. A., Burns, E., Stein, J., Blanchard, W., … Clancy, T. (2016). Development of a cloud-based platform for reproducible science: A case study of an IUCN Red List of Ecosystems Assessment. Ecological Informatics, 36, 221–230. http://doi.org/10.1016/j.ecoinf.2016.08.003

This is based on a well thought through framework:

  • The original exposition for IUCN ecosystem risk assessments was set out in Keith, D. A., Rodríguez, J. P., Rodríguez-Clark, K. M., Nicholson, E., Aapala, K., Alonso, A., … Zambrano-Martínez, S. (2013). Scientific Foundations for an IUCN Red List of Ecosystems. PLoS ONE, 8(5). http://doi.org/10.1371/journal.pone.0062111
  • There are associated guidelines for a Risk Assessment outlined in Rodriguez, J. P., Keith, D. A., Rodriguez-Clark, K. M., Murray, N. J., Nicholson, E., Regan, T. J., … Wit, P. (2015). A practical guide to the application of the IUCN Red List of Ecosystems criteria S2. Philosophical Transactions of the Royal Society B: Biological Sciences, 370(1662), 20140003–20140003. http://doi.org/10.1098/rstb.2014.0003

I have recently been reminded in several instances of the overlap with Health Impact Assessments, especially in complex eco-social contexts, and when stakes are high (so if one risk factor is critical the entire assessment gets that level of risk) and where data are not available (the use of the Data Deficient category).

It is worth thinking of when we can't achieve the data quality, dose-response and/or baseline prevalence required for Burden of Disease methodological 'purity'.

The options in addition to the main criteria (Critically Endangered -> Vulnerable -> Least Concern) are: Data Deficient, Not Evaluated and Near Threatened (scoring close to the threshold). Overall threat status is the highest level of risk on any of the 5 criteria. We ended up using DD a lot, but still got pretty good assessments.
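The "highest level of risk on any criterion" rule is easy to sketch in R. This is a minimal illustration and not the code from our assessment: the function name overall_status, the ordering vector and the example categories are made up for the demonstration, and I am assuming that Data Deficient and Not Evaluated criteria are set aside unless nothing else could be assessed.

```r
# Risk categories ordered from lowest to highest risk
risk_levels <- c("Least Concern", "Near Threatened", "Vulnerable",
                 "Endangered", "Critically Endangered")

# criteria: named character vector with one category per criterion (A to E).
# Assumption: "Data Deficient" and "Not Evaluated" do not contribute to the
# overall status unless no criterion could be assessed at all.
overall_status <- function(criteria) {
  assessed <- criteria[!criteria %in% c("Data Deficient", "Not Evaluated")]
  if (length(assessed) == 0) {
    if (any(criteria == "Data Deficient")) return("Data Deficient")
    return("Not Evaluated")
  }
  unknown <- setdiff(assessed, risk_levels)
  if (length(unknown) > 0) {
    stop("Unknown category: ", paste(unknown, collapse = ", "))
  }
  # An ordered factor lets max() pick the highest risk category
  ranked <- factor(assessed, levels = risk_levels, ordered = TRUE)
  as.character(max(ranked))
}

# Hypothetical assessment, invented for illustration only
example <- c(A = "Data Deficient", B = "Vulnerable", C = "Least Concern",
             D = "Endangered", E = "Data Deficient")
overall_status(example)
```

With this made-up input the two Data Deficient criteria are ignored and the overall status is driven by Criterion D.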

A final note: the work I did led me to re-order the sequence of steps from the implied order of the Criteria A, B, C, D and E.

In practice it makes more sense to start at Criterion D 'the Primary Biotic Variable', because defining this at the outset means that there are many fewer potential variables to actually include (because ecosystems are big and complicated things with many possible biotic variables of interest). This gives the whole assessment a focused and targeted feel.

So in my work I did this sequence:

Criterion D The primary biotic variable

Do this first! This allows the scope of the assessment to be defined around a particular set of key biological and abiotic attributes of the ecosystem.

This requires an assessment for three time periods of the response of the principal biotic variable:

  • D1 computes the disruption over the past 50 years
  • D2 describes the projected disruption of the next 50 years
  • D3 computes the disruption since 1750

Criterion E The probability of ecosystem collapse

Doing this second means that if there is any indication that this ecosystem is in peril then you can immediately classify it as "Critical" and not worry about doing any of the other steps. Just get on with the job of saving it!

Criterion E is an overarching analysis of the impacts of biotic variables on the probability of ecosystem collapse within 50–100 years. We ran simulations to investigate the future resource

Criterion A Decline in ecosystem distribution

This is also done at three time periods:

  • A1) Decline in distribution current (past 50 years)
  • A2) Future decline over (a) the next 50 years or (b) any 50-year period that includes the present and future
  • A3) Historic decline (decline since 1750)

Don't be afraid to use the "Data Deficient" category here!

Criterion B Restricted distribution

  • B1 extent (convex hull around observed occurrences) AND ANY a) decline, b) threat, OR c) small N "locations"
  • B2 area of occupancy AND any a-c (assessed within 10 × 10 km grid cells)
  • B3 small N locations AND threats in short term (e.g. in the case of a mega landscape fire we have: burnt and unburnt = 2 locations)

NB "Locations" are those discrete areas of the ecosystem that would be affected by the most pervasive plausible threat (e.g. wildfire).
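The convex hull in B1 can be sketched with base R alone. This is only an illustration under assumptions: the occurrence coordinates are invented, they are treated as projected coordinates in kilometres, and polygon_area is a hypothetical helper using the shoelace formula. A real assessment would use a proper spatial package and a suitable projection.

```r
# Hypothetical occurrence records (projected coordinates in km), invented
# for illustration: four corner points plus two interior points
occ <- data.frame(x = c(0, 10, 10, 0, 5, 3),
                  y = c(0, 0, 10, 10, 5, 7))

# chull() returns the row indices of the hull vertices in clockwise order
hull_idx <- grDevices::chull(occ$x, occ$y)
hull <- occ[hull_idx, ]

# Shoelace formula for the area of a simple polygon;
# abs() makes the result independent of vertex direction
polygon_area <- function(x, y) {
  n <- length(x)
  j <- c(2:n, 1)  # index of the next vertex, wrapping around
  abs(sum(x * y[j] - x[j] * y)) / 2
}

# Extent as the area of the convex hull, in km^2
extent_km2 <- polygon_area(hull$x, hull$y)
extent_km2
```

The two interior points do not change the hull, so the extent here is just the area of the 10 km by 10 km square.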

Criterion C Decline in abiotic processes

The most important abiotic variables are temperature and precipitation; these and other climate-related variables were used for Criterion C.

  • C1 past 50 years
  • C2 next 50 years (IPCC)
  • C3 since 1750

Posted in  policy relevant research

The Australian Air National Environment Protection Measure (NEPM) is about protecting human health

In Australia the Air Quality NEPM https://www.legislation.gov.au/Details/C2004H03935 is a measure "to be implemented by the laws and other arrangements participating jurisdictions consider necessary". Performance against the Air NEPM is assessed at compliance stations located at sites representative of air quality likely to be experienced by the general population. Australia has had national standards and goals for ambient air quality since 1998 (although the harmful fine particulates (PM2.5) were added in the 2016 revisions to the Air NEPM, to be reported on annually from June 2018). The Air NEPM mandates a consistent approach to air quality monitoring, which has been applied by all states and territories, but, recognising the different legislative arrangements in each jurisdiction, does not dictate the means to be applied to achieve the goals. Performance against the standards and goals is published annually.

In a recent unsuccessful grant application of mine, a reviewer commented that I had a lack of understanding of the process of the development of the standards and asserted:

"1) the criteria are not health-based as such, and the regulatory
policy decisions are not all about minimising health impact"


"2) the NEPM for ambient air quality only provides for ambient air
quality that allows for the "adequate" protection of human health
and well-being"

This reviewer makes the assertion that "adequate" protection of health is not the same as "minimising health impact". I guess the argument is that an "adequate" minimum level of health impact should consider a balance between the costs of expenditure on continued reductions in emissions against the benefits to health.

So I think that the criteria ARE health-based, and that the NEPM IS about minimising human health impacts (with the caveat that this is in the context of an economic cost-benefit assessment). For example, in the US, Colorado continually strives to reduce air pollutant emissions in ways that ensure public health and environmental protections, while maintaining a vibrant economy.

Such a cost-benefit analysis was included in my grant application, and I argue that this is vital to equip state Environmental Protection Authorities (EPA) with metrics to inform interventions. Such interventions are needed as Australian governments try to achieve the National Clean Air Agreement (made between the Commonwealth and each state and territory jurisdiction) which aims to implement strengthened laws that move to even tighter standards in 2025.

Posted in  air pollution policy relevant research

Filling empty cells in columns with R

Happy new year!

I have had a bit of a break and caught up with some of the older blog posts I did not find time for earlier in 2017.

This one caught my eye: http://varianceexplained.org/r/start-blog/ especially the line 'When you’ve given the same advice 3 times, write a blog post'. So I thought I'd write a blog post about a bit of R code advice. (Disclaimer: This wasn't originally my work, but I used this advice more than 3 times, so am sharing here. This is a function I got from my colleague Phil TnT).

In some cases I get data in a table that simplifies the presentation by only labelling one row in a set of rows like this:

   person        fruit    suburb something
      Tom      oranges   Scullin       3.0
                apples                 6.0
                 pears                 9.0
              tim tams                 2.0
 Gertrude       durian Charnwood       3.7
          dragon fruit                 7.0
               lychees                 4.9
             pineapple               100.9
                apples                98.0
Pennelope      cashews   Higgins       2.0
             beer nuts                 5.6
              Pringles                 4.0

To use this like tidy data we need to fill these intervening cells.

fill.col <- function(x, col.name) {
  # rows where the column is labelled (not blank)
  s <- which(x[[col.name]] != "")
  item <- x[[col.name]][s]
  hold <- vector('list', length(item))
  # repeat each label down to the row before the next label (or the last row)
  for (i in seq_along(hold)) {
    hold[[i]] <- rep(item[i], ifelse(is.na(s[i+1]), dim(x)[1] + 1, s[i+1]) - s[i])
  }
  x[[col.name]] <- unlist(hold)
  x
}

d <- fill.col(d, 'person')
fill.col(d, 'suburb')

      person        fruit    suburb something
1        Tom      oranges   Scullin       3.0
2        Tom       apples   Scullin       6.0
3        Tom        pears   Scullin       9.0
4        Tom     tim tams   Scullin       2.0
5   Gertrude       durian Charnwood       3.7
6   Gertrude dragon fruit Charnwood       7.0
7   Gertrude      lychees Charnwood       4.9
8   Gertrude    pineapple Charnwood     100.9
9   Gertrude       apples Charnwood      98.0
10 Pennelope      cashews   Higgins       2.0
11 Pennelope    beer nuts   Higgins       5.6
12 Pennelope     Pringles   Higgins       4.0
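For anyone who wants to try this end to end, here is a self-contained sketch that rebuilds the example table and fills it without a loop. It is not Phil's function: fill_down is a hypothetical name for a variant I am assuming behaves the same on this data, and I am also assuming the blank cells arrive as empty strings (or NA) and that the first row is always labelled.

```r
# Rebuild the example table; blank cells are assumed to be empty strings
d <- data.frame(
  person = c("Tom", "", "", "", "Gertrude", "", "", "", "", "Pennelope", "", ""),
  fruit = c("oranges", "apples", "pears", "tim tams", "durian", "dragon fruit",
            "lychees", "pineapple", "apples", "cashews", "beer nuts", "Pringles"),
  suburb = c("Scullin", "", "", "", "Charnwood", "", "", "", "", "Higgins", "", ""),
  something = c(3, 6, 9, 2, 3.7, 7, 4.9, 100.9, 98, 2, 5.6, 4),
  stringsAsFactors = FALSE
)

# Hypothetical vectorised variant: carry each label down to the next label
fill_down <- function(x, col.name) {
  v <- x[[col.name]]
  labelled <- !is.na(v) & v != ""
  if (!labelled[1]) stop("the first row of '", col.name, "' must be labelled")
  # cumsum(labelled) counts labels seen so far, so it indexes the right label
  x[[col.name]] <- v[labelled][cumsum(labelled)]
  x
}

d_filled <- fill_down(fill_down(d, "person"), "suburb")
d_filled
```

The trick is that cumsum() over the "is labelled" flags gives, for every row, the position of the most recent label. If you already use the tidyverse, tidyr::fill() does a similar job, but it fills NA values, so blank strings would need converting to NA first.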

Posted in  disentangle