Welcome to my Open Notebook

This is an Open Notebook with Selected Content - Delayed. All content is licensed under CC-BY. Find out more Here.

Dr Tom Ford disentangles climate change writing

Climate change and contemporary fiction

My friend Dr Tom Ford blogs about how climate and climate change are entangled in contemporary fiction. For a social scientist he does a really good job of disentangling esoteric and specialist environmental science knowledge from literary waffle and voodoo (sorry, that’s an in-joke… I consider myself a waffly exponent of enviro voodoo too, but I am more than happy to cast aspersions and sling defamatory insults around).

Tom’s bag, as far as I can tell, is to reflect on the creation and development of the literary constructs writers use to talk about climate and climate change. Is this a whole new category of postmodern and existentialist literature?

Posted in  disentangle things


The Organisation of Material

During the course of my research I have repeatedly found the organisation of material to be a challenge (I’m talking about the code, data, text and everything else related to analyses). One of the things I often struggle with is just keeping my thoughts clear and consistent between projects, through weeks and across years. I’ll try to blog about how I have decided to manage my work, and about the tools I have tried and ended up using.

To start with, I’d like to share a passage taken from pages 55-57 of Arthur Koestler’s “The Ghost in the Machine”, 1967, London, Pan Books, with some rephrasing of my own.

“The vexed problem of the ‘organisation of material’; vexed because the different aspects of the problem, the welter of evidence and the welter of interpretations, are all interconnected like threads in a Persian carpet. The author is keenly aware of the pattern they form; but how can he convey that pattern if he has to unpick the threads in order to explain them one at a time? Here the problem of temporal order begins to intrude, although his mind may still be functioning in the partly or wholly non-verbal regions of images and intimations.

At last he arrives at a tentative arrangement of his material, under a series of headings and sub-headings, which he shuffles about as if they were compact building blocks. They are probably each represented by a mere jotted key-word….

…now the time has come for these intentional seeds to start growing into saplings which will branch out into sections, sub-sections, and so on: the selection of evidence to be quoted, of illustrations, comment and anecdotes, each of them necessitating further strategic choices. At each node - branching point - of the growing tree, more details are filled in, until at last the syntactic level is reached, the phrase generating machine takes over, the individual words are lined up - some effortlessly, some after a painful search, and are finally transformed into patterns of contractions of finger muscles guiding a pen: the logos has become incarnate.

But of course the process is never quite as neat and orderly as that; trees do not grow in this rigidly symmetrical way. In our schematised account, the selection of the actual words occurs only at an advanced stage of the process, after the general plan and the ordering of the material have been decided on, and the buds of the tree are ready to burst open in their proper left-to-right order. In reality, however, one branch somewhere in the middle might blossom into words, while others have as yet hardly started to grow. And while it is true that the idea precedes the actual process of verbalisation, it is also true that ideas are often airy nothings until they crystallise into verbal concepts and acquire tangible shape….

Thus our tree progresses with irregular growth and constant oscillations between levels. Transforming thought into language is not a one-way process; the sap flows in both directions, up and down the branches of the tree. The operation is further complicated and sometimes brought to the verge of a breakdown by the author’s deplorable tendency to correct, erase, chop off entire flowering branches from the tree and start growing them afresh”.

Posted in  overview


My Interests

My research interests revolve around making my data analyses easier and more reproducible. I’d like it if every step in the methodical exploration of (or prediction from) data is easily documented, reproduced, transformed, integrated and fun … (if such a thing is possible).

I conceptualise data analyses as grouped networks of the many choices and revisions an analyst makes through a complex workflow in a project. Clusters of these networks make up the many projects, databases and codebases analysts use every day.

In my work as a data manager at a research school, I spend a lot of time linking together datasets to analyse population, health and environmental dimensions. There are complex relationships to be found, but because so many steps are required in such an analysis workflow, there are strong barriers to reproducibility. A key barrier is the difficulty of tracking and documenting the numerous generations of derivative datasets and analyses. I have developed an interest in the software applications used during workflow actions and decision making to document a reproducible module of work, especially in desperately entwined and heroically integrated analyses.
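
To make the tracking problem concrete, here is a minimal sketch in Python of recording which datasets each derivative dataset was built from, and then walking back through the generations. The `Lineage` class, its methods and all of the dataset names are hypothetical, invented for illustration; this is one way the idea could be written down, not a description of any existing tool.

```python
from collections import defaultdict


class Lineage:
    """Record which parent datasets and which step produced each derived dataset."""

    def __init__(self):
        self.parents = defaultdict(list)  # derived name -> list of parent names
        self.steps = {}                   # derived name -> description of the step

    def derive(self, name, parents, step):
        """Register `name` as derived from `parents` by the described `step`."""
        self.parents[name] = list(parents)
        self.steps[name] = step

    def ancestors(self, name):
        """Return every upstream dataset of `name`, nearest generation first."""
        seen = []
        queue = list(self.parents.get(name, []))
        while queue:
            current = queue.pop(0)
            if current not in seen:
                seen.append(current)
                queue.extend(self.parents.get(current, []))
        return seen


# Hypothetical dataset names, for illustration only.
lineage = Lineage()
lineage.derive("daily_weather_by_region", ["station_weather", "region_boundaries"],
               "spatially average station readings within each region")
lineage.derive("analysis_table", ["daily_weather_by_region", "health_outcomes"],
               "join on region and date")

print(lineage.ancestors("analysis_table"))
```

Asking for the ancestors of the final analysis table lists every dataset it depends on, which is the question that gets hard to answer by memory once there are several generations of derivatives.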

My Topics

I am interested in systems analysis, especially of environmental health systems and

  • interventions
  • catastrophes
  • explanations
  • predictions

I also focus on

  • Suicide and Drought in Southern Australia

and Atmospherics, including:

  • pollution events from bushfires, dust storms and aeroallergen peaks
  • average and extreme exposures to temperature, humidity, rainfall… and combinations of these

I also dabble in other environmental health issues such as mosquito diseases and drinking water pathogens.

My Tools

I try to do all my data integrations and analysis in R/Sweave. I am currently planning to investigate the following tools:

  • R/Sweave/make
  • packages
  • python
  • version control (git)
  • graphviz
  • ProjectTemplate
  • workflow apps: Kepler/Taverna/RanalyticFlow/Rgraphviz
  • disentangleThings

My Intentions

In this blog I will write about my experiences as I delve into these new areas and write a PhD about them (and about my research topics, knotted problems to be disentangled as we go). I am not a very adept programmer, but hopefully these tools will help me deal with the integrated analyses I hope to do.

Posted in  overview


About My Research

I am enticed by the work out there at the moment that challenges us to use the tools of open science (open software and open access publication) to make science more reproducible, transparent and awesome. Science, 2 December 2011, Volume 334 (URL here) has a special section on replication and reproducibility, notably Roger Peng’s perspective on the limitations in our ability to evaluate published findings. Reproducibility has the potential to serve as a minimum standard for judging scientific claims when full independent replication of a study is not possible.

At the moment I am keenly aware of barriers to replicability due to the voluminous generation of data and the multitude of analysis workflow decisions that can affect the results of the kind of integrated analyses I am involved with. These choices include: which health outcomes are selected? Which exposure estimates? From which environmental dataset?

How population, health and environmental data are linked together is relevant to our ability to disentangle the complex relationships to be found. I’ve had a difficult time due to multiple revisions of datasets and analysis plans that come from working with multidisciplinary teams of epidemiologists, environmental scientists and biostatisticians. I studied geography and ecology in my undergraduate degree, so I am able to link multiple layers of data together in a Geographical Information System (GIS), but what I think I need is an Integrative Information System (IIS has a nice ring to it).
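
As a rough illustration of what linking these layers involves, the sketch below joins health records to a population denominator and an environmental exposure using region and date as the shared keys. It is only a sketch under stated assumptions: the `link` function, the field names and every value are invented for the example, and real datasets rarely share keys this cleanly.

```python
# Hypothetical records; all names and values are invented for illustration.
population = {"region_a": 12000, "region_b": 3500}
health = [
    {"region": "region_a", "date": "2011-01-01", "admissions": 6},
    {"region": "region_b", "date": "2011-01-01", "admissions": 1},
]
environment = {
    ("region_a", "2011-01-01"): {"max_temp_c": 38.5},
    ("region_b", "2011-01-01"): {"max_temp_c": 35.0},
}


def link(health, population, environment):
    """Attach the population and environmental exposure to each health record."""
    linked = []
    for row in health:
        key = (row["region"], row["date"])
        exposure = environment.get(key)
        if exposure is None:
            continue  # no exposure estimate for this region and day
        rate = 10000 * row["admissions"] / population[row["region"]]
        linked.append({**row, **exposure, "rate_per_10000": rate})
    return linked


linked = link(health, population, environment)
for row in linked:
    print(row)
```

Dropping records that have no matching exposure is itself one of the analysis decisions that needs documenting, since a different choice (for example, filling from a neighbouring region) would change the linked dataset.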

I try to do all my weather/air pollution/health/demography data integrations and analysis using the Reproducible Research Reporting paradigm implemented in R/Sweave. I do this so I can maintain tight control over modelling assumptions or data decisions at any point in the workflow, including multiple versions over the course of an evolving analysis plan. This allows me to ‘drill down’ into parts of the data preparation and analysis many months after the bulk of the work has been done, change key portions in response to the changed requirements of the project, and document the reason for the changes (in case of the inevitable change in requirements, see this xkcd comic strip here).

So in summary, I am interested in tools that enable analysts to deal with the issues of integrated analysis networks, and to track the many choices an analyst makes through the complex web of an analysis workflow, across data, analysis, reporting and archiving activities.

Posted in  overview


About Me

I am a data manager and analyst at NCEPH (http://nceph.anu.edu.au/) at the Australian National University. I am also undertaking a multi-disciplinary PhD on a part-time basis.

My research focus is on linking population, health and environmental data for epidemiological analysis.

I apply integrative techniques to the analysis of environmental health issues such as:

  • Suicide and drought
  • Air pollution, weather and mortality or morbidity
  • Aeroallergens
  • and more.

This blog reflects my work on these various analyses, and the tools and techniques I apply there.
It is called Disentangle Things as both a description of what I try to do (disentangle is subtly different from untangle - more on this later) and also an exhortation.

So let’s Disentangle Things!

Posted in  research methods