Welcome to my Open Notebook

This is an Open Notebook with Selected Content - Delayed. All content is licenced with CC-BY. Find out more Here.


My Newnode R Function Useful For Causal Directed Acyclic Graphs (DAGs)

Aims

I have worked on a function that turns a data.frame into Graphviz code in the DOT language, with some of my preferred settings. I realised that it might be useful for causal directed acyclic graphs.

Causal diagrams are useful for conceptualising the pathways of cause and effect. These diagrams are sometimes simply informal pictures, but they have also been developed in a more formal way for use in modelling. These formal developments use concepts derived from the mathematical abstraction of graphs (fundamentally, graphs are networks of linked ‘nodes’, with the links being termed ‘edges’). Causal diagrams can be constructed to depict one of two things. The first is feedback loops (a vexatious property of complex systems that confounds modelling). The second is simpler chain-of-events pathways, which proceed from an upstream cause to a downstream effect in a single direction without cycles; these are called ‘Directed Acyclic Graphs’ or DAGs. Loop diagrams are outside the scope of this blog post because DAGs are much more easily addressed by the tool that I am describing.
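To make the "without cycles" part concrete, here is a minimal base R sketch that stores a diagram as a list of directed edges and checks whether it is acyclic. The function name is_dag and the example variables are made up for illustration; the check is a simple version of Kahn's algorithm and is not part of the disentangle package.

```r
# Edge list for a small causal diagram: each row is one directed edge.
# The variable names are illustrative only.
edges <- data.frame(
  from = c("Age", "Age", "Retained Placenta", "Metritis"),
  to   = c("Retained Placenta", "Fertility", "Metritis", "Fertility"),
  stringsAsFactors = FALSE
)

# Kahn's algorithm: repeatedly remove nodes that have no incoming edges.
# If every node can be removed this way, the graph has no cycles.
is_dag <- function(edges) {
  nodes <- unique(c(edges$from, edges$to))
  remaining <- edges
  while (length(nodes) > 0) {
    # nodes with no incoming edge among the remaining edges
    sources <- setdiff(nodes, remaining$to)
    if (length(sources) == 0) {
      return(FALSE)  # every remaining node has a parent, so there is a cycle
    }
    nodes <- setdiff(nodes, sources)
    remaining <- remaining[!(remaining$from %in% sources), , drop = FALSE]
  }
  TRUE
}

is_dag(edges)

# Adding a feedback loop (Fertility -> Age) breaks the acyclic property.
looped <- rbind(edges, data.frame(from = "Fertility", to = "Age",
                                  stringsAsFactors = FALSE))
is_dag(looped)
```

The second call is the kind of loop diagram that this post leaves out of scope.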

To begin, I am going to build on this other guy’s blog post on causal DAGs with R (http://donlelek.github.io/2015/03/31/dags-with-r/). I wanted to add an interface for building these.

Some background to the concepts that I use is provided in the references below.

Materials and Methods

The DiagrammeR package, which has been integrated within RStudio, has made access to the graphing tool Graphviz much easier than it used to be. My function causal_dag (available in my disentangle GitHub package) essentially constructs the required nodes and edges for that package to use. Optionally we can also include labels to indicate the direction of the effect.

To use the tool, all you need to do is create a list of edges and their associated input nodes and output nodes (each as a string of comma separated values), as shown in the picture below.

causal-ssheet.png
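If you would rather not keep a spreadsheet, the same table can be typed straight into R. The sketch below is an assumption about what the sheet contains: the column names are the ones passed to newnode() in this post, and the rows are inferred from the DOT output in the Appendix. The helper split_nodes is hypothetical and only shows how a comma separated cell breaks into node names.

```r
# A data.frame equivalent of the spreadsheet, typed in directly.
# The rows are inferred from the Appendix output, so treat them as a guess.
causes <- data.frame(
  edges = c("Fertility effects", "Cistic Ovarian effects",
            "Metritis effects", "Retained Placenta effects"),
  inputs = c("Metritis, Cistic Ovarian Disease, Age",
             "Metritis, Retained Placenta, Age",
             "Retained Placenta",
             "Age"),
  outputs = c("Fertility", "Cistic Ovarian Disease",
              "Metritis", "Retained Placenta"),
  stringsAsFactors = FALSE
)

# Split a comma separated cell into a vector of trimmed node names.
split_nodes <- function(x) trimws(strsplit(x, ",", fixed = TRUE)[[1]])

split_nodes(causes$inputs[1])
```

Each row is one causal process (an edge), with everything feeding into it on the left and everything it affects on the right.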

Code:

# read in the sheet
library(disentangle)
library(stringr)
causes <- readxl::read_excel("causal-ssheet.xlsx")
causes
nodes <- newnode(causes, "edges", "inputs", "outputs")
cat(nodes)
# The result is a formatted graph in the DOT language with some of my
# preferred settings, such as edges showing as 'records' and a spot to
# write a description or include literature about each process

  • See the DOT code in the Appendix.
  • DiagrammeR can now take this text string R object and render the graph to SVG.
  • I think it does not do PNG or PDF though, so I still use Graphviz and dot directly.
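For readers who want to see roughly what happens between the data.frame and the DOT string, here is a simplified stand-in written in base R. It is not the newnode() source from disentangle: the name edges_to_dot is hypothetical, and the only thing it tries to copy is the layout of the Appendix output (inputs point at a record-shaped process node, which points at its outputs), written with DOT's real braces.

```r
# Hypothetical stand-in for newnode(); NOT the disentangle implementation.
edges_to_dot <- function(df, edge_col, in_col, out_col) {
  split_nodes <- function(x) trimws(strsplit(x, ",", fixed = TRUE)[[1]])
  blocks <- vapply(seq_len(nrow(df)), function(i) {
    edge <- df[[edge_col]][i]
    ins  <- split_nodes(df[[in_col]][i])
    outs <- split_nodes(df[[out_col]][i])
    paste(c(
      # one arrow from every input node to the process
      sprintf('"%s" -> "%s"', ins, edge),
      # the process itself is a record with an empty description slot
      sprintf('"%s" [ shape=record, label="{{ { Name | Description } | { %s |  } }}"]',
              edge, edge),
      # one arrow from the process to every output node
      sprintf('"%s" -> "%s"', edge, outs)
    ), collapse = "\n")
  }, character(1))
  paste0("digraph transformations {\n",
         paste(blocks, collapse = "\n\n"),
         "\n}\n")
}

# A one-row toy table, using names that appear in the Appendix
toy <- data.frame(
  edges = "Metritis effects",
  inputs = "Retained Placenta",
  outputs = "Metritis",
  stringsAsFactors = FALSE
)
dot <- edges_to_dot(toy, "edges", "inputs", "outputs")
cat(dot)
```

The string printed by cat(dot) is plain DOT, so it can be handed to grViz() or saved to a .dot file in the same way as the output of newnode().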

Code:

# grViz() comes from the DiagrammeR package
library(DiagrammeR)
grViz(nodes)

# But I also use graphviz directly to produce a publishable image in
# pdf or png
sink("reproduce-donlelek.dot")
cat(nodes)
sink()
# If graphviz is installed and on linux call it with a shell command
#system("dot -Tpdf reproduce-donlelek.dot -o reproduce-donlelek.pdf")
system("dot -Tpng reproduce-donlelek.dot -o reproduce-donlelek.png")
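The shell call above assumes Linux with dot on the PATH. A slightly more defensive variant, sketched below, writes the file with writeLines() and only calls Graphviz if Sys.which() can find it. The helper name render_dot and the temporary file name are assumptions for illustration; Sys.which() and system2() are base R.

```r
# Write a DOT string to disk and render it only if the Graphviz `dot`
# binary can be found on the PATH. The helper name is hypothetical.
render_dot <- function(dot_text, stem, format = "png") {
  dot_file <- paste0(stem, ".dot")
  out_file <- paste0(stem, ".", format)
  writeLines(dot_text, dot_file)  # replaces the sink() / cat() / sink() steps
  dot_bin <- unname(Sys.which("dot"))
  if (!nzchar(dot_bin)) {
    message("Graphviz 'dot' not found; wrote ", dot_file, " only")
    return(invisible(dot_file))
  }
  # system2() takes the arguments as a vector; shQuote() protects
  # file paths that contain spaces
  status <- system2(dot_bin, c(paste0("-T", format), shQuote(dot_file),
                               "-o", shQuote(out_file)))
  if (status != 0) stop("dot exited with status ", status)
  invisible(out_file)
}

example_dot <- 'digraph g { "Age" -> "Retained Placenta" }'
result <- render_dot(example_dot, file.path(tempdir(), "example-dag"))
```

Passing format = "pdf" covers the commented-out PDF line in the same way.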

Results

Here I have reproduced the work of donlelek.

reproduce-donlelek.png

Future directions

  • I’d like to make the edges implicit, so that the spreadsheet keeps track of the information about the causal process, but the graph just shows the lines connecting the nodes.
  • The edges are where the action is, so I need to add a direction of effect. This would be in a label column and added in a [ label = "abc" ] tag for each edge.
  • Set the rankdir option to LR so that the graph runs left to right, which seems more the norm for causal DAGs.
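As a rough idea of what these three changes could look like together, here is a hypothetical base R sketch. The function dag_to_dot, the label column and the "+" labels are all made up for illustration, and none of this exists in disentangle yet.

```r
# Hypothetical sketch of the future directions; not part of disentangle.
# Each row still describes one causal process, but the graph only shows
# direct input -> output arrows, optionally labelled with the direction
# of effect, and laid out left to right.
dag_to_dot <- function(df, in_col, out_col, label_col = NULL, rankdir = "LR") {
  split_nodes <- function(x) trimws(strsplit(x, ",", fixed = TRUE)[[1]])
  lines <- unlist(lapply(seq_len(nrow(df)), function(i) {
    ins  <- split_nodes(df[[in_col]][i])
    outs <- split_nodes(df[[out_col]][i])
    # every input of a process is linked to every output of that process
    pairs <- expand.grid(from = ins, to = outs, stringsAsFactors = FALSE)
    label <- if (is.null(label_col)) {
      ""
    } else {
      sprintf(' [ label = "%s" ]', df[[label_col]][i])
    }
    sprintf('"%s" -> "%s"%s', pairs$from, pairs$to, label)
  }))
  paste0("digraph transformations {\nrankdir = ", rankdir, "\n",
         paste(lines, collapse = "\n"), "\n}\n")
}

# Toy table; the "+" labels are placeholders, not findings
toy <- data.frame(
  inputs = c("Retained Placenta", "Age"),
  outputs = c("Metritis", "Retained Placenta"),
  label = c("+", "+"),
  stringsAsFactors = FALSE
)
out <- dag_to_dot(toy, "inputs", "outputs", label_col = "label")
cat(out)
```

The description of each process would stay in the spreadsheet and simply not be drawn.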

References

Greenland, S., Pearl, J., & Robins, J. M. (1999). Causal diagrams for
epidemiologic research. Epidemiology (Cambridge, Mass.), 10(1),
37–48. doi:10.1097/00001648-199901000-00008
 
Reid, C. E., Snowden, J. M., Kontgis, C., & Tager, I. B. (2012). The
role of ambient ozone in epidemiologic studies of heat-related
mortality. Environmental Health Perspectives, 120(12),
1627–30. doi:10.1289/ehp.1205251
   
Newell, B., & Wasson, R. (2001). Social System vs Solar System: Why
Policy Makers Need History. In: Conflict and Cooperation related to
International Water Resources : Historical Perspectives. In World
Water (Vol. 2002).

Appendix

Code:

#####################################################################
# The following output is automatically created by newnode()
#####################################################################
digraph transformations {

"Metritis" -> "Fertility effects"
"Cistic Ovarian Disease" -> "Fertility effects"
"Age" -> "Fertility effects"
"Fertility effects"  [ shape=record, label="{{ { Name | Description } | { Fertility effects |  } }}"]
"Fertility effects" -> "Fertility"


"Metritis" -> "Cistic Ovarian effects"
"Retained Placenta" -> "Cistic Ovarian effects"
"Age" -> "Cistic Ovarian effects"
"Cistic Ovarian effects"  [ shape=record, label="{{ { Name | Description } | { Cistic Ovarian effects |  } }}"]
"Cistic Ovarian effects" -> "Cistic Ovarian Disease"


"Retained Placenta" -> "Metritis effects"
"Metritis effects"  [ shape=record, label="{{ { Name | Description } | { Metritis effects |  } }}"]
"Metritis effects" -> "Metritis"


"Age" -> "Retained Placenta effects"
"Retained Placenta effects"  [ shape=record, label="{{ { Name | Description } | { Retained Placenta effects |  } }}"]
"Retained Placenta effects" -> "Retained Placenta"


}

Posted in  disentangle

