Welcome to my Open Notebook

This is an Open Notebook with Selected Content - Delayed. All content is licenced with CC-BY. Find out more Here.

ONS-SCD.png

nectar cloud pumilio build got bogged down

I’ve been trying to build the pumilio bioacoustics server on an Australian Nectar Research Cloud VM, but have hit various roadblocks.

  • built with NeCTAR Ubuntu 12.04.2 (Precise)
  • these issues (especially with apt-get install r-cran-mysql etc.) might not occur with a later Ubuntu?
  • test NeCTAR Ubuntu 12.10 (Quantal) or
  • NeCTAR Ubuntu 13.04 (Raring) ??
  • TODO FIX issues with python audio lab
  • TODO mount the larger storage
  • TODO add swapfile
  • TODO fix R install etc

The code is on GitHub (in the gh-pages branch) and is published as a report at this link. The source code is available to clone or fork at this GitHub repo. Most of the installation and configuration is now documented, to the point where a test sound file has been uploaded successfully.

HELP WANTED

  • Any researcher at an Australian university could follow the instructions on their own Nectar Cloud VM.
  • If anybody out there is interested in bioacoustics, R, Linux or web data archives, please help.

Background to pumilio, bushfm and this project

birdcombined.png

  • Pumilio (http://pumilio.sourceforge.net/) is by Luis J. Villanueva-Rivera and Bryan C. Pijanowski. (2012. Pumilio: A Web-Based Management System for Ecological Recordings. Bulletin of the Ecological Society of America 93: 71-81. doi: 10.1890/0012-9623-93.1.71)
  • bush.fm (http://www.bush.fm/) aims to provide the research community with a portal to national scale acoustic sensor data repositories, and a suite of tools to perform analysis and reporting on these data.

(Images from http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0000M4)

Posted in  cloud building


really-useful-r-upcase-string

Here is a really useful R snippet from http://stackoverflow.com/a/6364905, with a minor modification to allow different split characters.

Code:r-upcase-string

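# Example input: single words plus one string containing both "/" and " "
x <- c("The", "quick", "Brown", "fox/lazy dog")

# Capitalise the first letter of each piece of a single string.
# Pieces are separated by tosplit, which strsplit treats as a regular expression.
simpleCap <- function(x, tosplit = " ") {
  s <- strsplit(x, tosplit)[[1]]
  paste(toupper(substring(s, 1, 1)), substring(s, 2),
        sep = "", collapse = tosplit)
}
# simpleCap handles one string at a time, so apply it over the vector
sapply(x, simpleCap)                 # split on spaces (the default)
sapply(x, simpleCap, tosplit = "/")  # split on "/" instead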
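
The snippet above capitalises after one kind of split at a time. If you want to capitalise after several delimiters in one pass, a regular expression is one option. This is only a sketch and not part of the Stack Overflow answer: the function name capAfterDelims and its delims argument are made up for illustration. It relies on gsub with perl = TRUE, where "\\U" in the replacement upper-cases the text that follows it.

```r
# Hypothetical helper: capitalise the first letter at the start of each
# string and after any of the delimiter characters in delims.
capAfterDelims <- function(x, delims = " /") {
  # delims is pasted into a regex character class, so characters such as
  # "]", "^", "-" or "\" would need escaping; space and slash are safe.
  pattern <- paste0("(^|[", delims, "])([[:alpha:]])")
  # Group 1 (start of string or a delimiter) is kept unchanged;
  # "\\U" upper-cases group 2, the letter that follows it.
  gsub(pattern, "\\1\\U\\2", x, perl = TRUE)
}

x <- c("The", "quick", "Brown", "fox/lazy dog")
capAfterDelims(x)
```

Unlike simpleCap, this is already vectorised, so it does not need sapply.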

Posted in  research methods


pumilio-bushfm-test-dev-prod

Testing the pumilio-bushfm-test-dev-prod build process, in an Open Notebook

Aims:

It was suggested that I could document the pumilio test build as an Open Notebook. I imagined that I could link this blog to a GitHub repo and doco hosted on gh-pages.

Methods:

Results:

  • I did a test build a month ago on an old laptop sitting around, then rebuilt on the Nectar cloud
  • Unfortunately I didn’t realise that the Nectar VM had mounted /var on to the smaller root partition (the 40GB 2nd disk is on /mnt).
  • then when I tried to upload a big sound file it broke :-(
  • I did a bit of reading, and at first I thought I’d just need to move the MySQL datadir by editing the config file:

Code:

sudo nano /etc/mysql/my.cnf

BUT

  • it actually looks like there is a WAV and MP3 file under /var/www/pumilio-2…
  • so I think I can just remount /var onto the larger /mnt secondary disk.

Posted in  cloud building


India Part 1. Society for Natal Effects on Health (SNEHA) Suicide Prevention, Chennai India

Introduction

I was grateful to receive the Bhati Family Travel Grant, which allowed me to travel to Chennai, India in October to meet with experts on drought and suicide there. I wanted to go to India to help with my work on farmer suicide in Australia, primarily because I’d read about the excellent researchers there and also because of the large magnitude of the problem of farmer suicide and drought (it is in the news media quite regularly).

I arrived on 2nd October and the airport and streets were pretty quiet, I think because this is a public holiday in honour of Gandhi.

I had previously arranged to meet

These are my rewritten notes from both meetings.

First Meeting: 3rd October, Dr Lakshmi Vijayakumar

Dr Lakshmi Vijayakumar is a WHO expert, a practicing psychiatrist and founder of the SNEHA suicide prevention organisation.

sneha-ad.jpg

When I contacted her to ask for a meeting she invited me to come to the headquarters of SNEHA in R A Puram, south central Chennai. The Autorickshaw driver took a couple of wrong turns but we got there OK after asking directions. I also met Jyothi the SNEHA Director and several volunteers who work at the centre and befriend people who visit or phone with suicidal problems.

sneha-cropped.jpg

I had asked to visit because I’ve been following her research on farmer suicide prevention measures. I suggested that my research on Drought-specific causality might benefit from insights that Dr Vijayakumar could give me, and thankfully she agreed to give me her time. I was very pleased she mentioned that she found my research “interesting and important” which is great because that is just the way I feel about HER research.

On my way to the meeting I reflected on what I wanted to get out of it. I felt that it was really important to get feedback on some of the directions I want to pursue in terms of a unified theoretical framework across environmental and social causes of suicide (generalised across diverse social situations, like the differences between Australian and Indian farmers).

I also felt it was important to just sit and listen to any advice Dr Vijayakumar felt like giving me, so I asked what she thought were the most important priorities for farmer suicide research in India. I have been reflecting on the influence Dr Vijayakumar’s words had on me when I heard them in a radio report by Australian journalist Michael Condon (Ref), made when he came to India in 2009.

My Secret Agenda

I brought along a secret agenda, which was to get an opinion on my preconceived thoughts about the topic. In particular, I wanted feedback on a generalised framework, based on the work of my PhD supervisors Colin, Phil and Rohan, for disentangling environmental from social impacts on health. I am especially interested in this framework for understanding tipping points in the farmer suicide system. My second secret agenda was that, if the framework I presented was thought OK, I wanted to ask about the best next steps (a plan for the future) in terms of communication in suicide prevention fields, actions in policy arenas and so on.

First: I presented the general framework. The essence of this is the “Five Capitals”:

  • Financial
  • Social
  • Physical
  • Natural
  • Human

Phil and Rohan have developed approaches aimed at Rural Livelihoods and Adaptive Capacity of Farmers. Colin has even extended this to the concept of Human Carrying Capacity. I think this framework is especially useful for tipping points because the causes of such catastrophes are undoubtedly numerous, interacting and dynamic.

So what did Lakshmi think? She started by saying that it made good sense, and that suicide prevention work has mainly focused on the biological, psychological and social domains. But there are so many other factors. Drought was clearly one.

She got out a map of the recent suicide rates and showed me a region of elevated risk north-west of Tamil Nadu, south of Chhattisgarh, and north of Kerala. 60 percent of India’s farmer suicides are in these states. Why are rates high in these 3 states but not in TN? All are semi-arid.

All have similar societies, cultures, climates, agriculture, etc. We tried fitting these to the five capitals framework.

1 dev-table

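| Capital   | Semi-arid suicide hotspot               | Tamil Nadu                  |
|-----------+-----------------------------------------+-----------------------------|
| Financial | move from rice/millet/polyculture       | remained polyculture        |
|           | into cash crops, cotton, chilli         |                             |
|           | with more pesticide, decreased access   |                             |
|           | to local money lender and more reliance |                             |
|           | on international corporations           | retained local lending,     |
|           |                                         | more responsive to seasonal |
|           |                                         | conditions                  |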

Posted in  research methods


What Do Scientists Who Write Metadata Use To Do It? And Why?

  • The extent to which scientists write metadata is probably lower than it ought to be
  • The level of metadata written during science projects is probably best described as “bare minimum”, that is, “the minimum needed for one-self to come back to and understand what one did”
  • It sometimes seems that even the bare minimum for one-self is not being kept very often
  • I argue that the reasons for less-than-adequate metadata can be understood by looking at
  • 1) the culture of the scientist’s disciplinary background, via training
  • 2) the tools available and
  • 3) institutional requirements to produce metadata (both about data or access to data)
  • In my ongoing series of blog posts I am exploring the tools available.
  • In this post I just wanted to start the discussion about discipline culture and institutional requirements.

Discipline Culture

  • I trained in Geography in the age of GIS and this community uses metadata a lot
  • This is due to the prevalence of the digital map (a collection of layers), which is a derivative data output
  • You need to know the source of all the layers
  • The first law of GIS is “garbage in, garbage out”
  • I was trained in the ANZLIC standard from the start
  • ArcGIS has a tool called ArcCatalog which makes metadata easy to create and view

Institutional Requirements

  • The ARC and NHMRC say they are going to require more metadata (and even data deposit)
  • Restrictions on data access make it necessary to describe at least the metadata around provision agreements, licence, allowable access
  • A supportive management level that values metadata as a research output (alongside a peer-reviewed paper, metadata pales in comparison)
  • My old boss used to say “Work Not Published Is Work Not Done”.

This reminds me of Approaches and Barriers to Reproducible Research

  • In 2011 BiostatMatt (Matt Shotwell) published a survey of biostatisticians at the VUMC Dept. of Biostatistics to assess:
  • the prevalence of fully scripted data analyses
  • the prevalence of literate programming practices

To assess the perceived barriers to reproducible research they also asked:

What is the biggest obstacle to always reproducibly scripting your work?

| Barrier                                                   | Staff | Faculty |
|-----------------------------------------------------------+-------+---------|
| No significant obstacles.                                 |     8 |      10 |
| I haven't learned how.                                    |     0 |       0 |
| It takes more time.                                       |     7 |       7 |
| It makes collaboration difficult (eg. file compatibility) |     4 |       2 |
| The software I use doesn't facilitate reproducibility.    |     0 |       0 |
| It's not always necessary for my work to be reproducible. |     2 |       0 |
| Other                                                     |     2 |       1 |
|-----------------------------------------------------------+-------+---------|
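
The staff and faculty groups are different sizes, so the raw counts above are easier to compare as percentages. Here is a minimal R sketch of that arithmetic. The counts are copied from the table; the data frame and column names are my own, and I am assuming each respondent chose exactly one barrier, which I have not checked against the survey report.

```r
# Counts copied from the table above (one row per barrier)
barriers <- data.frame(
  barrier = c("No significant obstacles.",
              "I haven't learned how.",
              "It takes more time.",
              "It makes collaboration difficult (eg. file compatibility)",
              "The software I use doesn't facilitate reproducibility.",
              "It's not always necessary for my work to be reproducible.",
              "Other"),
  staff   = c(8, 0, 7, 4, 0, 2, 2),
  faculty = c(10, 0, 7, 2, 0, 0, 1)
)

# Percentage of responses within each group.
# Assumption: one answer per respondent, so column totals are group sizes.
barriers$staff_pct   <- round(100 * barriers$staff / sum(barriers$staff), 1)
barriers$faculty_pct <- round(100 * barriers$faculty / sum(barriers$faculty), 1)

# Show the most commonly chosen barriers first
barriers[order(-(barriers$staff + barriers$faculty)), ]
```

Under that assumption there are 23 staff and 20 faculty responses, and the two most common answers in both groups are "No significant obstacles." and "It takes more time."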

So what about the Approaches and Barriers to Me Writing Metadata?

With a sample size of one I asked myself these questions:

| Q                                                  | A                                                                    |
|----------------------------------------------------+----------------------------------------------------------------------|
| Do I fully document data (to a metadata standard?) | Occasionally, using DDI for high value raw inputs and final products |
| Do I employ data documentation practices           | I use a tool I created to write minimal metadata occasionally        |
| What are the main barriers?                        | takes more time, software doesn't facilitate, not always necessary   |

Conclusions

  • The tools need to help write metadata
  • The institution needs to require metadata

References

  • Shotwell, M.S. and Alvarez, J.M. 2011. Approaches and Barriers to Reproducible Practices in Biostatistics. http://biostatmatt.com/uploads/shotwell-interface-2011.pdf

Posted in  Data Documentation