Welcome to my Open Notebook

This is an Open Notebook with Selected Content - Delayed. All content is licenced with CC-BY. Find out more Here.

Australian Postal Areas In Geographical Vs Projected Coordinates

Notes on an old project, “POA2006_centroids transformations 2010-08-06”:
https://alliance.anu.edu.au/access/content/group/bf77d6fc-d1e1-401c-806a-25fbe06a82d0/PostGIS%20wiki%20files/POA2006_centroid/POA2006_centroids_transformations_doc.html
That project compared population-weighted centroids with geographic POA centroids.
Postal Areas (POA) are not the same as postcodes! See this fact sheet.
But my old work gives the total area as 695.6 square decimal degrees, because it was calculated in geographic (longitude/latitude) coordinates.
This compares with the area of Australia at 7.692 million km².
So this update uses the Geoscience Australia Lambert projection (EPSG:3112, units in metres) to avoid distorting the area:
http://spatialreference.org/ref/epsg/3112/
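To see why a total in square degrees is hard to interpret, the sketch below estimates the ground area of a one-degree cell at two Australian latitudes. It is only an illustration: it assumes a spherical Earth with a mean radius of 6371 km (not the ellipsoid behind EPSG:3112), and the function name cell_area_km2 is made up for this note. The last line divides the two totals quoted above, which is rough because the POA coverage and the national land area need not match exactly.

```r
# Approximate ground area (km^2) of a cell bounded by two parallels and
# spanning dlon_deg degrees of longitude.
# ASSUMPTION: spherical Earth, mean radius 6371 km (hypothetical helper).
cell_area_km2 <- function(lat_south, lat_north, dlon_deg = 1, radius_km = 6371) {
  to_rad <- pi / 180
  radius_km^2 * (dlon_deg * to_rad) *
    abs(sin(lat_north * to_rad) - sin(lat_south * to_rad))
}

north <- cell_area_km2(-11, -10)  # roughly the latitude of Cape York
south <- cell_area_km2(-44, -43)  # roughly the latitude of southern Tasmania
round(c(north = north, south = south))

# Implied average km^2 per square degree from the two totals in the notes
km2_per_sq_degree <- 7.692e6 / 695.6
round(km2_per_sq_degree)
```

The northern cell covers noticeably more ground than the southern one, so a single total in square degrees cannot be converted to km² with one fixed factor.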

Download

auspoa06_geocentroids_lambert_20160624.csv

# name: poa06-area-lambert
setwd("~/projects/POA_centroids/POA2006_centroids")
library(swishdbtools)
ch <- connect2postgres2("delphe")

# Transform to EPSG:3112 (Geoscience Australia Lambert, units in metres)
# before computing areas and centroids; m^2 / 1e6 gives km^2.
fout_geo <- dbGetQuery(ch,
'select poa_2006,
  st_area(st_transform(the_geom, 3112))/1000000 as geoscience_australia_lambert_area_km2,
  st_x(st_centroid(st_transform(the_geom, 3112))) as geocentx,
  st_y(st_centroid(st_transform(the_geom, 3112))) as geocenty
from abs_poa.auspoa06')
str(fout_geo)
# Total area in km^2 (PostgreSQL folds unquoted column aliases to lower case)
sum(fout_geo$geoscience_australia_lambert_area_km2)
write.table(fout_geo, 'data_derived/auspoa06_geocentroids_lambert_20160624.csv',
            row.names = FALSE, sep = ',')

plot(fout_geo[,3:4])
head(fout_geo)
nrow(fout_geo)
# 2507
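
The old project mentioned in the notes compared these geographic centroids with population-weighted ones. As a rough sketch of the difference, the example below uses three invented points inside one postal area. The blocks data frame and its values are hypothetical, and the unweighted mean of the points only stands in for a true polygon centroid such as the one st_centroid returns.

```r
# HYPOTHETICAL sub-units (e.g. census blocks) inside one postal area,
# with projected coordinates in metres and a population count each.
blocks <- data.frame(
  x   = c(0, 1000, 2000),
  y   = c(0, 0, 3000),
  pop = c(10, 30, 60)
)

# Unweighted mean of the points (a stand-in for the geometric centroid)
geo_centroid <- c(x = mean(blocks$x), y = mean(blocks$y))

# Population-weighted centroid: each point is weighted by its population
pop_centroid <- c(x = weighted.mean(blocks$x, blocks$pop),
                  y = weighted.mean(blocks$y, blocks$pop))

rbind(geo_centroid, pop_centroid)
```

The weighted centroid is pulled towards the most populous point, which is why the two kinds of centroid can differ a lot in large, sparsely settled postal areas.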

Posted in spatial

24 Jun 2016

spatial-lag-and-timeseries-model-with-nmmaps-UPDATE

Re my old post: /2013/10/spatial-lag-and-timeseries-model-with-nmmaps
Today I updated my repo: it used to cover only spatiotemporal regression and now also covers multilevel (aka mixed-effects/random-effects) models.
The new site is based on the minimal theme by orderedlist: http://ivanhanigan.github.io/spatiotemporal-multilevel-models
That also means I’ve had to move some of my old code to the new location:
http://ivanhanigan.github.io/spatiotemporal-multilevel-models/spatiotemporal-multilevel-models.html

Posted in spatial dependence

22 May 2016

Judging the evidence part 2

I previously reported on a lecture slide deck called ‘Judging the Evidence’ by Adrian Sleigh for a course PUBH7001 Introduction to Epidemiology, April 30, 2001. http://ivanhanigan.github.com/2015/11/judging-the-evidence-using-a-literature-review-database

I have also now extracted several slides into a template outline for reviewing epidemiological and other research.

Adrian Sleigh’s Protocol

Object of Study, Hypotheses or Research Questions

Purpose of Study: Objectives of study; why was it done?
Reference Population:
- To whom do authors generalize results?
- To whom should the findings be generalized?

Sampling

From the Reference Pop (target population) ->

Source Pop -> Eligible population

“The source population may be defined directly, as a matter of defining its membership criteria; or the definition may be indirect, as the catchment population of a defined way of identifying cases of the illness. The catchment population is, at any given time, the totality of those in the ‘were-would’ state of: were the illness now to occur, it would be ‘caught’ by that case identification scheme.”
Source: Miettinen OS, 2007 http://www.teachepi.org/documents/courses/fundamentals/Pai_Lecture6_Selection%20bias.pdf

Sample Pop:

Refusals, Dropouts
Participants -> Study Pop

Design of study

Study setting: Where and when was the study done? What were the circumstances? Ethics?
Type of study: Experimental vs natural, descriptive vs analytical (trial, cohort, case-control, prevalence, ecological, case-report, etc). If case-control or cohort, was the timing of data collection retrospective or prospective?
Subjects: Who (number, age, sex, etc.)? How were they selected?
Comparison groups: What control group or standard of comparison? How appropriate?
Study size: Was the sample size adequate to give you confidence in the finding of “no association”?
Bias and Confounding
- a) Selection bias: Were groups comparable for subjects who entered and stayed in study? Selection influenced by exposure (c-c) or effect (cohort) under study? Drop-outs?
- b) Confounding: Control of potential confounding variables in design of the study - matching or subject restriction?

Observations

Procedure: How are the variables in the study defined and measured, i.e. how were data collected?
Definition of terms: Are definitions of diagnostic criteria, measurements and outcome unambiguous? Could they be reproduced?
Bias and Confounding
- a) Observation bias: Were study groups comparable for measurements or mode of observation? Mis-classification in determining exposure or disease categories? Differential between groups, or ‘random’?
- b) Confounding: Information recorded on variables that could confound the association under study (to permit adjustment in the analysis)?

THANKS Prof Sleigh!

Posted in bibliometrics and literature reviewing

10 May 2016

R base graphics are fine except barplot

I concur with Jeff Leek that once you have spent time learning base graphics in R there is less incentive to learn ggplot2: http://simplystatistics.org/2016/02/11/why-i-dont-use-ggplot2/

However I always hate the way barplot works. Here is an example:

library(reshape)  # provides cast()

qc <- read.csv(textConnection("id,  OnlinePaper, Q, freq, totals,       prop
1,      Online,         ,1768,   9950, 0.17768844
2,      Online,      No ,4022,   9950, 0.40422111
3,      Online,     Yes ,4160,   9950, 0.41809045
4,       Paper,         , 256,   3355, 0.07630402
5,       Paper,      No , 979,   3355, 0.29180328
6,       Paper,     Yes ,2120,   3355, 0.63189270"))

# Reshape to one row per survey mode and one column per answer to Q
# (the column is called Q in this data, not Q74)
qc1 <- cast(qc, OnlinePaper ~ Q, value = "prop")
qc1
barplot(as.matrix(qc1), beside = TRUE, legend.text = qc1[,1], ylim = c(0,1))

/images/barplot_base.png

library(ggplot2)
ggplot(data=qc, aes(x=Q, y=prop, fill=OnlinePaper)) +
    geom_bar(stat="identity", position=position_dodge())

/images/barplot_gg.png
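
For comparison, here is a sketch of the same side-by-side plot using base R only, with xtabs building the matrix that barplot expects. The proportions are the ones from the table above; relabelling the blank answer as "(blank)" is my own assumption, made so that every column has a usable name.

```r
# Proportions copied from the qc table above; "(blank)" stands in for the
# empty answer category (an assumption made for labelling).
qc <- data.frame(
  OnlinePaper = rep(c("Online", "Paper"), each = 3),
  Q    = rep(c("(blank)", "No", "Yes"), times = 2),
  prop = c(0.17768844, 0.40422111, 0.41809045,
           0.07630402, 0.29180328, 0.63189270)
)

# Rows = survey mode, columns = answer; each cell holds a single proportion
m <- xtabs(prop ~ OnlinePaper + Q, data = qc)

# beside = TRUE draws one group of bars per column, one bar per row
barplot(m, beside = TRUE, legend.text = rownames(m), ylim = c(0, 1))
```

This avoids the reshape dependency, though the need to build a matrix first is exactly the part of barplot I find awkward.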

Going to extremes

I should say though that I have found barplot can produce very customised graphs that serve a specific purpose such as that below (I have de-identified the content as this is unpublished research)

/images/barplot-gonuts.png

This made heavy use of the following approach

# original by Joseph Guillaume 2009

SideBySideBarPlot2 <- function(aggAllData, ...) {
  # widen the bottom and left margins to leave room for rotated labels
  par(mar = c(8, 7, 4, 2))
  bp <- barplot(aggAllData,
                horiz = FALSE,
                col = gray.colors(nrow(aggAllData)),
                las = 1, axisnames = FALSE, ...)
  # draw the column names at 45 degrees underneath the bars
  labels <- names(as.data.frame(aggAllData))
  text(bp, par('usr')[3], labels = labels, srt = 45,
       adj = c(1.1, 1.1), xpd = TRUE, cex = .9)
  return(bp)
}
# with width = xvar (proportions)

Posted in exploratory data analysis

19 Feb 2016

r-syntax-highlights-for-my-jekyll-powered-blog.md

Syntax Highlights

Until today I had no idea how to make code pretty in my blog posts, which go to GitHub after first being rendered locally so that I can get the categories and tags.

Because GitHub disables any plugins when it processes your blog, I took Charlie Park’s advice: http://charliepark.org/jekyll-with-plugins/

This blog post solved it for me: http://tuxette.nathalievilla.org/?p=1574

The trick is to write highlighter: pygments into the _config.yml and then:

{% highlight r %}
data("iris")
plot(iris$Sepal.Length ~ iris$Sepal.Width)
dat <- rnorm(1000,1,2)
{% endhighlight %}

Will render as:

data("iris")
plot(iris$Sepal.Length ~ iris$Sepal.Width)
dat <- rnorm(1000,1,2)

But I also pushed this to another site that I do build with gh-pages, and GitHub sent me an email complaining:

You are attempting to use the 'pygments' highlighter, 
which is currently unsupported on GitHub Pages. 
Your site will use 'rouge' for highlighting instead. 
To suppress this warning, change the 'highlighter' value to 
'rouge' in your '_config.yml'. 

So there.

Posted in

14 Feb 2016
