Testing Aekos Data Portal

Table of Contents

AEKOS Test 1

AEKOS Test 1

ivan.hanigan@anu.edu.au

Introduction

This is a report of the first attempt to play with the new Australian Terrestrial Ecosystem Research Network's data portal 'AEKOS'
I have to admit at the outset I am not that interested in searching for actual data, just interested in how AEKOS exposes the data.
This is a Reproducible Research Report, written using Emacs orgmode to construct a knitr document and creates a HTML output. I've made the code available on github here

Methods

Download from portal

went to the website and tried a few queries until I found something with data, added to cart and proceeded to checkout
NB I don't even remember what search criteria I put together to get these data. The first couple I tried (focused on Drought) did not find any data. I think this was about some species or other.
got an email with link to download the data
downloaded zip to my ~/data/aekos_tests directory
unzipped and now have some SQL setup files and some pdfs
I can now see that the appropriate citation is

State of South Australia (Department of Environment, Water and Natural Resources, South Australia) (2012)
Dalhousie Survey (Scientific Expedition Group) Survey, South Australian Biological Survey Program, Version 1 /2014. <em>State of South Australia
(Department of Environment, Water and Natural Resources, South Australia)</em>, Adelaide, South Australia. Obtained via ÆKOS Data Portal (<a
href="http://www.portal.aekos.org.au/" target="_blank" >http://www.portal.aekos.org.au/</a>) at TERN Eco-informatics, The University of Adelaide.
Accessed [<em>dd mmm yyyy</em>, e.g. 01 Apr 2010].

Load to postgres database

I run postgres so I didn't need to install it but it is good that there are instructions here if you do need to.
I went straight to the linux step 8 (page 6) “Create the database schema from the downloaded schema”

psql -U postgres -d aekos -h localhost -p 5432 -f _DatabaseImport-PostgreSQLSchemaUpdater.sql

then on step 10 “Establish the relationships between the tables:

psql -U postgres -d aekos -h localhost -p 5432 -f relationships_PostgreSQL.sql

Now I have a database…. so what do I do now?

Exploratory Data Analysis

Load R packages

Load some useful tools I made for exploring data with R and PostgreSQL

if (!require(swishdbtools)) {
    require(devtools)
    install_github("swishdbtools", "swish-climate-impact-assessment")
} else {
    require(swishdbtools)
}

if (!require(disentangle)) {
    require(devtools)
    install_github("disentangle", "ivanhanigan")
} else {
    require(disentangle)
}

if (!require(gisviz)) {
    require(devtools)
    install_github("gisviz", "ivanhanigan")
} else {
    require(gisviz)
}

List Tables

The first thing I usually do is look at the list of tables available

require(swishdbtools)
require(xtable)
ch <- connect2postgres2("aekos_tests")
lst <- pgListTables(ch, "public")
nrow(lst)

[1] 95

# there are a lot.  here is the head
tbl <- xtable(head(lst))
print(tbl, type = "html", include.rownames = F)

relname	nspname
address	public
aerialimagecoordinate	public
aerialphotograph	public
ageclass	public
agevalue	public
areavalue	public

Variable Distributions

choose a table and check it out

lst <- pgListTables(ch, "public")
tbl <- sample(lst[, 1], 1)
print(tbl)

[1] "vct_text_addressline”

# First time I tried this I retreived [1] 'crownseparationratiovalue' from
# the _TableDefinitions.pdf I can see this is defined as

### 'Average distance between the edges of the crowns divided by the average
### width of the crowns'

# with only two cols, boring, try EXTRACTSPECIESPRESENCE
tbl <- tolower("EXTRACTSPECIESPRESENCE")
dat <- dbGetQuery(ch, sprintf("select *\n  from %s\n  ", tbl))
# str(dat) what is the code for speciesname?

Identify the relational tables

I was not using the DbVisualizer tool before so I had to install it
this lets me easily see which tables are related to this table/variable.names
I think pgAdmin > Reports > Dependents report could have been used to figure out what the related tables/variable.names were, but it would have been a bit harder.

Summarise the variable distributions

# from this I can see I need to get speciesname from speciesconcept
dat <- dbGetQuery(ch, "select *\nfrom extractspeciespresence tb1\n left join speciesconcept tb2\n on tb1.speciesname = tb2._pk_id")
# str(dat) dat

dd <- data_dictionary(dat, show_levels = 6)
print(xtable(dd), type = "html", include.rownames = F)

Variable	Type	Attributes	Value	Count	Percent
_pk_id	character	EXTRACTSPECIESPRESENCE_T1399266119935		1	0.74
		EXTRACTSPECIESPRESENCE_T1399266119945		1	0.74
		EXTRACTSPECIESPRESENCE_T1399266119952		1	0.74
		EXTRACTSPECIESPRESENCE_T1399266119959		1	0.74
		EXTRACTSPECIESPRESENCE_T1399266119966		1	0.74
		EXTRACTSPECIESPRESENCE_T1399266119973		1	0.74
		more than 6 levels. list truncated.
abundanceunits	character	MEASUREMENTUNIT_T1399259767120		135	100
aekoslink	character	aekos.org.au/collection/sa.gov.au/bdbsa_veg/survey_161/19536		11	8.15
		aekos.org.au/collection/sa.gov.au/bdbsa_veg/survey_161/19537		12	8.89
		aekos.org.au/collection/sa.gov.au/bdbsa_veg/survey_161/19538		12	8.89
		aekos.org.au/collection/sa.gov.au/bdbsa_veg/survey_161/19541		14	10.37
		aekos.org.au/collection/sa.gov.au/bdbsa_veg/survey_161/19543		22	16.3
		aekos.org.au/collection/sa.gov.au/bdbsa_veg/survey_161/19544		8	5.93
		more than 6 levels. list truncated.
location	character	SPATIALPOINT_1400572564352		1	0.74
		SPATIALPOINT_1400572564353		1	0.74
		SPATIALPOINT_1400572564354		1	0.74
		SPATIALPOINT_1400572564355		1	0.74
		SPATIALPOINT_1400572564356		1	0.74
		SPATIALPOINT_1400572564357		1	0.74
		more than 6 levels. list truncated.
locationreliability	character	LOCATIONRELIABILITY_T1399266119936		1	0.74
		LOCATIONRELIABILITY_T1399266119946		1	0.74
		LOCATIONRELIABILITY_T1399266119953		1	0.74
		LOCATIONRELIABILITY_T1399266119960		1	0.74
		LOCATIONRELIABILITY_T1399266119967		1	0.74
		LOCATIONRELIABILITY_T1399266119974		1	0.74
		more than 6 levels. list truncated.
lowerabundance	number	Min.	0
		1st Qu.	0
		Median	0
		Mean	0
		3rd Qu.	0
		Max.	0
speciesname	character	SPECIESCONCEPT_T1399262855831		1	0.74
		SPECIESCONCEPT_T1399262856463		4	2.96
		SPECIESCONCEPT_T1399262856593		2	1.48
		SPECIESCONCEPT_T1399262857040		1	0.74
		SPECIESCONCEPT_T1399262857772		5	3.7
		SPECIESCONCEPT_T1399262857780		1	0.74
		more than 6 levels. list truncated.
timeend	date	Min.	2003-07-20
		1st Qu.	2003-07-21
		Median	2003-07-21
		Mean	2003-07-21
		3rd Qu.	2003-07-22
		Max.	2003-07-22
timestart	date	Min.	2003-07-20
		1st Qu.	2003-07-21
		Median	2003-07-21
		Mean	2003-07-21
		3rd Qu.	2003-07-22
		Max.	2003-07-22
upperabundance	number	Min.	0
		1st Qu.	0
		Median	0
		Mean	0
		3rd Qu.	0
		Max.	0
_pk_id	character	EXTRACTSPECIESPRESENCE_T1399266119935		1	0.74
		EXTRACTSPECIESPRESENCE_T1399266119945		1	0.74
		EXTRACTSPECIESPRESENCE_T1399266119952		1	0.74
		EXTRACTSPECIESPRESENCE_T1399266119959		1	0.74
		EXTRACTSPECIESPRESENCE_T1399266119966		1	0.74
		EXTRACTSPECIESPRESENCE_T1399266119973		1	0.74
		more than 6 levels. list truncated.
authorship	character	A.Cunn. ex Benth.		2	1.48
		(Benth.) Paul G.Wilson		2	1.48
		Blakely &amp Jacobs		1	0.74
		(Boiss.) D.C.Hassall		2	1.48
		DC.		8	5.93
		Domin		1	0.74
		more than 6 levels. list truncated.
genuspart	character	Acacia		11	8.15
		Aristida		1	0.74
		Atriplex		6	4.44
		Chamaesyce		2	1.48
		Cucumis		1	0.74
		Cyperus		3	2.22
		more than 6 levels. list truncated.
hybridtype	character	SPECIESCONCEPTHYBRIDTYPE_T1399259772337		135	100
indexnames	character	SPECIESCONCEPT_T1399262855831		1	0.74
		SPECIESCONCEPT_T1399262856463		4	2.96
		SPECIESCONCEPT_T1399262856593		2	1.48
		SPECIESCONCEPT_T1399262857040		1	0.74
		SPECIESCONCEPT_T1399262857772		5	3.7
		SPECIESCONCEPT_T1399262857780		1	0.74
		more than 6 levels. list truncated.
infraspecificepithet	character	arida		1	0.74
		dielsii		2	1.48
		glabra		8	5.93
		holathera		1	0.74
		microcephala		1	0.74
		nummularia		4	2.96
		more than 6 levels. list truncated.
namewithauthorship	character	Acacia ligulata A.Cunn. ex Benth.		2	1.48
		Acacia salicina Lindl.		8	5.93
		Acacia victoriae subsp. arida		1	0.74
		Aristida holathera Domin var. holathera		1	0.74
		Atriplex holocarpa F.Muell.		2	1.48
		Atriplex nummularia Lindl. subsp. nummularia		4	2.96
		more than 6 levels. list truncated.
namewithoutauthorship	character	Acacia ligulata		2	1.48
		Acacia salicina		8	5.93
		Acacia victoriae subsp. arida		1	0.74
		Aristida holathera var. holathera		1	0.74
		Atriplex holocarpa		2	1.48
		Atriplex nummularia subsp. nummularia		4	2.96
		more than 6 levels. list truncated.
rank	character	TAXONOMICRANK_T1399259772301		118	87.41
		TAXONOMICRANK_T1399259772302		6	4.44
		TAXONOMICRANK_T1399259772304		11	8.15
speciesconcepttype	character	SPECIESCONCEPTTYPE_T1399259772313		102	75.56
		SPECIESCONCEPTTYPE_T1399259772322		16	11.85
		SPECIESCONCEPTTYPE_T1399259772325		8	5.93
		SPECIESCONCEPTTYPE_T1399259772327		9	6.67
specificepithet	character	appressa		2	1.48
		argenteus		1	0.74
		australis		7	5.19
		billardierei		8	5.93
		constricta		1	0.74
		coolabah		1	0.74
		more than 6 levels. list truncated.
vernacularname	character	SPECIESCONCEPT_T1399262856463		4	2.96
		SPECIESCONCEPT_T1399262856593		2	1.48
		SPECIESCONCEPT_T1399262857040		1	0.74
		SPECIESCONCEPT_T1399262857772		5	3.7
		SPECIESCONCEPT_T1399262858298		1	0.74
		SPECIESCONCEPT_T1399262858442		2	1.48
		more than 6 levels. list truncated.

it is weird that the lowerabundance and upperabundance are both zero, must be because these are species presence data
but it would be better for this to be NA

Spatial Data

I can also see from the DbVisualizer graph that the spatialpoint table has the x and y coordinates for the location variable
I then also had to find the table with the coordinatereferencesystem (crs)
The code to make the plot is next and the plot of the spatial data is below.

#### name:spatial####
require(swishdbtools)
require(gisviz)
require(sqldf)
ch <- connect2postgres2("aekos_tests")
lst <- pgListTables(ch, "public")
#lst 
tbl <- lst[grep("spatial", lst[,1]),]
pts  <- dbGetQuery(ch, paste("select * from ", tbl))
# str(pts)
# t(pts[1,])
  dat <- dbGetQuery(ch,
  "select *

  )

pts <- dbGetQuery(ch,
"select tb1._pk_id as id, namewithoutauthorship, x, y, tb4.name
 from extractspeciespresence tb1
 join
 speciesconcept tb2
 on tb1.speciesname = tb2._pk_id
 join 
 spatialpoint tb3
 on tb1.location = tb3._pk_id
 join 
 coordinatereferencesystem tb4
 on tb3.crs = tb4._pk_id
")
# str(pts)
# GDA94


epsg <- make_EPSG()
# names(epsg)
crs <- epsg[which(epsg$code == 4283),'prj4']
# there are multiple species per site, so summarise first
pts <- sqldf("select x, y, count(*)
from pts
group by x, y
", drv = 'SQLite')

pts <- SpatialPointsDataFrame(pts[,c('x','y')], pts, proj4string = CRS(crs))
#str(pts)
writeOGR(pts, "spatialpoint.shp", "spatialpoint", driver= "ESRI Shapefile")

png("plot.png")
plotMyMap(pts, xl = c(110,155), yl = c(-40,-10))
title("dalhousie survey")
dev.off()

Zoom in on the point locations

using the number of species names at each location as a count

Conclusions

I really like how this model emphasises the use of databases for data management (and postgres or mysql are great options)
I think the instructions for the load process are good but with typos makes it difficult
for a casual browser this is not so much fun

Posted in Data Documentation

14 Jun 2014

releasing-my-misc-r-package-disentangle-version-1

hello world!
the associated notebook is Here
this blog will feature vignettes and documentation about the functions and use cases
Windows Version is Downloadable Here
Linux and Mac users can just run this R code

R Code

require(devtools)
install_github("disentangle", "ivanhanigan")

Posted in disentangle

11 Jun 2014

using-additional-header-rows-for-metadata

Table of Contents

Comment on eMast recommendations
- - Introduction
  - R code

Comment on eMast recommendations

ivan.hanigan@anu.edu.au

Introduction

I was lucky to be forwarded a copy of the document “DRAFT Best practices for collecting, processing and collating plant trait data” (V0.0).
What I like most about this is the statement on page nine under “3. Best practice collection techniques” that “Datasets should be maintained following two simple practices of formatting and cataloguing”.
I think the recomendations are very sensible but I have the following comment regarding the proposed file structure shown in the table

I have not used the second row for the units before, but rather have encoded this information in a second metadata file that I keep with the main data file.
This second row does seem attractive
BUT as this might be interpreted as the first row of data after the header row of column names this needs extra code to be written to allow importation to statistics packages.
While this can be easily handled by writing extra code to treat this first row separately, this does seem a bit risky to expect ordinary data users to do so.
I wonder if the intention of the authors is to include this in the column NAME rather than just in the column as the statement currently reads (“the units of measurement and an expanded definition of the data recorded in each column”) and the table shows?
For example the R code given in the Appendix “R script to aggregate unprocessed trait data into summary statistics ready for the EMP DATABASE”

R code

The eMast Document provides an interesting appendix with R codes.
The following is an attempt to make a vignette that will run with the example data provided.

Construct some fake data


dat <- read.csv(textConnection("Date    ,Latitude,Longitude,Genus,Species,Tree No.,Meas. No.,Photosynthesis,Air Temp.,Height\n         ,      oS,       oE,          ,       ,   , ,umol m-2 s-1, oC, m\n1/10/2004,-43.4444,140.1453 ,Eucalyptus,Saligna,  1,1,15.043,25.56,15\n1/10/2004,-43.4444,140.1453 ,Eucalyptus,Saligna,  1,2,15.998,25.56,15\n1/10/2004,-43.4444,140.1453 ,Eucalyptus,Saligna,  1,3,15.584,25.56,15\n"))
# write a couple of fake data files
for (i in 1:2) {
    write.csv(dat, paste("Book", i, ".csv", sep = ""), row.names = F)
}
# show me the data
print(xtable(dat), type = "html")

	Date	Latitude	Longitude	Genus	Species	Tree.No.	Meas..No.	Photosynthesis	Air.Temp.	Height
1		oS	oE					umol m-2 s-1	oC	m
2	1/10/2004	-43.4444	140.1453	Eucalyptus	Saligna	1	1	15.043	25.56	15
3	1/10/2004	-43.4444	140.1453	Eucalyptus	Saligna	1	2	15.998	25.56	15
4	1/10/2004	-43.4444	140.1453	Eucalyptus	Saligna	1	3	15.584	25.56	15

Run the aggregation of trait data program provided in eMast doc


library(stringr)
# This function summarises the data, one just needs to past the parameter of
# interest and the quantities by which it varies around.
do.sumy <- function(pars, lab, dat) {
    epars <- as.formula(pars)
    mu <- aggregate(epars, data = dat, mean)
    md <- aggregate(epars, data = dat, median)
    se <- aggregate(epars, data = dat, sd)
    mx <- aggregate(epars, data = dat, min)
    mn <- aggregate(epars, data = dat, max)
    NN <- aggregate(epars, data = dat, length)
    drp0 <- unlist(strsplit(pars, "\\+"))
    drp <- -length(drp0):-1
    df1 <- cbind(mu, md[, drp], se[, drp], mx[, drp], mn[, drp], NN[, drp])
    df2 <- cbind(df1, Parameter = lab)
    hd1 <- gsub("^\\w.*.~", "", drp0)
    names(df2) <- c(hd1, "Mean", "Median", "Std", "Min", "Max", "N", "Parameter")
    return(df2)
}

# Not Required setwd('~/where/is/the/data/?')
files <- list.files(pattern = "*.csv")
# files
data <- lapply(files, read.csv, header = T, stringsAsFactors = F, strip.white = T)

Minor modifications to make it work

I needed to write the following to read the csv with metadata row


#### Notes str(data) as expected, these are sometimes imported as character due
#### to the second row so need to fix it also because the data object is a list
#### of data.frames, it is easier to run the example if we will just create
#### individual data frames
read_metadata_csv <- function(filename) {
    dat <- read.csv(filename, skip = 1)
    col.defs <- names(read.csv(filename, nrow = 0))
    unit.defs <- read.csv(filename, nrow = 1, stringsAsFactors = F)
    attributes(dat)$unit.defs <- unit.defs[1, ]
    names(dat) <- col.defs
    return(dat)
}
# files
data <- read_metadata_csv(files[1])
# str(data)

Now finish off with the example code

#### NOTE this is not relevant here so commented out If the Genus and Species
#### is not separate, then split it up split.spp <- strsplit( data$Spp.Name, '
#### ' ) data$Genus <- sapply( split.spp, '[[', 1 ) data$Species <- sapply(
#### split.spp, '[[', 2 )



# Get the summary stats for each interesting trait
Height <- do.sumy("Height~Photosynthesis+Genus+Species", "Height", data)
# show me the data
print(xtable(Height), type = "html")

	Photosynthesis	Genus	Species	Mean	Median	Min	Max	N	Parameter
1	15.04	Eucalyptus	Saligna	15.00	15	15	15	1	Height
2	15.58	Eucalyptus	Saligna	15.00	15	15	15	1	Height
3	16.00	Eucalyptus	Saligna	15.00	15	15	15	1	Height

#### not available Vcmax <- do.sumy( 'Vcmax25~CO2.treatment+Genus+Species',
#### 'VCMAXM25', data ) Jmax <- do.sumy( 'Jmax25~CO2.treatment+Genus+Species',
#### 'JMAXM25', data ) VjVn <- do.sumy( 'J.V~CO2.treatment+Genus+Species',
#### 'VJVN', data ) Tleaf <- do.sumy(
#### 'Tleaf_avg~CO2.treatment+Genus+Species','TLEAF', data ) all.Y <- rbind(
#### Vcmax, Jmax, VjVn, Tleaf )

## # Re-arrange the columns to the correct order exp.dat1 <- subset( all.Y,
## select=c('Genus','Species','Parameter','Mean','Median','Std','Min','Max','N','CO2.treatment')
## )

## # save it somewhere write.table( exp.dat1, 'choose/folder/to/save/to.csv',
## sep=',', row.names=F, col.names=T, na='' )

Posted in Data Documentation

17 May 2014

reproducible-research-reports-using-emacs-orgmode-export-to-knitr

Emacs orgmode is so good for Reproducible Research Reports. Check out the paper by Eric Schulte et al “A Multi-Language Computing Environment for Literate Programming”
it provides a good export function to convert the code file into a html or latex report using C-c C-e h OR C-c C-e l respectively.
but I have found there are some limitations with these export reports compared to knitr
primarily the limitation of the need to include the images as seperate files whereas knitr encodes these explicitly in the report is the one I dislike the most
another limitation is the way that tables are handled
HOWEVER the code editing features of Emacs orgmode are so superior to the other options I have tried to write knitr documents in that I feel it is worth trying to find some middle ground.
I have been happy for a little while using a method of writing documents in Emacs orgmode and exporting the text and code Chunks to a knitr RMD file, and then compiling that file to create the report.
I have set up an example using the classic Challenger Space Shuttle logistic regression in this file

The Output

Table of Contents

Reproducible Research Report using Emacs orgmode to knitr

Reproducible Research Report using Emacs orgmode to knitr

ivan.hanigan@anu.edu.au

Introduction

This is an example or a Reproducible Research Report.
The report is the output of a program written in Emacs orgmode, which constructs a knitr document and creates this HTML output.

logistic regression of the challenger disaster

The problem was with the failure rate (and number of) O-rings that failed (n.fail) related to the temperature (temp).

name:create-data

# n.fail = failed o rings
# totalMinusNfail = total-n.fail
# total = there were 6 o rings for each of 23 attempts
# pfail = probability of fail pFail <- n.fail/total
# temp = temperature (F)

dat <- read.csv(textConnection(
"nfail, totalMinusNfail, total,     pFail,       temp
2,               4,     6,          0.3333333,   53
0,               6,     6,          0.0000000,   66
0,               6,     6,          0.0000000,   68
1,               5,     6,          0.1666667,   70
0,               6,     6,          0.0000000,   75
0,               6,     6,          0.0000000,   78
1,               5,     6,          0.1666667,   57
0,               6,     6,          0.0000000,   67
0,               6,     6,          0.0000000,   69
1,               5,     6,          0.1666667,   70
2,               4,     6,          0.3333333,   75
0,               6,     6,          0.0000000,   79
1,               5,     6,          0.1666667,   58
0,               6,     6,          0.0000000,   67
0,               6,     6,          0.0000000,   70
0,               6,     6,          0.0000000,   72
0,               6,     6,          0.0000000,   76
0,               6,     6,          0.0000000,   81
1,               5,     6,          0.1666667,   63
0,               6,     6,          0.0000000,   67
0,               6,     6,          0.0000000,   70
0,               6,     6,          0.0000000,   73
0,               6,     6,          0.0000000,   76
"))

nfail	totalMinusNfail	total	pFail	temp
2	4	6	0.33	53
0	6	6	0.00	66
0	6	6	0.00	68
1	5	6	0.17	70
0	6	6	0.00	75
0	6	6	0.00	78
1	5	6	0.17	57
0	6	6	0.00	67
0	6	6	0.00	69
1	5	6	0.17	70
2	4	6	0.33	75
0	6	6	0.00	79
1	5	6	0.17	58
0	6	6	0.00	67
0	6	6	0.00	70
0	6	6	0.00	72
0	6	6	0.00	76
0	6	6	0.00	81
1	5	6	0.17	63
0	6	6	0.00	67
0	6	6	0.00	70
0	6	6	0.00	73
0	6	6	0.00	76

#### learnR-logistic
png("pfail.png")
with(dat, plot(temp, pFail, pch = 16, xlim = c(40, 100), ylim = c(0, 0.4)))
title("A plot of the proportion failed by temperature")
dev.off()

pdf 2

fit model

# newnode: linear names(dat)
resp <- cbind(dat$nfail, dat$totalMinusNfail)
temp <- dat$temp
linear <- glm(resp ~ 1 + temp, family = binomial(link = logit))
# TODO this just prints the codes not the table?  xtable(linear, type =
# 'html')

Model Diagnostics

cf <- linearoutput$coeff
signif(cf[which(row.names(cf) == "temp"), "Estimate"], 2)

[1] -0.12

png("challengerLogistic.png")
par(mfrow = c(2, 2))
plot(linear)
dev.off()

pdf 2

plot-model

dummy <- data.frame(temp = seq(20, 100, 1))
pred.prob <- predict.glm(linear, newdata = dummy, type = "resp")
png("pfailfit.png")
with(dat, plot(temp, pFail, xlab = "Launch Temperature (F)", ylab = "Proportion Failing", 
    pch = 16, xlim = c(20, 100), ylim = c(0, 1)))
lines(dummy$temp, pred.prob)
dev.off()

pdf 2

Posted in research methods

17 May 2014

ms-access-to-postgresql-in-64-bit-windows

In 2011 we published a paper on “Creating an integrated historical record of extreme particulate air pollution events in australian cities from 1994 to 2007”.
I used a PostGIS/PostgreSQL server to host the database and MS Access clients for data entry by my co-author’s at their own institutions (thousands of kms away)
This worked great but recently I realised that updating to windows 64 bit has broken the ODBC connection
to fix it I followed the instructions here
You need to install both the 32 and 64 bit executables
Assuming you use the 32 bit windows office suite (recommended) then use the command below to open the ODBC connections tool and add a DSN for your postgres database

Code:

Go to Start -> Run (or Press Windows+R keys) and enter 
%WINDIR%\SysWOW64\odbcad32.exe

On either the User DSN (DSNs available for only that user) or System DSN (DSNs available to every user) tab, click the Add button. Scroll down the list and find the PostgreSQL ODBC Driver.
You may select ANSI or UNICODE (For future support I personally prefer UNICODE), and click Finish.
Data Source refers to the programmable name of the DSN Entry. Stick to lower-case letters and underscores. Ex: psql_server
Specify the Database, Server, User Name and Password to your PostgreSQL server.
Click the Test button to ensure everything has been specified correctly.
Click the Save button to create the DSN.

Posted in research methods

10 May 2014

Welcome to my Open Notebook

Testing Aekos Data Portal

AEKOS Test 1

Introduction

Methods

Download from portal

Load to postgres database

Exploratory Data Analysis

Load R packages

List Tables

Variable Distributions

Identify the relational tables

Summarise the variable distributions

Spatial Data

Zoom in on the point locations

Conclusions

releasing-my-misc-r-package-disentangle-version-1

R Code

using-additional-header-rows-for-metadata

Comment on eMast recommendations

Introduction

R code

Construct some fake data

Run the aggregation of trait data program provided in eMast doc

Minor modifications to make it work

Now finish off with the example code

reproducible-research-reports-using-emacs-orgmode-export-to-knitr

The Output

Reproducible Research Report using Emacs orgmode to knitr

Introduction

logistic regression of the challenger disaster

name:create-data

fit model

Model Diagnostics

plot-model

ms-access-to-postgresql-in-64-bit-windows

Code:

Subscribe

About

Recent Entries

Categories

Entries grouped by Tags