I’ve learnt this lesson multiple times. It happens like this. A solution is not immediately obvious in R so you might think of writing your own function. Generally there is a solution you just did not google enough.
This time I was tricked a little because the GIS functions have been bad for a long time but getting better very rapidly recently. A little while ago I had a very successful
outcome from using the raster::extract
function on a large raster file
to get the attributes for a set of points. I needed to do the same
thing but this time for a shapefile and points. I looked at the
raster package and saw you can use the raster::intersect
function
here, and it worked on the small sample data I tested with but failed
with the big dataset as it ran out of memory. I assumed that R had not caught up with the GIS world yet and so I came up with this workaround below by splitting the points data layer into chunks.
I then got access to ArcMap and was wondering whether it could do it, and it DID! So then I googled a bit and found the solution was simple:
Code:
sp::over()
Here is my hack in case I ever need to pull out the bit that does the splitting up of the points file, or the tryCatch():
Code:
big_pt_intersect <- function(pts, ply, chunks = 100){
idx <- split(pts@data, 1:chunks)
#str(idx)
for(i in 1:length(idx)){
#i = 1
print(i)
ids <- idx[ [i] ][,1]
#str(pts@data)
qc <- pts[pts@data[,1] %in% ids,]
#str(qc)
tryCatch(
chunk <- raster::intersect(qc, ply),
error = function(err){print(err)})
if(!exists('chunk_out')){
chunk_out <- chunk@data
} else {
chunk_out <- rbind(chunk_out, chunk@data)
}
rm(chunk)
}
#str(chunk_out)
return(chunk_out)
}
# NB warning about split length multiple is not fatal, just due to nonequal chunks
# (ie the geocodes are 2009/100)