There are some packages that are very helpful for working with raster and spatial data. A few are listed here in decreasing order of usefulness (to me).
We have movement data on a Finnish wolf named Niki. Our goal is to 1) calculate the distance from each of Niki’s positions to the closest primary road, and 2) calculate the density (in km / km^2) of primary road within Niki’s 100% MCP.
First, we need to load a shapefile of the road data. A shapefile is “a digital vector storage format for storing geometric location and associated attribute information.” Really, a shapefile is a collection of 6-7 individual files that are stored in the same directory and have a common name. To load shape files into R, we use the function readOGR
from the package rgdal
. It’s not a problem if readOGR
throws a warning about “z dimension discarded,” since our data are 2-dimensional. While we’re at it, we load the wolf data, which has been stored as an R object.
setwd("~TimBarry/Box Sync/WolfDispersal/EliesClass")
require(sp)
require(rgdal)
load("niki.rda")
niki <- niki[!duplicated(niki[c("X","Y")]),] #eliminate duplicate points
kainuuUTM35 <- readOGR(dsn = "Road Shape File", layer = "DR_LINKKI")
## OGR data source with driver: ESRI Shapefile
## Source: "Road Shape File", layer: "DR_LINKKI"
## with 116743 features
## It has 23 fields
The variable kainuuUTM35
is an object of the class SpatialLinesDataFrame, which is basically an ordered collection of line segments with associated attributes. We can explore the important features of this object.
head(kainuuUTM35@data)
## LINK_ID LINK_MMLID KUNTAKOODI HALLINN_LK TOIMINN_LK TIENUMERO TIEOSANRO
## 0 587698 1110519748 105 3 6 NA NA
## 1 587699 1110523754 105 3 5 NA NA
## 2 587700 312275776 105 3 5 NA NA
## 3 587701 597011941 105 3 5 NA NA
## 4 587702 312275782 105 3 5 NA NA
## 5 587703 598276179 105 99 7 NA NA
## LINKKITYYP SILTA_ALIK AJOSUUNTA TIENIMI_SU TIENIMI_RU TIENIMI_SA
## 0 3 0 2 <NA> <NA> <NA>
## 1 3 0 2 Kehä IV <NA> <NA>
## 2 3 0 2 Kehä IV <NA> <NA>
## 3 3 0 2 Kehä IV <NA> <NA>
## 4 3 0 2 Kehä IV <NA> <NA>
## 5 12 0 2 <NA> <NA> <NA>
## ENS_TALO_O ENS_TALO_V VIIM_TAL_O VIIM_TAL_V MUOKKAUSPV SIJ_TARK
## 0 NA NA NA NA 08.01.2015 00:00:00 3000
## 1 2 1 2 3 08.01.2015 00:00:00 3000
## 2 NA NA NA NA 08.01.2015 00:00:00 3000
## 3 NA NA NA NA 08.01.2015 00:00:00 3000
## 4 4 5 10 5 08.01.2015 00:00:00 3000
## 5 NA NA NA NA 02.07.2015 17:20:21 3000
## KOR_TARK ALKU_PAALU LOPP_PAALU GEOM_FLIP
## 0 100001 0 57.476 0
## 1 100001 0 114.605 0
## 2 100001 0 17.342 0
## 3 100001 0 6.014 0
## 4 100001 0 70.150 0
## 5 100001 0 128.385 1
This data frame gives the attribute data for each line segment, with each row corresponding to a different segment. We are particularly interested in the attribute called “TOIMINN_LK,” which means “functional link” in Finnish. For our purposes, codes 1-5 represent primary roads, codes 6-7 represent forest roads, and code 8 represents paths.
Let’s determine the coordinate reference system of this road object.
kainuuUTM35@proj4string
## CRS arguments:
## +proj=utm +zone=35 +ellps=GRS80 +units=m +no_defs
We can see that the reference system is UTM zone 35. Unfortunately, our wolf data are in a different coordinate reference system. The proj4string of our wolf data happens to be “+proj=tmerc +lat_0=0 +lon_0=27 +k=1 +x_0=3500000 +y_0=0 +ellps=intl +units=m +no_defs”. We must project our road data into the wolf’s coordinate reference system before we can do anything. We use the function spTransform
from sp
to do this. (Note that we could have transformed the wolf observations into UTM zone 35 instead; I don’t like doing this, since I like using the animal’s coordinate reference system as the “base” system.)
kainuuTrans <- spTransform(kainuuUTM35, CRS("+proj=tmerc +lat_0=0 +lon_0=27 +k=1 +x_0=3500000 +y_0=0 +ellps=intl +units=m +no_defs"))
It’s a good idea to save the road object at this stage using save
or saveRDS
. It’s a lot faster to load an R object into the workspace than it is to load a shape file!
Next, we will create a SpatialPoints object from the wolf data. A SpatialPoints object is nothing more than a collection of points with an associated coordinate reference system. We then calculate a 100% MCP of the homerange using mcp
from the package adeHabitatHR
.
require(adehabitatHR)
nikiSpatial <- SpatialPoints(niki[c("X","Y")], CRS("+proj=tmerc +lat_0=0 +lon_0=27 +k=1 +x_0=3500000 +y_0=0 +ellps=intl +units=m +no_defs"))
nikiMCP <- mcp(nikiSpatial, 100)
We will crop the road network according to Niki’s MCP using crop
from the package raster
. The crop
function relies on the package rgeos
.
require(raster)
require(rgeos)
roadsInMCP <- crop(kainuuTrans, extent(nikiMCP))
Let’s visualize our data using spplot
from sp
. We must specify the attribute we wish to display (in our case, “TOIMINN_LK”) as an argument to zcol
. Note that spplot
plots in lattice, so we will need the package latticeExtra
to add Niki’s observations to the plot.
require(latticeExtra)
plot(spplot(roadsInMCP, zcol = "TOIMINN_LK") +
layer(panel.points(niki[c("X","Y")], pch = 19, cex = 0.5, col = "black")))
We can see that the observations are clustered around apparent den sites, which in turn are situated relatively far away from priary roads. Since we are interested in primary roads, let’s subset the roadsInMCP
object so that “TOIMINN_LK” takes on values ranging from 1-5.
primaryRoads <- roadsInMCP[roadsInMCP@data$TOIMINN_LK %in% 1:5,]
We will use the package spatstat
to calculate the distance from each observation to the closest primary road. We need to convert the wolf observations and road object into data types that spatstat
can understand. PSP is the spatstat analogue of a SpatialLinesDataFrame object, and PPP is the spatstat analogue of a SpatialPoints object (or, more accurately, a SpatialPointsDataFrame object). We use the package mapTools
to convert the road object into PSP, and we convert the wolf observations into PPP on our own.
require(spatstat)
require(maptools)
primRoadPsp <- as.psp.SpatialLines(primaryRoads, marks = "primary")
nikiPpp <- with(niki, ppp(X,Y, owin(range(X), range(Y))))
plot(primRoadPsp, main = "SpatStat Plot")
plot(nikiPpp, add = T, pch = 19, cex = 0.5)
The spatstat
function project2segment
calculates the distance from each point to the closest line segment. It does so in compiled C code and is very fast.
ds <- project2segment(nikiPpp, primRoadPsp)
distances <- ds$d
head(distances)
## [1] 252.3153 164.4448 1096.3846 1308.4927 1395.8908 1391.9686
In the following plot, we see an illustration of how project2segment
works on a sample of 20 points.
randExample <- sample(1:nrow(niki), 20, FALSE)
plot(primRoadPsp, main = "A few projected points")
plot(nikiPpp[randExample], add = T, pch = 19, cex = 0.5)
arrows(nikiPpp$x[randExample], nikiPpp$y[randExample], ds$Xproj$x[randExample], ds$Xproj$y[randExample], length = 0.05, col = "blue")
If we simply want to determine whether a point lies on the road, we can use buffers. For our purposes, a point lies on the road if it is within 45 m of the road. We will use the function gBuffer
from rgeos
to create a 45 m buffer around the primary roads. The function returns a SpatialPolygon object, which is a collection of polygons with an associated coordinate reference system.
buffer <- gBuffer(primaryRoads, width = 45)
Now, we use the incredibly useful function gIntersection
from rgeos
to find all points that lie inside of the SpatialPolygon buffer. By dividing the number of observations on the road by the total number of observations, we are able to compute the proportion of Niki’s observations that are on primary roads.
pointsOnRoad <- gIntersection(nikiSpatial, buffer)
plot(nikiSpatial, cex = 0.5, pch = 19, col = "grey90")
plot(buffer, border = "red", add = T)
plot(pointsOnRoad, pch = 19, cex = 0.5, add = T)
propOnRoad <- nrow(pointsOnRoad@coords)/nrow(nikiSpatial@coords)
propOnRoad
## [1] 0.007000206
We see that Niki spends roughly 0.007 of the time on primary roads. Notice that we could have arrived at this answer by determining the number of distances that are less than 45 m, then dividing by the total number of distances.
propOnRoadAlternate <- sum(distances <= 45)/length(distances)
propOnRoadAlternate
## [1] 0.007000206
Next, we want to compute the density of primary roads (in km / km^2) within the MCP.
plot(primaryRoads, col = "red")
plot(nikiMCP, add = T, border = "blue")
plot(nikiSpatial, add = T, pch = 19, cex = 0.5)
First, we intersect the road object with the MCP using gIntersection
.
croppedRoads <- gIntersection(nikiMCP, primaryRoads)
plot(croppedRoads, col = "red")
plot(nikiMCP, border = "blue", add = T)
plot(nikiSpatial, add = T, pch = 19, cex = 0.5)
Then, we use functions from rgeos
to calculate total road length (gLength
) and MCP area (gArea
). We arrive at density by dividing length by area.
totalLength <- gLength(croppedRoads)
McpArea <- gArea(nikiMCP)
density <- totalLength/McpArea * 1e3 # convert to units km / km^2
density
## [1] 0.2131982
Again, we will work with the trajectory of Niki. Using raster data, we will 1) extract the habitat type at each of Niki’s positions, and 2) compute the closest house to each location.
First, we need to load the raster into the workspace. In storage, our raster is saved as a tiff file. We will use the function raster
(from the package of the same name) to load the tiff file into the workspace and convert it into an R raster object.
FinlandRaster <- raster("FinlandRaster.tif")
plot(FinlandRaster)
In R, a raster object is basically a super-efficient array. We can index rasters like we can arrays. We can also query the number of rows and columns of a raster, just like we can an array.
FinlandRaster[10,40]
##
## 38
FinlandRaster@ncols
## [1] 7520
FinlandRaster@nrows
## [1] 5554
The values of a raster correspond to different habitat types. In our case, the raster takes on the values 1-44, as well as 255. Roughly, codes 1-11 represent artificial surfaces, codes 12-22 represent agricultural areas, codes 23-34 represent forests, codes 35-44 represent water, and code 255 represents outside of Finland. In particular, codes 1-3 represent houses. This information is available in the file CLC2006_LandClasses.pdf.
We observe that the raster is in the correct coordinate system, so we don’t need to worry about any projection here. If, however, the raster were in the wrong coordinate system, we would use the function projectRaster
from the raster
package.
FinlandRaster@crs
## CRS arguments:
## +proj=tmerc +lat_0=0 +lon_0=27 +k=1 +x_0=3500000 +y_0=0
## +ellps=intl +units=m +no_defs
We crop the raster according to Niki’s trajectory. To do so, we use the function crop
from the package raster
.
bitOfFinland <- crop(FinlandRaster, extent(nikiSpatial))
plot(bitOfFinland)
points(nikiSpatial, col = "black", pch = 19, cex = 0.5)
To extract the habitat type at each observed location, we use the function extract
from the raster
package. Note that we could extract habitat types by carefully indexing the raster object, but that would be much more difficult.
habTypes <- extract(bitOfFinland, niki[c("X","Y")])
table(habTypes)
## habTypes
## 2 4 7 14 15 17 20 21 23 24 25 28 29 30 33
## 2 7 5 1 2 13 1404 1443 380 340 1 47 363 431 3
## 38 43
## 391 23
Niki seems to have spent a lot of time in agricultural zones (the pale yellow in the raster), and very little time in water and human-modified areas (blue and red in the raster, respectively).
To compute the distance from each point to the closest house pixel, we first need to determine the coordinates of each house pixel. We change all raster values ranging from 1-3 (i.e., those corresponding to houses) to TRUE
, and we change all others to FALSE
. We obtain the indeces of the TRUE
cells using the raster
function Which
. Finally, we obtain the actual coordinates of the TRUE
cells using the raster
function xyFromCell
. This process leaves us with the coordinates of the houses. In the plot, wolf locations are black and houses are red.
bitOfFinland[bitOfFinland %in% 1:3] <- TRUE
bitOfFinland[bitOfFinland >= 4] <- FALSE
idx <- Which(bitOfFinland, cells = TRUE)
keyXY <- xyFromCell(bitOfFinland, idx, spatial = TRUE)
plot(nikiSpatial, pch = 19, cex = 0.1, col = "black")
plot(keyXY, pch = 19, cex = 0.1, col = "red", add = T)
We compute the distance between each observation and the closest house using a small chunk of code written by Elie. Converting coordinates into complex numbers is a useful trick for working with 2-dimensional spatial data.
z.wolf <- nikiSpatial@coords[,1] + 1i*nikiSpatial@coords[,2]
z.houses <- keyXY@coords[,1] + 1i * keyXY@coords[,2]
getMinDistance <- function(x,y) {
min(Mod(x-y))
}
ds <- sapply(z.wolf, getMinDistance, y = z.houses)
names(ds) = NULL
head(ds)
## [1] 1610.666 1537.986 2040.159 2074.295 2159.306 2159.044
Originally, I wasted a lot of time attempting to re-invent the wheel by writing many spatial functions on my own. This was time consuming, and my code was often inefficient or simply incorrect. Fortunately, R has a rich ecosystem of spatial functions, methods, and packages. Now, when I need a certain spatial function, I poke around the spatial packages before attempting to write something myself. Typically, I’m able to find a highly optimized function that works quite well.
Here are some additional functions that I find useful.
spsample
from sp
: randomly sample some number of points from a spatial polygon; useful for generating null sets of absences.
rasterToPolygons
from raster
: create a polygon from a raster object; useful for generating polygons that bound various raster features.
%dopar%
from foreach
and doParallel
: perform computations in parallel; useful for completing multiple time-intensive spatial tasks simultaneously, such as loading multiple shape files at once.