Animals live and move in structured, heterogeneous space. Most analyses of movement are not limited to describing the properties of displacement, but ask some questions with respect to spatial variables. Spatial data objects in R can be rather complex - broadly, a landscape can be described in terms of vectors, which include polygons (e.g., - boundaries of a projected area), lines (e.g. roads), points (e.g. GPS locations) or in terms of a raster which is just a grid (usually square) where each cell has a particular value, whether discrete (e.g., a land-cover type) or continuous (e.g., an elevation).
There are quite a few packages that are very helpful for working with raster and spatial data. Here’s a list, in approximate decreasing order of usefulness:
sp
provides the classes for spatial data; makes coordinate transformations simplesf
new package that may eventually replace sp
due to much simpler data structureraster
general-purpose raster functions; reads raster objects into R (very fast)rgdal
reads shape files into Rrgeos
a collection of excellent methods for working with spatial objectsIn this lab, we will:
pcks <- list("sp", "sf", "raster", "rgdal", "rgeos", "ggmap")
sapply(pcks, require, char = TRUE)
## [1] TRUE TRUE TRUE TRUE TRUE TRUE
From Wikipedia: The shapefile format is a popular geospatial vector data format for geographic information system (GIS) software. It is developed and regulated by Esri as a (mostly) open specification for data interoperability among Esri and other GIS software products. In practice, is is actually several files, all of which have the same name and the following extentions:
.shp
- shape format; the feature geometry itself.shx
- shape index format; a positional index of the feature geometry to allow seeking forwards and backwards quickly.dbf
- attribute format; columnar attributes for each shape, in dBase IV format.prj
- projection format; the coordinate system and projection information, a plain text file describing the projection using well-known text format.(This list is not complete - see the Wikipedia page).
We downloaded the “Perimeters since 1940” file from here the AK Large Fire Database, then clipped the file to the study extent before the workshop (to minimize file size).
sp
packagefires.sp <- readOGR("./_data/fmch_fire_tws.shp")
## OGR data source with driver: ESRI Shapefile
## Source: "./_data/fmch_fire_tws.shp", layer: "fmch_fire_tws"
## with 200 features
## It has 51 fields
## Integer64 fields read as strings: FIREID StrThreat StrBurned
plot(fires.sp, main = "Fires around the Steese National Conservation Area", axes=TRUE, col=alpha(("red"),.5))
Note, fires.sp
is an exotic object called a SpatialPolygonsDataFrame
, which contains a lot of important spatial information. Features of this object include:
head(fires.sp@data)
## FireName FireYear CalcAcres PerimDate PerimTime
## 0 Jagged Ridge 2009 53889.4 2009/08/12 2009/08/12 12:00
## 1 Edwards Creek 2004 278116.0 2008/04/25 2008/04/25 13:00
## 2 Mile 53 Taylor Fire 1953 15228.9 1953/11/01 1953/11/01 14:00
## 3 Porcupine 2003 1195.1 2003/11/01 2003/11/01 16:00
## 4 BEAR CREEK 2011 401.4 2011/06/04 2011/06/04 17:00
## 5 Mississippi 2013 67424.7 2013/10/29 2013/10/29 14:00
## AgencyAcre Method Source SrcMethod SrcClass EditDate
## 0 0 <NA> GPS Helicopter <NA> 2009/08/12
## 1 0 <NA> <NA> <NA> <NA> 2008/04/25
## 2 0 <NA> <NA> <NA> <NA> 2007/11/20
## 3 0 <NA> Digitized Image <NA> 2007/12/03
## 4 0 <NA> GPS FixedWing B - Within 90 Meters 2011/06/05
## 5 0 <NA> GPS FixedWing <NA> 2013/12/10
## Comments
## 0 FOBs flew perimeter of fire closely. Acreage decreased roughly 500 acres.
## 1 Perimeter updated by NPS in 2006
## 2 Perimeter digitized using hand drawn fire map and 1976 landsat image to obtain estimated location fire.
## 3 Perimeter heads-up digitized from 2003 and 2004 Landsat imagery
## 4 Flown by Amy Scraba, digitized by rmmikol
## 5 Added Spot Fire created by wind event, Mike Reggear provided the spot fire perimeter. Edited by Gschmunk
## FIREID Final LATESTPERI Record_ FalseAlarm Complex
## 0 32702 No Yes 497 <NA> Crazy Mountain Complex
## 1 1719 No Yes 234 <NA> Eagle
## 2 27522 No Yes <NA> <NA> <NA>
## 3 1155 No Yes 283 <NA> <NA>
## 4 34986 No Yes 295 <NA> <NA>
## 5 36215 No Yes 117 <NA> <NA>
## AFSFire_ DOFFire_ USFSFire_ AddFire_ Prescribed EstCost EstAcres
## 0 E27V <NA> PDE27V <NA> <NA> 0 53889.4
## 1 A4WG <NA> PDA4WG <NA> <NA> 0 243900.0
## 2 1 <NA> <NA> <NA> <NA> 0 15280.0
## 3 <NA> 311283 <NA> <NA> <NA> 0 640.0
## 4 F4GC <NA> <NA> <NA> <NA> 0 314.0
## 5 HH8N 332117 PDHH8N <NA> <NA> 0 67338.0
## StrThreat StrBurned MgmID MgmOffId MgmOption Latitude Longitude
## 0 0 0 AFS UYD LIMITED 65.54806 -144.1172
## 1 0 0 AFS UYD LIMITED 65.51722 -143.1700
## 2 0 0 <NA> CHICKEN <NA> 64.00000 -142.0000
## 3 0 0 DOF FAS LIMITED 64.71667 -144.7000
## 4 0 0 AFS UYD LIMITED 65.11028 -145.3222
## 5 0 0 AFS MID LIMITED 63.89305 -145.9931
## MapName NearestWX OriginOwn OriginUnit DiscDate DiscSize
## 0 Circle C-1 <NA> STA L&W 2009/07/26 0.5
## 1 Charley River (C-5) <NA> BLM NOD 2004/06/14 3289.0
## 2 <NA> <NA> <NA> <NA> 1953/05/31 0.0
## 3 <NA> <NA> STA L&W 2003/06/14 0.0
## 4 CIRCLE A-3 <NA> BLM NOD 2011/05/31 50.0
## 5 MT HAYES D-4 <NA> MIL Army 2013/05/30 1.0
## IADate IASize CntrlDate OutDate InitBehave PrimFuel
## 0 <NA> 0 <NA> 2009/09/28 <NA> Mixed Trees
## 1 2004/06/16 3289 2004/10/19 2004/10/19 <NA> <NA>
## 2 <NA> 0 <NA> 1953/06/27 <NA> <NA>
## 3 <NA> 0 <NA> <NA> <NA> <NA>
## 4 <NA> 0 2011/06/27 2011/06/27 Creeping <NA>
## 5 <NA> 0 <NA> 2013/12/31 Creeping Black Spruce
## GenCause SpecCause Slope Aspect Elevation AREA LEN
## 0 Lightning Lightning 0-25 North 0501-1500 0.042239175 1.8514115
## 1 Lightning Lightning <NA> <NA> <NA> 0.218502973 9.1513134
## 2 <NA> Smokers <NA> <NA> <NA> 0.011287354 0.5943585
## 3 Lightning <NA> <NA> <NA> <NA> 0.000910141 0.1842077
## 4 Lightning Lightning <NA> <NA> <NA> 0.000310019 0.1124273
## 5 Human Incendiary 0-25 East 1501-2500 0.050106743 2.0510230
fires.sp@proj4string
## CRS arguments:
## +proj=longlat +datum=NAD83 +no_defs +ellps=GRS80 +towgs84=0,0,0
fires.sp@bbox
## min max
## x -146.00306 -141.99694
## y 63.99694 66.00306
Accessing all of these data easily can be challenging, and plotting SpatialPolygonsDataFrame
objects outside of sp
is a bit of a mess.
sf
packageInstead of doing this, we’ll use a new package called sf
, short for “simple features”. Rather than a complex structure with slots, spatial objects in sf
are stored as data frames, with the feature geometries stored in list-columns. It’s WAY simpler, and compatible with all packages in the tidyverse
, such as dplyr, ggplot2, and many others.
fires <- st_read("./_data/fmch_fire_tws.shp")
## Reading layer `fmch_fire_tws' from data source `C:\Users\elie\Box Sync\teaching\AlaskaWildlifeSociety\coursewebsite\_data\fmch_fire_tws.shp' using driver `ESRI Shapefile'
## Simple feature collection with 200 features and 51 fields
## geometry type: MULTIPOLYGON
## dimension: XYZ
## bbox: xmin: -146.0031 ymin: 63.99694 xmax: -141.9969 ymax: 66.00306
## epsg (SRID): 4269
## proj4string: +proj=longlat +datum=NAD83 +no_defs
You can see that it gives you a nice printout of summary information here, including the projection.
class(fires)
## [1] "sf" "data.frame"
It functions as both an sf
object and a data.frame
. Handy!
If we want to make a quick ggmap like we did in the “Data Visualization” tutorial, we can do so. The unname
function just removes the names of the bounding box, because the sf
bounding box is a bit different than the sp
one we used earlier.
mymap <- get_map(location = unname(st_bbox(fires)), maptype="hybrid")
ggmap(mymap) +
geom_sf(data = fires, inherit.aes = FALSE, fill = "red", alpha = 0.5)
Some of the polygons overlap, meaning that certain areas have burned more than once since 1940. We’ll have to consider that when we use this layer for analyses.
Here, we load a shape file of roads.
roads <- st_read("./_data/fmch_roads_tws.shp")
str(roads)
## Classes 'sf' and 'data.frame': 897 obs. of 12 variables:
## $ osm_id : Factor w/ 895 levels "107075134","107075146",..: 723 392 487 731 48 100 210 683 267 53 ...
## $ bridge : Factor w/ 1 level "yes": NA NA NA NA NA NA NA NA NA NA ...
## $ highway : Factor w/ 8 levels "footway","path",..: 4 6 6 4 4 6 4 4 6 4 ...
## $ layer : Factor w/ 1 level "1": NA NA NA NA NA NA NA NA NA NA ...
## $ name : Factor w/ 185 levels "49 Alaska Pipeline Road",..: 10 NA NA 166 NA NA NA NA NA NA ...
## $ oneway : Factor w/ 2 levels "no","yes": NA NA NA NA NA NA NA NA NA NA ...
## $ smoothness: Factor w/ 0 levels: NA NA NA NA NA NA NA NA NA NA ...
## $ surface : Factor w/ 6 levels "asphalt","dirt",..: NA NA NA 1 NA 6 NA NA NA NA ...
## $ tunnel : Factor w/ 0 levels: NA NA NA NA NA NA NA NA NA NA ...
## $ width : Factor w/ 0 levels: NA NA NA NA NA NA NA NA NA NA ...
## $ z_index : Factor w/ 5 levels "13","23","27",..: 4 4 4 4 4 4 4 4 4 4 ...
## $ geometry :sfc_MULTILINESTRING of length 897; first list element: List of 1
## ..$ : num [1:5, 1:3] -146 -146 -146 -146 -146 ...
## ..- attr(*, "class")= chr "XYZ" "MULTILINESTRING" "sfg"
## - attr(*, "sf_column")= chr "geometry"
## - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
## ..- attr(*, "names")= chr "osm_id" "bridge" "highway" "layer" ...
You can see the clean structure of an sf
object here.
Now we can load our caribou data (in data.frame
format) on top to see how their locations are distributed in relation to the fires and roads:
load("./_data/caribou.df.rda")
ggmap(mymap) +
geom_sf(data = fires, inherit.aes = FALSE, fill = "red", alpha = 0.5) +
geom_point(data = caribou.df, aes(x = lon, y = lat, col = id)) +
geom_sf(data = roads, inherit.aes = FALSE, col = "white")
Rasters are very efficient spatial matrices. They break up space into equal sized grids and store attributes at each grid point. One common way they are stored is as geoTiff
objects. We downloaded this DEM from the USGS National Elevation Dataset page here. Load rasters into R with the raster
command:
elevation <- raster("./_data/fmch_elev_tws.tif")
plot(elevation)
Our raster has a cell size of 100 x 100 m. In the plot, We can clearly see the mountains and drainages, especially the Yukon River to the north.
We can compare the projections of this elevation raster with the fires, roads, and caribou:
load("./_data/caribou.utm.rda")
projection(caribou.utm)
## [1] "+proj=utm +zone=6 +ellps=WGS84 +datum=WGS84 +units=m +no_defs +towgs84=0,0,0"
projection(elevation)
## [1] "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
st_crs(fires)
## Coordinate Reference System:
## EPSG: 4269
## proj4string: "+proj=longlat +datum=NAD83 +no_defs"
st_crs(roads)
## Coordinate Reference System:
## EPSG: 4326
## proj4string: "+proj=longlat +datum=WGS84 +no_defs"
They are all different! But we can easily reproject everything to the caribou.utm
CRS, which is UTM Zone 6, and visualize everything at once.
fires.utm <- st_transform(fires, projection(caribou.utm))
roads.utm <- st_transform(roads, projection(caribou.utm))
Reproject our elevation raster to UTM Zone 6 with a 100 m resolution. Without specifying the resolution, our the new cells may not be square because we’re reprojecting from decimal degrees to UTM (Warning: potentially very slow step!).
elevation.utm <- projectRaster(elevation, crs = "+proj=utm +zone=6 +ellps=WGS84 +datum=WGS84 +units=m +no_defs", res=100)
res(elevation.utm)
## [1] 100 100
And then plot it all!
plot(elevation.utm, col = grey.colors(100, start = 0))
plot(fires.utm, col = alpha(("red"), 0.5), add = TRUE)
plot(roads.utm, col = "black", add = TRUE, lwd=2)
points(caribou.utm, pch = 19, cex = 0.5, add= TRUE, col=c("red","yellow","darkgreen","blue"))
This is certainly not a work of art (I beg to differ! - EG), but it allows us to see everything together on one map.
We are interested in the time since last fire. In the fires
dataset, there’s are fields for DiscDate
and OutDate
. Since lots of the fires burned for months, we are more interested in the OutDate
field. The first thing to do is convert these values to dates, because they’re currently factors. We can use lubridate here.
fires.utm$startDate <- lubridate::ymd(fires.utm$DiscDate)
fires.utm$endDate <- lubridate::ymd(fires.utm$OutDate)
sum(is.na(fires.utm$endDate))
## [1] 9
Unfortunately, 9 of the 200 fires have ‘NA’ values for “outdate”. For those records, we will just have to use the discovery date for the fire because we don’t actually know when it went out.
This command, using the pmax
function, will remove NAs and give us what we want.
fires.utm$fireDate <- pmax(fires.utm$startDate, fires.utm$endDate, na.rm = TRUE)
Because all of our locations are from August and September 2017, we can find the years since fire by calculating the time difference, in years, between August 2017 and the fire date.
fires.utm$tsf <- as.numeric(difftime(lubridate::ymd("2017-08-10"), fires.utm$fireDate)/365.25)
Unfortunately, the sf
package doesn’t play nicely with rasters yet, so we’ll have to convert those to sp
objects before rasterizing. One required (and annoying) step is to drop the “z” dimension field that the sf
package uses but sp
doesn’t understand.
roads.utm <- st_zm(roads.utm)
fires.utm <- st_zm(fires.utm)
roads.sp <- as(roads.utm, "Spatial")
fires.sp <- as(fires.utm, "Spatial")
Create an empty raster and set extent to projection and resolution to that of the elevation.utm
layer.
empty.fire.raster <- raster()
extent(empty.fire.raster) <- extent(elevation.utm)
projection(empty.fire.raster) <- projection(elevation.utm)
res(empty.fire.raster) <- 100
As we mentioned earlier, some of the fires overlap. We want each cell to have the minimum, or most recent fire, so we’ll specify fun=min
here within the rasterize
function. This might take some time:
fire.raster <- rasterize(fires.sp, empty.fire.raster, field="tsf", fun=min)
For this exercise, we’re just going to assign areas that haven’t burned since 1940 (which is the beginning of the fire dataset) to have a time since fire of 100 years, because we don’t want to leave them as NA. So we’ll assign the NAs in the raster to 100.
fire.raster[is.na(fire.raster[])] <- 100
plot(fire.raster)
Now we just need to rasterize the roads.sp
layer and create a raster of distance to roads. We can create another empty raster and fill it with NAs, which the distance
function prefers. The last line below takes a while.
empty.roads.raster <- raster()
extent(empty.roads.raster) <- extent(elevation.utm)
projection(empty.roads.raster) <- projection(elevation.utm)
res(empty.roads.raster) <- 100
empty.roads.raster[] <- NA
roads.raster <- rasterize(roads.sp, empty.roads.raster, 1)
This step takes far too long, so please don’t run it. We have a file already made for you.
dist.roads <- distance(roads.raster)
save(dist.roads, file= "./_data/dist.roads.rda")
Here’s the file:
load("./_data/dist.roads.rda")
plot(dist.roads)
For some analyses, such as resource selection functions, it’s helpful to have a RasterStack
- which is a bunch of rasters (of the same “extent”) stacked on top of each other. Since all our rasters now have the same extent and projection, this is easy to do. We’ll rename the raster layers while we’re at it.
fmch.habitat <- stack(elevation.utm, fire.raster, dist.roads)
names(fmch.habitat) <- c("elevation", "time.since.fire", "dist.roads")
plot(fmch.habitat, main=c("Elevation (m)", "Time since fire (years)", "Distance to road (m)"))
For efficiency (and to save the need to replicate a lot of code), you can save all of the correctly transformed habitat covariates in a single file which will be easy to open as needed.
save(fmch.habitat, file = "./_data/FMCH.habitat.rda")