SlideShare a Scribd company logo
R package gdistance: distances and routes on
geographical grids (version 1.1-2)
Jacob van Etten
September 28, 2011
1 Introduction
This vignette describes gdistance, an R package which provides func-
tionality to calculate various distance measures and routes in heterogeneous
geographic spaces represented as grids. Distances are fundamental to geospa-
tial analysis (Tobler 1970). The most commonly used geographic distance
measure is the great-circle distance, which represents the shortest line be-
tween two points, taking into account the curvature of the earth. However,
the great-circle distance does not correspond very well to expected travel
time/effort between two points. Travel time and the real distance travelled
depend on the means of transport, the mode of route-finding, and the char-
acteristics of landscapes and infrastructure. The great-circle distance could
be considered as referring to a special case: goal-directed movement with
no obstacles, ‘as the crow flies’. Other distance measures are needed when
travel is not (or less) goal-directed and landscape characteristics affect move-
ment in a spatially heterogeneous way. Package gdistance was created to
calculate distances and determine routes using geographical grids (rasters)
to represent landscape heterogeneity. It provides the following distance and
route calculations.
• The least-cost distance mimics route finding ‘as the fox runs’, taking
into account obstacles and the local ‘friction’ of the landscape.
• A second type of route-finding is the random walk, which has no prede-
termined destination (‘drunkard’s walk’). Resistance distance reflects
the travel time from origin to goal of the average (Brownian) random
walk (McRae 2006).
1
2 gdistance: geographic distances and routes
• ‘Randomised shortest paths’ are an intermediate form between shortest
paths and Brownian random walks, recently introduced by Saerens et
al. (2009).
The functionality of gdistance corresponds to other software like ArcGIS
Spatial Analyst, GRASS GIS (r.cost, r.walk functions), and CircuitScape
(random walk / resistance distance). The gdistance package also contains
specific functionality for geographical genetic analyses. The package imple-
ments measures to model dispersal histories first presented by Van Etten and
Hijmans (2010). Section 9 below introduces with an example how gdistance
can be used in geographical genetics.
Package gdistance uses functionality from a number of other R packages.
The most important among these packages is raster. To use gdistance and
to understand the details of this vignette, the reader has to be familiar with
the basic functionality of raster.
2 Transition* classes
To make distance calculations as flexible as possible, distances and other
measures are calculated in various steps. The central classes in gdistance
are the S4 classes TransitionLayer and TransitionStack. Most operations
have an object of one of these classes either as input or output.
Transition* objects can be constructed from an object of class RasterLayer,
RasterStack or RasterBrick. These classes are from raster, a memory-
efficient and user-friendly R package which contains complete geographical
grid functionality. The class Transition* takes the necessary geographic ref-
erences (projection, resolution, extent) from the original Raster* object. It
also contains a matrix which specifies the probability of movements between
cells or, in more general terms, the ‘conductance’ of inter-cell connections.
Each row and column in the matrix represents a cell in the original Raster*
object. Row/column 1 in the transition matrix corresponds to cell 1 in the
original raster, and so on. Cell numbers in rasters go from left to right and
from top to bottom. For instance, a 3 x 3 raster would have the following
cell numbers:
1 2 3
4 5 6
7 8 9
This raster would produce a 9 x 9 transition matrix with rows/columns num-
bered from 1 to 9.
Using conductance values may be a bit confusing at first. Other soft-
ware generally uses friction surfaces to calculate distances. Also, distances
Jacob van Etten 3
are calculated in terms of accumulated friction or cost. 1
However, the re-
lation between conductance and friction is straightforward: conductance is
the reciprocal of friction (1/friction). It is not strange to use the word ‘con-
ductance’ in this context (or to use resistance as a synonym for friction).
There is an analogy between random walks on geographical grids and elec-
trical current in a mesh of resistors (McRae et al. 2008). Calculations of
‘resistance distance’ (see below) take advantage of this analogy. Another ad-
vantage of using conductance is that it makes it possible to store the values
very efficiently as a so-called sparse matrix. Sparse matrices only record the
non-zero values and information about their location in the matrix. In most
cases, cells are connected only with adjacent cells. Consequently, a conduc-
tance matrix contains only a small fraction of non-zero values, which occupy
little memory in a sparse matrix format. The package gdistance makes use
of sparse matrix classes and methods from the package Matrix, which gives
access to fast procedures implemented in the C language.
A first step in any analysis with gdistance is the construction of an object
of the class Transition*. The construction of a Transition* object from
a Raster* object is straightforward. We can define an arbitrary function
to calculate the conductance values from the values of each pair of cells to
be connected. Here, we create a raster with 10 degree cells and set it cells
to random values between 0.4 and 0.6. We then create a TransitionLayer
object. The transition value between each pair of cells is the mean of the two
cell values.
> library(gdistance)
raster version 1.9-19 (22-September-2011)
> r <- raster(nrows = 18, ncols = 36)
> r <- setValues(r, runif(ncell(r), min = 0.4, max = 0.6))
> r
class : RasterLayer
dimensions : 18, 36, 648 (nrow, ncol, ncell)
resolution : 10, 10 (x, y)
extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84
values : in memory
min value : 0.4000163
max value : 0.5995257
1
‘Permeability’ is a synonym of conductance. Impedance, resistance, cost and friction
are used interchangeably to denote the contrary.
4 gdistance: geographic distances and routes
> tr1 <- transition(r, transitionFunction = mean,
+ directions = 8)
We set the directions argument to value 8. This connects all adjacent
cells in 8 directions. Cells can also be connected in 4 or 16 connections.
In chess terms, setting directions to 4 connects all cells with all possible
one-cell rook movements (producing ‘Manhattan’ distances), while setting
directions to 8 connects with one-cell queen movements. With 16 directions,
all cells are connected with both one-cell queen movements and one-turn
knight movements. This can make distance calculations more accurate.2
If we inspect the object we created, we see that the resulting TransitionLayer
object keeps much information from the original RasterLayer object.
> tr1
class : TransitionLayer
dimensions : 18, 36, 648 (nrow, ncol, ncell)
resolution : 10, 10 (x, y)
extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84
values : conductance
matrix class: dsCMatrix
It is also possible to create asymmetric matrices, in which the conductance
from i to j is not always the same as the conductance from j back to i.
This is relevant, among other things, for modelling travel in hilly terrain,
as shown in Example 1 below. On the same slope, a downslope traveler
experiences less resistance than an upslope traveler. In this case, the function
to calculate conductance values is non-commutative: f(i, j) = f(j, i). The
symm argument in transition needs to be set to FALSE.
> ncf <- function(x) max(x) - x[1] + x[2]
> tr2 <- transition(r, ncf, 4, symm = FALSE)
> class(transitionMatrix(tr1))
[1] "dsCMatrix"
attr(,"package")
[1] "Matrix"
2
Connecting in 16 directions was inspired by the function r.cost in GRASS 6, and the
documentation of this function illustrates nicely why connecting in 16 directions can in-
crease the accuracy of the calculations http://grass.itc.it/grass64/manuals/html64_
user/r.cost.html. Also, see the section on distance transforms in de Smith et al. (2009).
Jacob van Etten 5
> class(transitionMatrix(tr2))
[1] "dgCMatrix"
attr(,"package")
[1] "Matrix"
The sparse matrix class dsCMatrix is symmetric and contains only half of
the matrix. The class dgCMatrix can hold an asymmetric matrix. Different
mathematical operations can be done with Transition* objects. This makes
it possible to flexibly model different components of landscape friction.
> tr3 <- tr1 * tr2
> tr3 <- tr1 + tr2
> tr3 <- tr1 * 3
> tr3 <- sqrt(tr1)
Operations with more than one object require that the different objects have
the same resolution and extent.
Also, it is possible to extract and replace values in the matrix using in-
dices.
> tr3[cbind(1:9, 1:9)] <- tr2[cbind(1:9, 1:9)]
> tr3[1:9, 1:9] <- tr2[1:9, 1:9]
> tr3[1:5, 1:5]
5 x 5 sparse Matrix of class "dgCMatrix"
[1,] . 0.5833925 . . .
[2,] 0.478976 . 0.6168308 . .
[3,] . 0.5311842 . 0.4703616 .
[4,] . . 0.6776534 . 0.5776266
[5,] . . . 0.4703616 .
Some functions require that Transition* objects do not contain any
isolated ‘clumps’. This can be avoided when creating Transition* objects,
for instance by giving conductance values between all adjacent cells some
minimum value. Also, it can be checked visually. Here are a few ways
to visualize a Transition* object. You can extract the transition matrix
with function transitionMatrix. This gives a sparse matrix which can be
vizualized with function image. This shows the rows and columns of the
transition matrix and indicates which has a non-zero value (“connection”) as
a black dot (Figure 1).
6 gdistance: geographic distances and routes
> image(transitionMatrix(tr1))
Dimensions: 648 x 648
Column
Row
100
200
300
400
500
600
100 200 300 400 500 600
Figure 1: Visualizing a TransitionLayer with function image()
In Figure 1, the central diagonal line represents cell connections between
raster cells in the same row. The two adjacent lines represent cells in the
same column in the raster, separated by the length of the row in the raster.
The curious dots around these lines have to do with the special character
of the original raster. Since this raster covers the whole world, the outer
meridians touch each other. The software takes this into account and as a
result the cells in the extreme left column are connected to the extreme right
column! The diagonal connections between the extreme columns explain the
isolated dots in Figure 1.
Figure 1 shows which cells contain non-zero values, but gives no further
information about levels of conductance. However, we can transform the
transition matrix back in to a raster to visualize this. To summarize the
information in transition matrix, we can take means or sums across rows
or columns, for instance. You can do this with function raster. Applied
to a TransitionLayer, this function converts it to a RasterLayer. For
the different options see method?raster("TransitionLayer"). The default,
shown in Figure 2, takes the column-wise means of the non-zero values. All
these forms of transformation unavoidably cause information loss, of course.
Jacob van Etten 7
> plot(raster(tr1))
−150 −100 −50 0 50 100 150
−50050
0.44
0.46
0.48
0.50
0.52
0.54
0.56
Figure 2: Visualizing a TransitionLayer using the function raster()
3 Correcting inter-cell conductance values
The function transition calculates transition values based on the val-
ues of adjacent cells in the input raster. However, the centres of diagonally
connected cells are more remote from each other than in the case of orthogo-
nally connected cells. Secondly, on equirectangular (lonlat) projection grids,
W-E connections are longer at the equator and become shorter towards the
poles. Therefore, the values in the matrix need to be corrected for these two
types of distortion. Both types of distortion can be corrected by dividing
each conductance matrix value between the inter-cell distance. This is what
function geoCorrection does for us.
> tr1CorrC <- geoCorrection(tr1, type = "c", multpl = FALSE)
> tr2CorrC <- geoCorrection(tr2, type = "c", multpl = FALSE)
For random walks on longlat grids, there is an additional consideration to
be made. The number of connections in N-S direction remains equal when
moving from the equator to the poles. This is problematic, because random
walks can be seen as analogous to electrical current through a networks of
resistors. The inter-cell connections should be thought of as parallel resistors.
Moving away from the equator, the inter-meridian space each individual re-
sistor bridges becomes narrower, tending to zero at the poles. Therefore, the
8 gdistance: geographic distances and routes
N-S resistance between parallels should decrease when moving away from the
equator. The function geoCorrection corrects this distortion by multiplying
the N-S transition values with the cosine of the average latitude of the cell
centres. This is done with function geoCorrection, by setting the argument
type to “r”,
> tr1CorrR <- geoCorrection(tr1, type = "r", multpl = FALSE)
When similar Transition* objects with equal resolution and extent need
to be corrected repetitively, computational effort may be reduced by prepar-
ing an object that only needs to be multiplied with the Transition* object to
produce a corrected version of it. The following is equivalent to the previous
procedure.
> tr1CorrMatrix <- geoCorrection(tr1, type = "r",
+ multpl = TRUE)
> tr1CorrR <- tr1 * tr1CorrMatrix
Object trCorr1Matrix is only calculated once. It can be multiplied with
Transition* objects, as long as they have the same extent, resolution, and
directions of cell connections. We need to take special care that the geo-
correction multiplication matrix (tr1CorrMatrix) contains all non-zero val-
ues that are present in the Transition* object with which it will be multi-
plied (tr1).3
4 Calculating distances
With the corrected Transition* object we can calculate distances be-
tween points. It is important to note that all distance functions require a
Transition* object with conductance values, even though distances will be
expressed in 1/conductance (friction or resistance) units.
To calculate distances, we need to have the coordinates of point locations.
This is done by creating a two-row matrix of coordinates. Functions will also
accept a SpatialPoints object or, if there is only one point, a vector of
length two.
> sP <- cbind(c(65, 5, -65), c(55, 35, -35))
Calculating a distance matrix is straightforward now.
3
A good alternative is to use geoCorrection(mulpl=FALSE) with a Transition* object
with cells connected with value 1.
Jacob van Etten 9
> costDistance(tr1CorrC, sP)
1 2
2 9805972
3 31745174 22335047
> costDistance(tr2CorrC, sP)
[,1] [,2] [,3]
[1,] 0 11415775 34951321
[2,] 10816497 0 26753977
[3,] 33813271 26327784 0
> resistanceDistance(tr1CorrR, sP)
1 2
2 3012.742
3 3599.843 3317.169
5 Dispersal paths
To determine dispersal paths with a random element, we use the function
passage. This function can be used for both random walks and randomised
shortest paths. The function calculates the number of passages through cells
before arriving in the destination cell. Either the total or net number of
passages can be calculated. The net number of passages is the number of
passages that are not reciprocated by a passage in the opposite direction.
Figure 3 shows the probability of passage through each cell, assuming
randomised shortest paths with the parameter theta set to 2. Here we see
again how the package takes into account that we are dealing with a raster
that covers the whole world. The route crosses the 180 degrees meridian
without any problem.
10 gdistance: geographic distances and routes
> origin <- SpatialPoints(cbind(105, -55))
> goal <- SpatialPoints(cbind(-105, 55))
> rSPraster <- passage(tr1, origin, goal, theta = 2)
> plot(rSPraster)
−150 −100 −50 0 50 100 150
−50050
0.0
0.2
0.4
0.6
0.8
1.0
Figure 3: Probability of passage
6 Path overlap and non-overlap
One of the specific uses, for which package gdistance was created, is to
look at trajectories coming from the same source (van Etten and Hijmans
2010).
The degree of coincidence of two trajectories can be visualized by multi-
plying the probabilities of passage.
> r1 <- passage(tr1, origin, sP[1, ], theta = 2)
> r2 <- passage(tr1, origin, sP[2, ], theta = 2)
> rJoint <- r1 * r2
Jacob van Etten 11
> plot(rJoint)
−150 −100 −50 0 50 100 150
−50050
0.0
0.2
0.4
0.6
0.8
1.0
Figure 4: Overlapping part of the two routes
> rDiv <- max(max(r1, r2) * (1 - min(r1, r2)) -
+ min(r1, r2), 0)
With the function pathInc() we can calculate measures of path overlap
and non-overlap for a large number of points. These measures can be used
to predict patterns of diversity if these are due to dispersal from a single
common source (van Etten and Hijmans 2010). If the argument type contains
two elements (divergent and joint), the result is a list of distances matrices.
> pathInc(tr1CorrC, sP[1, ], sP[2:3, ], type = c("divergent",
+ "joint"))
$divergent
1
2 36010415
$joint
1
2 583857.4
12 gdistance: geographic distances and routes
> plot(rDiv)
−150 −100 −50 0 50 100 150
−50050
0.0
0.2
0.4
0.6
0.8
1.0
Figure 5: Non-overlapping part of the two routes
7 Example 1: Hiking around Maunga Whau
The previous examples were somewhat theoretical, based on randomly
generated values. More realistic examples serve to illustrate the various uses
that can be given to this package.
Determining the fastest route between two points in complex terrain is
useful for hikers. Tobler’s Hiking Function provides a rough estimate for the
the maximum hiking speed given the slope of the terrain (Tobler 1993). The
maximum speed of off-path hiking (in m/s) is:
speed = exp(-3.5 * abs(slope + 0.05))
Note that the function is not symmetric around 0 (see Figure 6).
Jacob van Etten 13
−1.0 −0.5 0.0 0.5 1.0
0.00.20.40.60.81.0
slope
speed(m/s)
Figure 6: Tobler’s Hiking Function
We use the Hiking Function to determine the shortest path to hike around
the volcano Maunga Whau (Auckland, New Zealand). First, we read in the
altitude data for the volcano. This is a geo-referenced version of the“volcano”
data available in Base R datasets (see ?volcano for more information).
> r <- raster(system.file("external/maungawhau.grd",
+ package = "gdistance"))
The Hiking Function requires the slope as input.
slope = difference in height / distance travelled
The units of height and distance should be identical. Here, we use meters
for both. We calculate the height differences between cells first. Then we use
the function geoCorrection() to divide by the distance between cells.
> heightDiff <- function(x){x[2] - x[1]}
> hd <- transition(r,heightDiff,8,symm=FALSE)
> slope <- geoCorrection(hd, scl=FALSE)
Subsequently, we calculate the speed. We need to exercise special care, be-
cause the matrix values between non-adjacent cells is 0, but the slope between
14 gdistance: geographic distances and routes
these cells is not 0! Therefore, we need to restrict the calculation to adjacent
cells. We do this by creating an index for adjacent cells (adj) with the func-
tion adjacency(). Using this index, we extract and replace adjacent cells,
without touching the other values.
> adj <- adjacency(x = r, fromCells = 1:ncell(r),
+ toCells = 1:ncell(r), directions = 8)
> speed <- slope
> speed[adj] <- exp(-3.5 * abs(slope[adj] + 0.05))
Now we have calculated the speed of movement between adjacent cells. We
are close to having the final conductance values. Attainable speed is a mea-
sure of the ease of crossing from one cell to another on the grid. However, we
also need to take into account the distance between cell centres. Travelling
with the same speed, a diagonal connection between cells takes longer to cross
than a straight connection. Therefore, we use the function geoCorrection()
again!
> x <- geoCorrection(speed, scl = FALSE)
This gives our final ”conductance” values.
What do these ”conductance”values mean? The function geoCorrection()
divides the values in the matrix with the distance between cell centres. So,
with our last command we calculated this:
conductance = speed / distance
This looks a lot like a measure that we are more familiar with:
travel time = distance / speed
In fact, the conductance values we have calculated are the reciprocal of travel
time.
1 / travel time = speed / distance = conductance
Maximizing the reciprocal of travel time is exactly equivalent to minimizing
travel time!
Now we define two coordinates, A and B, and determine the paths be-
tween them. We test if the quickest path from A to B is the same as the
quickest path from B back to A.
Jacob van Etten 15
2667400 2667600 2667800 2668000
6478800647900064792006479400
100
120
140
160
180
A
B
Figure 7: Quickest hiking routes around Maunga Whau
> A <- c(2667670, 6479000)
> B <- c(2667800, 6479400)
> AtoB <- shortestPath(x, A, B, output = "SpatialLines")
> BtoA <- shortestPath(x, B, A, output = "SpatialLines")
> plot(r)
> lines(AtoB, col = "red", lwd = 2)
> lines(BtoA, col = "blue")
> text(A[1] - 10, A[2] - 10, "A")
> text(B[1] + 10, B[2] + 10, "B")
A small part of the A-B (red) and B-A (blue) lines in the figure do not
overlap. This is a consequence of the asymmetry of the Hiking Function.
16 gdistance: geographic distances and routes
8 Example 2: Geographical genetics
The direct relation between genetic and geographic distances is known as
isolation by distance (Wright 1943). Recent work has expanded this relation-
ship to random movement in heterogeneous landscapes (McRae 2006). Also,
the geography of dispersal routes can explain observed geospatial patterns
of genetic diversity. For instance, diffusion from a single origin (Africa) ex-
plains much of the current geographical patterns of human genetic diversity
(Ramachandran 2005). As a result, the mutual genetic distance between a
pair of humans from different parts from the globe depends on the extent
they share their prehistoric migration history.
Within a single continent, however, human genetic diversity may have to
do with more recent events. Let’s look at diversity in Europe, using the data
presented by Balaresque et al. (2010). Within Europe, genetic diversity is
often thought to be a result of the migration of early Neolithic farmers from
Anatolia (Turkey) to the west.
First we read in the data, including the coordinates of the populations
(Figure 8) and mutual genetic distances.
> Europe <- raster(system.file("external/Europe.grd",
+ package = "gdistance"))
> Europe[is.na(Europe)] <- 0
> data(genDist)
> data(popCoord)
> pC <- as.matrix(popCoord[c("x", "y")])
Then we create three geographical distance matrices. The first corre-
sponds to the great-circle distance between populations. The second is the
least-cost distance between locations. Travel is restricted to the land mass.
The third is the resistance distance (using the same conductance matrix),
which is related to the random-walk travel time between points (McRae
2006).
> geoDist <- pointDistance(pC, longlat = TRUE)
> geoDist <- as.dist(geoDist)
> Europe <- aggregate(Europe, 3)
> tr <- transition(Europe, mean, directions = 8)
> tr <- geoCorrection(tr, "c")
> tR <- geoCorrection(tr, "r")
> cosDist <- costDistance(tr, pC)
> resDist <- resistanceDistance(tR, pC)
> cor(genDist, geoDist)
Jacob van Etten 17
−20 −10 0 10 20 30 40
3040506070
0.0
0.2
0.4
0.6
0.8
1.0
DK
EN1
EN2
FR1
FR2
FR3
FR4
FR7
GE
GE1
GR
IR
IT1
IT2
NL
SB
SP2
SP4
SP5
SP6
TK1 TK2TK3
Figure 8: Map of genotyped populations
18 gdistance: geographic distances and routes
[1] 0.5962655
> cor(genDist, cosDist)
[1] 0.5889319
> cor(genDist, resDist)
[1] -0.05532471
Interestingly, the great-circle distance between points turns out to be the
best predictor of genetic distance. The other distance measures incorporate
more information about the geographic space in which geneflow takes place,
but do not improve the prediction. But how well does a wave of expansion
from Anatolia explain the spatial pattern?
> origin <- unlist(popCoord[22, c("x", "y")])
> pI <- pathInc(tr, origin = origin, fromCoords = pC,
+ type = "joint", theta = 2)
> cor(genDist, pI)
[1] NA
At least at first sight, the overlap of dispersal routes explain the spatial pat-
tern better than any of the previous measures. The negative sign of the
last correlation coefficient was expected, as more overlap in routes is associ-
ated with lower genetic distance. While additional work would be needed to
improve predictions and compare the different models more rigorously, the
promise of dispersal modelling with gdistance is clear.
9 Final remarks
Questions about the use of gdistance can be posted on the r-sig-geo email
list. Bug reports and requests for additional functionality can be mailed to
jacobvanetten@yahoo.com.
10 References
Balaresque P., et al. 2010. A predominantly Neolithic origin for European
paternal lineages. PLoS Biology 8(1): e1000285.
Jacob van Etten 19
de Smith, M.J., M.F. Goodchild, and P.A. Longley. 2009. Geospatial Anal-
ysis. Matador. 3rd edition.
McRae B.H. 2006. Isolation by resistance. Evolution 60: 1551–1561.
McRae B.H., B.G. Dickson, and T. Keitt. 2008. Using circuit theory to
model connectivity in ecology, evolution, and conservation. Ecology
89:2712-2724.
Ramachandran S., et al. 2005. Support from the relationship of genetic and
geographic distance in human populations for a serial founder effect
originating in Africa. PNAS 102: 15942–15947.
Saerens M., L. Yen, F. Fouss, and Y. Achbany. 2009. Randomized shortest-
path problems: two related models. Neural Computation, 21(8):2363-
2404.
Tobler W. 1970. A computer movie simulating urban growth in the Detroit
region. Economic Geography, 46(2): 234-240.
Tobler W. 1993. Three Presentations on Geographical Analysis and Mod-
eling. http://www.ncgia.ucsb.edu/Publications/Tech_Reports/
93/93-1.PDF
van Etten, J., and R.J. Hijmans. 2010. A geospatial modelling approach in-
tegrating archaeobotany and genetics to trace the origin and dispersal
of domesticated plants. PLoS ONE 5(8): e12060.
Wright, S. 1943. Isolation by distance. Genetics 28: 114–138.

More Related Content

PDF
Comparative Report Ed098
PDF
A High Order Continuation Based On Time Power Series Expansion And Time Ratio...
PDF
Recurrent and Recursive Networks (Part 1)
PDF
Fractal Analyzer
PDF
Recurrent and Recursive Nets (part 2)
PDF
A Novel Space-time Discontinuous Galerkin Method for Solving of One-dimension...
PDF
Non-life claims reserves using Dirichlet random environment
PDF
The Chimera Grid Concept and Application
Comparative Report Ed098
A High Order Continuation Based On Time Power Series Expansion And Time Ratio...
Recurrent and Recursive Networks (Part 1)
Fractal Analyzer
Recurrent and Recursive Nets (part 2)
A Novel Space-time Discontinuous Galerkin Method for Solving of One-dimension...
Non-life claims reserves using Dirichlet random environment
The Chimera Grid Concept and Application

What's hot (20)

PDF
Performance evaluation of ds cdma
PDF
Jgrass-NewAge: Kriging component
PDF
first research paper
PDF
A comparison of efficient algorithms for scheduling parallel data redistribution
PDF
Dce a novel delay correlation
PDF
A novel architecture of rns based
PDF
Ee35741744
PDF
07 mesh
PPT
Fluent and Gambit Workshop
PPTX
Clique and sting
PDF
On the solution of incompressible fluid flow equations
PDF
11.on the solution of incompressible fluid flow equations
PDF
MARGINAL PERCEPTRON FOR NON-LINEAR AND MULTI CLASS CLASSIFICATION
PDF
PPTX
Tensor Spectral Clustering
PDF
Novel Logic Circuits Dynamic Parameters Analysis
PPTX
Clique
PDF
Dynamics
PDF
2005 pinto santiago_marchi_cobem_2005
PPT
2D CFD Code Based on MATLAB- As Good As FLUENT!
Performance evaluation of ds cdma
Jgrass-NewAge: Kriging component
first research paper
A comparison of efficient algorithms for scheduling parallel data redistribution
Dce a novel delay correlation
A novel architecture of rns based
Ee35741744
07 mesh
Fluent and Gambit Workshop
Clique and sting
On the solution of incompressible fluid flow equations
11.on the solution of incompressible fluid flow equations
MARGINAL PERCEPTRON FOR NON-LINEAR AND MULTI CLASS CLASSIFICATION
Tensor Spectral Clustering
Novel Logic Circuits Dynamic Parameters Analysis
Clique
Dynamics
2005 pinto santiago_marchi_cobem_2005
2D CFD Code Based on MATLAB- As Good As FLUENT!
Ad

Viewers also liked (13)

PPTX
Kleding bedrukken Arnhem
PDF
Hipotecas a tipo mixto: ¿buena opción?
PDF
Post. Dip Business Management
PPT
Zoe's Giant pandas-Power Point
PPTX
Colaborativo tres catedra
PDF
Proposed hb 622 cd1
DOCX
Email from Pamela Taylor
PDF
Resolucion 48 2016 dian
PDF
Pbmsg311 tcl 120 vac rev a
PDF
NIST Server Hardening Guide SP 800-123
PPTX
IBM Modern Analytics Journey
PDF
Cloud Computing Security Seminar
PDF
NIST Cloud Computing Security Reference Architecture 800-299
Kleding bedrukken Arnhem
Hipotecas a tipo mixto: ¿buena opción?
Post. Dip Business Management
Zoe's Giant pandas-Power Point
Colaborativo tres catedra
Proposed hb 622 cd1
Email from Pamela Taylor
Resolucion 48 2016 dian
Pbmsg311 tcl 120 vac rev a
NIST Server Hardening Guide SP 800-123
IBM Modern Analytics Journey
Cloud Computing Security Seminar
NIST Cloud Computing Security Reference Architecture 800-299
Ad

Similar to Gdistance vignette (20)

PDF
Working with spatial trajectories in Boost Geometry
PDF
Words in Space - Rebecca Bilbro
PDF
A Visual Exploration of Distance, Documents, and Distributions
ODP
Geospatial Data in R
PPT
PoGIS 6_2_Spatial Dataanalysis.ppt
PDF
Words in space
PPT
Gis Concepts 5/5
PDF
ClusterAnalysis
PDF
ML_Unit_IV_Clustering in Machine Learning.pdf
PPT
Applications of graphs
PDF
Spatial Analysis with R - the Good, the Bad, and the Pretty
PDF
spatial analysis introduction and vector type of data.pdf
PDF
Shortest route finding using an object oriented database approach
PDF
Scattered gis handbook
PPTX
Using R to Visualize Spatial Data: R as GIS - Guy Lansley
PDF
Chapter2 NEAREST NEIGHBOURHOOD ALGORITHMS.pdf
PPT
Digital Distance Geometry
PPT
Cs345 cl
PDF
Informs 2019 - Flexible Network Design Utilizing Non Strict Modeling Approaches
PPT
Where Next
Working with spatial trajectories in Boost Geometry
Words in Space - Rebecca Bilbro
A Visual Exploration of Distance, Documents, and Distributions
Geospatial Data in R
PoGIS 6_2_Spatial Dataanalysis.ppt
Words in space
Gis Concepts 5/5
ClusterAnalysis
ML_Unit_IV_Clustering in Machine Learning.pdf
Applications of graphs
Spatial Analysis with R - the Good, the Bad, and the Pretty
spatial analysis introduction and vector type of data.pdf
Shortest route finding using an object oriented database approach
Scattered gis handbook
Using R to Visualize Spatial Data: R as GIS - Guy Lansley
Chapter2 NEAREST NEIGHBOURHOOD ALGORITHMS.pdf
Digital Distance Geometry
Cs345 cl
Informs 2019 - Flexible Network Design Utilizing Non Strict Modeling Approaches
Where Next

Recently uploaded (20)

PPT
Total quality management ppt for engineering students
PDF
August 2025 - Top 10 Read Articles in Network Security & Its Applications
PPTX
CURRICULAM DESIGN engineering FOR CSE 2025.pptx
PPTX
Module 8- Technological and Communication Skills.pptx
PPTX
Fundamentals of safety and accident prevention -final (1).pptx
PPTX
AUTOMOTIVE ENGINE MANAGEMENT (MECHATRONICS).pptx
PDF
Abrasive, erosive and cavitation wear.pdf
PDF
Design Guidelines and solutions for Plastics parts
PPTX
Current and future trends in Computer Vision.pptx
PDF
III.4.1.2_The_Space_Environment.p pdffdf
PPTX
Chemical Technological Processes, Feasibility Study and Chemical Process Indu...
PDF
Influence of Green Infrastructure on Residents’ Endorsement of the New Ecolog...
PDF
ChapteR012372321DFGDSFGDFGDFSGDFGDFGDFGSDFGDFGFD
PDF
BIO-INSPIRED ARCHITECTURE FOR PARSIMONIOUS CONVERSATIONAL INTELLIGENCE : THE ...
PDF
Human-AI Collaboration: Balancing Agentic AI and Autonomy in Hybrid Systems
PPTX
Information Storage and Retrieval Techniques Unit III
PDF
SMART SIGNAL TIMING FOR URBAN INTERSECTIONS USING REAL-TIME VEHICLE DETECTI...
PDF
Visual Aids for Exploratory Data Analysis.pdf
PDF
BIO-INSPIRED HORMONAL MODULATION AND ADAPTIVE ORCHESTRATION IN S-AI-GPT
PDF
null (2) bgfbg bfgb bfgb fbfg bfbgf b.pdf
Total quality management ppt for engineering students
August 2025 - Top 10 Read Articles in Network Security & Its Applications
CURRICULAM DESIGN engineering FOR CSE 2025.pptx
Module 8- Technological and Communication Skills.pptx
Fundamentals of safety and accident prevention -final (1).pptx
AUTOMOTIVE ENGINE MANAGEMENT (MECHATRONICS).pptx
Abrasive, erosive and cavitation wear.pdf
Design Guidelines and solutions for Plastics parts
Current and future trends in Computer Vision.pptx
III.4.1.2_The_Space_Environment.p pdffdf
Chemical Technological Processes, Feasibility Study and Chemical Process Indu...
Influence of Green Infrastructure on Residents’ Endorsement of the New Ecolog...
ChapteR012372321DFGDSFGDFGDFSGDFGDFGDFGSDFGDFGFD
BIO-INSPIRED ARCHITECTURE FOR PARSIMONIOUS CONVERSATIONAL INTELLIGENCE : THE ...
Human-AI Collaboration: Balancing Agentic AI and Autonomy in Hybrid Systems
Information Storage and Retrieval Techniques Unit III
SMART SIGNAL TIMING FOR URBAN INTERSECTIONS USING REAL-TIME VEHICLE DETECTI...
Visual Aids for Exploratory Data Analysis.pdf
BIO-INSPIRED HORMONAL MODULATION AND ADAPTIVE ORCHESTRATION IN S-AI-GPT
null (2) bgfbg bfgb bfgb fbfg bfbgf b.pdf

Gdistance vignette

  • 1. R package gdistance: distances and routes on geographical grids (version 1.1-2) Jacob van Etten September 28, 2011 1 Introduction This vignette describes gdistance, an R package which provides func- tionality to calculate various distance measures and routes in heterogeneous geographic spaces represented as grids. Distances are fundamental to geospa- tial analysis (Tobler 1970). The most commonly used geographic distance measure is the great-circle distance, which represents the shortest line be- tween two points, taking into account the curvature of the earth. However, the great-circle distance does not correspond very well to expected travel time/effort between two points. Travel time and the real distance travelled depend on the means of transport, the mode of route-finding, and the char- acteristics of landscapes and infrastructure. The great-circle distance could be considered as referring to a special case: goal-directed movement with no obstacles, ‘as the crow flies’. Other distance measures are needed when travel is not (or less) goal-directed and landscape characteristics affect move- ment in a spatially heterogeneous way. Package gdistance was created to calculate distances and determine routes using geographical grids (rasters) to represent landscape heterogeneity. It provides the following distance and route calculations. • The least-cost distance mimics route finding ‘as the fox runs’, taking into account obstacles and the local ‘friction’ of the landscape. • A second type of route-finding is the random walk, which has no prede- termined destination (‘drunkard’s walk’). Resistance distance reflects the travel time from origin to goal of the average (Brownian) random walk (McRae 2006). 1
  • 2. 2 gdistance: geographic distances and routes • ‘Randomised shortest paths’ are an intermediate form between shortest paths and Brownian random walks, recently introduced by Saerens et al. (2009). The functionality of gdistance corresponds to other software like ArcGIS Spatial Analyst, GRASS GIS (r.cost, r.walk functions), and CircuitScape (random walk / resistance distance). The gdistance package also contains specific functionality for geographical genetic analyses. The package imple- ments measures to model dispersal histories first presented by Van Etten and Hijmans (2010). Section 9 below introduces with an example how gdistance can be used in geographical genetics. Package gdistance uses functionality from a number of other R packages. The most important among these packages is raster. To use gdistance and to understand the details of this vignette, the reader has to be familiar with the basic functionality of raster. 2 Transition* classes To make distance calculations as flexible as possible, distances and other measures are calculated in various steps. The central classes in gdistance are the S4 classes TransitionLayer and TransitionStack. Most operations have an object of one of these classes either as input or output. Transition* objects can be constructed from an object of class RasterLayer, RasterStack or RasterBrick. These classes are from raster, a memory- efficient and user-friendly R package which contains complete geographical grid functionality. The class Transition* takes the necessary geographic ref- erences (projection, resolution, extent) from the original Raster* object. It also contains a matrix which specifies the probability of movements between cells or, in more general terms, the ‘conductance’ of inter-cell connections. Each row and column in the matrix represents a cell in the original Raster* object. Row/column 1 in the transition matrix corresponds to cell 1 in the original raster, and so on. Cell numbers in rasters go from left to right and from top to bottom. For instance, a 3 x 3 raster would have the following cell numbers: 1 2 3 4 5 6 7 8 9 This raster would produce a 9 x 9 transition matrix with rows/columns num- bered from 1 to 9. Using conductance values may be a bit confusing at first. Other soft- ware generally uses friction surfaces to calculate distances. Also, distances
  • 3. Jacob van Etten 3 are calculated in terms of accumulated friction or cost. 1 However, the re- lation between conductance and friction is straightforward: conductance is the reciprocal of friction (1/friction). It is not strange to use the word ‘con- ductance’ in this context (or to use resistance as a synonym for friction). There is an analogy between random walks on geographical grids and elec- trical current in a mesh of resistors (McRae et al. 2008). Calculations of ‘resistance distance’ (see below) take advantage of this analogy. Another ad- vantage of using conductance is that it makes it possible to store the values very efficiently as a so-called sparse matrix. Sparse matrices only record the non-zero values and information about their location in the matrix. In most cases, cells are connected only with adjacent cells. Consequently, a conduc- tance matrix contains only a small fraction of non-zero values, which occupy little memory in a sparse matrix format. The package gdistance makes use of sparse matrix classes and methods from the package Matrix, which gives access to fast procedures implemented in the C language. A first step in any analysis with gdistance is the construction of an object of the class Transition*. The construction of a Transition* object from a Raster* object is straightforward. We can define an arbitrary function to calculate the conductance values from the values of each pair of cells to be connected. Here, we create a raster with 10 degree cells and set it cells to random values between 0.4 and 0.6. We then create a TransitionLayer object. The transition value between each pair of cells is the mean of the two cell values. > library(gdistance) raster version 1.9-19 (22-September-2011) > r <- raster(nrows = 18, ncols = 36) > r <- setValues(r, runif(ncell(r), min = 0.4, max = 0.6)) > r class : RasterLayer dimensions : 18, 36, 648 (nrow, ncol, ncell) resolution : 10, 10 (x, y) extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax) coord. ref. : +proj=longlat +datum=WGS84 values : in memory min value : 0.4000163 max value : 0.5995257 1 ‘Permeability’ is a synonym of conductance. Impedance, resistance, cost and friction are used interchangeably to denote the contrary.
  • 4. 4 gdistance: geographic distances and routes > tr1 <- transition(r, transitionFunction = mean, + directions = 8) We set the directions argument to value 8. This connects all adjacent cells in 8 directions. Cells can also be connected in 4 or 16 connections. In chess terms, setting directions to 4 connects all cells with all possible one-cell rook movements (producing ‘Manhattan’ distances), while setting directions to 8 connects with one-cell queen movements. With 16 directions, all cells are connected with both one-cell queen movements and one-turn knight movements. This can make distance calculations more accurate.2 If we inspect the object we created, we see that the resulting TransitionLayer object keeps much information from the original RasterLayer object. > tr1 class : TransitionLayer dimensions : 18, 36, 648 (nrow, ncol, ncell) resolution : 10, 10 (x, y) extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax) coord. ref. : +proj=longlat +datum=WGS84 values : conductance matrix class: dsCMatrix It is also possible to create asymmetric matrices, in which the conductance from i to j is not always the same as the conductance from j back to i. This is relevant, among other things, for modelling travel in hilly terrain, as shown in Example 1 below. On the same slope, a downslope traveler experiences less resistance than an upslope traveler. In this case, the function to calculate conductance values is non-commutative: f(i, j) = f(j, i). The symm argument in transition needs to be set to FALSE. > ncf <- function(x) max(x) - x[1] + x[2] > tr2 <- transition(r, ncf, 4, symm = FALSE) > class(transitionMatrix(tr1)) [1] "dsCMatrix" attr(,"package") [1] "Matrix" 2 Connecting in 16 directions was inspired by the function r.cost in GRASS 6, and the documentation of this function illustrates nicely why connecting in 16 directions can in- crease the accuracy of the calculations http://grass.itc.it/grass64/manuals/html64_ user/r.cost.html. Also, see the section on distance transforms in de Smith et al. (2009).
  • 5. Jacob van Etten 5 > class(transitionMatrix(tr2)) [1] "dgCMatrix" attr(,"package") [1] "Matrix" The sparse matrix class dsCMatrix is symmetric and contains only half of the matrix. The class dgCMatrix can hold an asymmetric matrix. Different mathematical operations can be done with Transition* objects. This makes it possible to flexibly model different components of landscape friction. > tr3 <- tr1 * tr2 > tr3 <- tr1 + tr2 > tr3 <- tr1 * 3 > tr3 <- sqrt(tr1) Operations with more than one object require that the different objects have the same resolution and extent. Also, it is possible to extract and replace values in the matrix using in- dices. > tr3[cbind(1:9, 1:9)] <- tr2[cbind(1:9, 1:9)] > tr3[1:9, 1:9] <- tr2[1:9, 1:9] > tr3[1:5, 1:5] 5 x 5 sparse Matrix of class "dgCMatrix" [1,] . 0.5833925 . . . [2,] 0.478976 . 0.6168308 . . [3,] . 0.5311842 . 0.4703616 . [4,] . . 0.6776534 . 0.5776266 [5,] . . . 0.4703616 . Some functions require that Transition* objects do not contain any isolated ‘clumps’. This can be avoided when creating Transition* objects, for instance by giving conductance values between all adjacent cells some minimum value. Also, it can be checked visually. Here are a few ways to visualize a Transition* object. You can extract the transition matrix with function transitionMatrix. This gives a sparse matrix which can be vizualized with function image. This shows the rows and columns of the transition matrix and indicates which has a non-zero value (“connection”) as a black dot (Figure 1).
  • 6. 6 gdistance: geographic distances and routes > image(transitionMatrix(tr1)) Dimensions: 648 x 648 Column Row 100 200 300 400 500 600 100 200 300 400 500 600 Figure 1: Visualizing a TransitionLayer with function image() In Figure 1, the central diagonal line represents cell connections between raster cells in the same row. The two adjacent lines represent cells in the same column in the raster, separated by the length of the row in the raster. The curious dots around these lines have to do with the special character of the original raster. Since this raster covers the whole world, the outer meridians touch each other. The software takes this into account and as a result the cells in the extreme left column are connected to the extreme right column! The diagonal connections between the extreme columns explain the isolated dots in Figure 1. Figure 1 shows which cells contain non-zero values, but gives no further information about levels of conductance. However, we can transform the transition matrix back in to a raster to visualize this. To summarize the information in transition matrix, we can take means or sums across rows or columns, for instance. You can do this with function raster. Applied to a TransitionLayer, this function converts it to a RasterLayer. For the different options see method?raster("TransitionLayer"). The default, shown in Figure 2, takes the column-wise means of the non-zero values. All these forms of transformation unavoidably cause information loss, of course.
  • 7. Jacob van Etten 7 > plot(raster(tr1)) −150 −100 −50 0 50 100 150 −50050 0.44 0.46 0.48 0.50 0.52 0.54 0.56 Figure 2: Visualizing a TransitionLayer using the function raster() 3 Correcting inter-cell conductance values The function transition calculates transition values based on the val- ues of adjacent cells in the input raster. However, the centres of diagonally connected cells are more remote from each other than in the case of orthogo- nally connected cells. Secondly, on equirectangular (lonlat) projection grids, W-E connections are longer at the equator and become shorter towards the poles. Therefore, the values in the matrix need to be corrected for these two types of distortion. Both types of distortion can be corrected by dividing each conductance matrix value between the inter-cell distance. This is what function geoCorrection does for us. > tr1CorrC <- geoCorrection(tr1, type = "c", multpl = FALSE) > tr2CorrC <- geoCorrection(tr2, type = "c", multpl = FALSE) For random walks on longlat grids, there is an additional consideration to be made. The number of connections in N-S direction remains equal when moving from the equator to the poles. This is problematic, because random walks can be seen as analogous to electrical current through a networks of resistors. The inter-cell connections should be thought of as parallel resistors. Moving away from the equator, the inter-meridian space each individual re- sistor bridges becomes narrower, tending to zero at the poles. Therefore, the
  • 8. 8 gdistance: geographic distances and routes N-S resistance between parallels should decrease when moving away from the equator. The function geoCorrection corrects this distortion by multiplying the N-S transition values with the cosine of the average latitude of the cell centres. This is done with function geoCorrection, by setting the argument type to “r”, > tr1CorrR <- geoCorrection(tr1, type = "r", multpl = FALSE) When similar Transition* objects with equal resolution and extent need to be corrected repetitively, computational effort may be reduced by prepar- ing an object that only needs to be multiplied with the Transition* object to produce a corrected version of it. The following is equivalent to the previous procedure. > tr1CorrMatrix <- geoCorrection(tr1, type = "r", + multpl = TRUE) > tr1CorrR <- tr1 * tr1CorrMatrix Object trCorr1Matrix is only calculated once. It can be multiplied with Transition* objects, as long as they have the same extent, resolution, and directions of cell connections. We need to take special care that the geo- correction multiplication matrix (tr1CorrMatrix) contains all non-zero val- ues that are present in the Transition* object with which it will be multi- plied (tr1).3 4 Calculating distances With the corrected Transition* object we can calculate distances be- tween points. It is important to note that all distance functions require a Transition* object with conductance values, even though distances will be expressed in 1/conductance (friction or resistance) units. To calculate distances, we need to have the coordinates of point locations. This is done by creating a two-row matrix of coordinates. Functions will also accept a SpatialPoints object or, if there is only one point, a vector of length two. > sP <- cbind(c(65, 5, -65), c(55, 35, -35)) Calculating a distance matrix is straightforward now. 3 A good alternative is to use geoCorrection(mulpl=FALSE) with a Transition* object with cells connected with value 1.
  • 9. Jacob van Etten 9 > costDistance(tr1CorrC, sP) 1 2 2 9805972 3 31745174 22335047 > costDistance(tr2CorrC, sP) [,1] [,2] [,3] [1,] 0 11415775 34951321 [2,] 10816497 0 26753977 [3,] 33813271 26327784 0 > resistanceDistance(tr1CorrR, sP) 1 2 2 3012.742 3 3599.843 3317.169 5 Dispersal paths To determine dispersal paths with a random element, we use the function passage. This function can be used for both random walks and randomised shortest paths. The function calculates the number of passages through cells before arriving in the destination cell. Either the total or net number of passages can be calculated. The net number of passages is the number of passages that are not reciprocated by a passage in the opposite direction. Figure 3 shows the probability of passage through each cell, assuming randomised shortest paths with the parameter theta set to 2. Here we see again how the package takes into account that we are dealing with a raster that covers the whole world. The route crosses the 180 degrees meridian without any problem.
  • 10. 10 gdistance: geographic distances and routes > origin <- SpatialPoints(cbind(105, -55)) > goal <- SpatialPoints(cbind(-105, 55)) > rSPraster <- passage(tr1, origin, goal, theta = 2) > plot(rSPraster) −150 −100 −50 0 50 100 150 −50050 0.0 0.2 0.4 0.6 0.8 1.0 Figure 3: Probability of passage 6 Path overlap and non-overlap One of the specific uses, for which package gdistance was created, is to look at trajectories coming from the same source (van Etten and Hijmans 2010). The degree of coincidence of two trajectories can be visualized by multi- plying the probabilities of passage. > r1 <- passage(tr1, origin, sP[1, ], theta = 2) > r2 <- passage(tr1, origin, sP[2, ], theta = 2) > rJoint <- r1 * r2
  • 11. Jacob van Etten 11 > plot(rJoint) −150 −100 −50 0 50 100 150 −50050 0.0 0.2 0.4 0.6 0.8 1.0 Figure 4: Overlapping part of the two routes > rDiv <- max(max(r1, r2) * (1 - min(r1, r2)) - + min(r1, r2), 0) With the function pathInc() we can calculate measures of path overlap and non-overlap for a large number of points. These measures can be used to predict patterns of diversity if these are due to dispersal from a single common source (van Etten and Hijmans 2010). If the argument type contains two elements (divergent and joint), the result is a list of distances matrices. > pathInc(tr1CorrC, sP[1, ], sP[2:3, ], type = c("divergent", + "joint")) $divergent 1 2 36010415 $joint 1 2 583857.4
  • 12. 12 gdistance: geographic distances and routes > plot(rDiv) −150 −100 −50 0 50 100 150 −50050 0.0 0.2 0.4 0.6 0.8 1.0 Figure 5: Non-overlapping part of the two routes 7 Example 1: Hiking around Maunga Whau The previous examples were somewhat theoretical, based on randomly generated values. More realistic examples serve to illustrate the various uses that can be given to this package. Determining the fastest route between two points in complex terrain is useful for hikers. Tobler’s Hiking Function provides a rough estimate for the the maximum hiking speed given the slope of the terrain (Tobler 1993). The maximum speed of off-path hiking (in m/s) is: speed = exp(-3.5 * abs(slope + 0.05)) Note that the function is not symmetric around 0 (see Figure 6).
  • 13. Jacob van Etten 13 −1.0 −0.5 0.0 0.5 1.0 0.00.20.40.60.81.0 slope speed(m/s) Figure 6: Tobler’s Hiking Function We use the Hiking Function to determine the shortest path to hike around the volcano Maunga Whau (Auckland, New Zealand). First, we read in the altitude data for the volcano. This is a geo-referenced version of the“volcano” data available in Base R datasets (see ?volcano for more information). > r <- raster(system.file("external/maungawhau.grd", + package = "gdistance")) The Hiking Function requires the slope as input. slope = difference in height / distance travelled The units of height and distance should be identical. Here, we use meters for both. We calculate the height differences between cells first. Then we use the function geoCorrection() to divide by the distance between cells. > heightDiff <- function(x){x[2] - x[1]} > hd <- transition(r,heightDiff,8,symm=FALSE) > slope <- geoCorrection(hd, scl=FALSE) Subsequently, we calculate the speed. We need to exercise special care, be- cause the matrix values between non-adjacent cells is 0, but the slope between
  • 14. 14 gdistance: geographic distances and routes these cells is not 0! Therefore, we need to restrict the calculation to adjacent cells. We do this by creating an index for adjacent cells (adj) with the func- tion adjacency(). Using this index, we extract and replace adjacent cells, without touching the other values. > adj <- adjacency(x = r, fromCells = 1:ncell(r), + toCells = 1:ncell(r), directions = 8) > speed <- slope > speed[adj] <- exp(-3.5 * abs(slope[adj] + 0.05)) Now we have calculated the speed of movement between adjacent cells. We are close to having the final conductance values. Attainable speed is a mea- sure of the ease of crossing from one cell to another on the grid. However, we also need to take into account the distance between cell centres. Travelling with the same speed, a diagonal connection between cells takes longer to cross than a straight connection. Therefore, we use the function geoCorrection() again! > x <- geoCorrection(speed, scl = FALSE) This gives our final ”conductance” values. What do these ”conductance”values mean? The function geoCorrection() divides the values in the matrix with the distance between cell centres. So, with our last command we calculated this: conductance = speed / distance This looks a lot like a measure that we are more familiar with: travel time = distance / speed In fact, the conductance values we have calculated are the reciprocal of travel time. 1 / travel time = speed / distance = conductance Maximizing the reciprocal of travel time is exactly equivalent to minimizing travel time! Now we define two coordinates, A and B, and determine the paths be- tween them. We test if the quickest path from A to B is the same as the quickest path from B back to A.
  • 15. Jacob van Etten 15 2667400 2667600 2667800 2668000 6478800647900064792006479400 100 120 140 160 180 A B Figure 7: Quickest hiking routes around Maunga Whau > A <- c(2667670, 6479000) > B <- c(2667800, 6479400) > AtoB <- shortestPath(x, A, B, output = "SpatialLines") > BtoA <- shortestPath(x, B, A, output = "SpatialLines") > plot(r) > lines(AtoB, col = "red", lwd = 2) > lines(BtoA, col = "blue") > text(A[1] - 10, A[2] - 10, "A") > text(B[1] + 10, B[2] + 10, "B") A small part of the A-B (red) and B-A (blue) lines in the figure do not overlap. This is a consequence of the asymmetry of the Hiking Function.
  • 16. 16 gdistance: geographic distances and routes 8 Example 2: Geographical genetics The direct relation between genetic and geographic distances is known as isolation by distance (Wright 1943). Recent work has expanded this relation- ship to random movement in heterogeneous landscapes (McRae 2006). Also, the geography of dispersal routes can explain observed geospatial patterns of genetic diversity. For instance, diffusion from a single origin (Africa) ex- plains much of the current geographical patterns of human genetic diversity (Ramachandran 2005). As a result, the mutual genetic distance between a pair of humans from different parts from the globe depends on the extent they share their prehistoric migration history. Within a single continent, however, human genetic diversity may have to do with more recent events. Let’s look at diversity in Europe, using the data presented by Balaresque et al. (2010). Within Europe, genetic diversity is often thought to be a result of the migration of early Neolithic farmers from Anatolia (Turkey) to the west. First we read in the data, including the coordinates of the populations (Figure 8) and mutual genetic distances. > Europe <- raster(system.file("external/Europe.grd", + package = "gdistance")) > Europe[is.na(Europe)] <- 0 > data(genDist) > data(popCoord) > pC <- as.matrix(popCoord[c("x", "y")]) Then we create three geographical distance matrices. The first corre- sponds to the great-circle distance between populations. The second is the least-cost distance between locations. Travel is restricted to the land mass. The third is the resistance distance (using the same conductance matrix), which is related to the random-walk travel time between points (McRae 2006). > geoDist <- pointDistance(pC, longlat = TRUE) > geoDist <- as.dist(geoDist) > Europe <- aggregate(Europe, 3) > tr <- transition(Europe, mean, directions = 8) > tr <- geoCorrection(tr, "c") > tR <- geoCorrection(tr, "r") > cosDist <- costDistance(tr, pC) > resDist <- resistanceDistance(tR, pC) > cor(genDist, geoDist)
  • 17. Jacob van Etten 17 −20 −10 0 10 20 30 40 3040506070 0.0 0.2 0.4 0.6 0.8 1.0 DK EN1 EN2 FR1 FR2 FR3 FR4 FR7 GE GE1 GR IR IT1 IT2 NL SB SP2 SP4 SP5 SP6 TK1 TK2TK3 Figure 8: Map of genotyped populations
  • 18. 18 gdistance: geographic distances and routes [1] 0.5962655 > cor(genDist, cosDist) [1] 0.5889319 > cor(genDist, resDist) [1] -0.05532471 Interestingly, the great-circle distance between points turns out to be the best predictor of genetic distance. The other distance measures incorporate more information about the geographic space in which geneflow takes place, but do not improve the prediction. But how well does a wave of expansion from Anatolia explain the spatial pattern? > origin <- unlist(popCoord[22, c("x", "y")]) > pI <- pathInc(tr, origin = origin, fromCoords = pC, + type = "joint", theta = 2) > cor(genDist, pI) [1] NA At least at first sight, the overlap of dispersal routes explain the spatial pat- tern better than any of the previous measures. The negative sign of the last correlation coefficient was expected, as more overlap in routes is associ- ated with lower genetic distance. While additional work would be needed to improve predictions and compare the different models more rigorously, the promise of dispersal modelling with gdistance is clear. 9 Final remarks Questions about the use of gdistance can be posted on the r-sig-geo email list. Bug reports and requests for additional functionality can be mailed to jacobvanetten@yahoo.com. 10 References Balaresque P., et al. 2010. A predominantly Neolithic origin for European paternal lineages. PLoS Biology 8(1): e1000285.
  • 19. Jacob van Etten 19 de Smith, M.J., M.F. Goodchild, and P.A. Longley. 2009. Geospatial Anal- ysis. Matador. 3rd edition. McRae B.H. 2006. Isolation by resistance. Evolution 60: 1551–1561. McRae B.H., B.G. Dickson, and T. Keitt. 2008. Using circuit theory to model connectivity in ecology, evolution, and conservation. Ecology 89:2712-2724. Ramachandran S., et al. 2005. Support from the relationship of genetic and geographic distance in human populations for a serial founder effect originating in Africa. PNAS 102: 15942–15947. Saerens M., L. Yen, F. Fouss, and Y. Achbany. 2009. Randomized shortest- path problems: two related models. Neural Computation, 21(8):2363- 2404. Tobler W. 1970. A computer movie simulating urban growth in the Detroit region. Economic Geography, 46(2): 234-240. Tobler W. 1993. Three Presentations on Geographical Analysis and Mod- eling. http://www.ncgia.ucsb.edu/Publications/Tech_Reports/ 93/93-1.PDF van Etten, J., and R.J. Hijmans. 2010. A geospatial modelling approach in- tegrating archaeobotany and genetics to trace the origin and dispersal of domesticated plants. PLoS ONE 5(8): e12060. Wright, S. 1943. Isolation by distance. Genetics 28: 114–138.