knitr::opts_chunk$set(echo = TRUE)
Ecological Trajectory Analysis (ETA) is a framework to analyze ecosystem dynamics described as trajectories in a chosen space of ecosystem resemblance. ETA takes trajectories as objects to be analyzed and compared geometrically.
The framework was presented for community ecology in:
The framework was then extended with new metrics and visualisation modes in:
Procedures of trajectory analysis can be applied to data beyond community data tables. For example, the same framework was applied to stable isotope data in:
Since it can be applied to multiple spaces, we now call the framework Ecological Trajectory Analysis and provide a package ecotraj that offers a set of functions to calculate metrics and produce plots.
In this vignette you will learn how to conduct ETA using different package functions. In most of the vignette we describe how to study the trajectories of three sites that have been surveyed four times each. We use a small dataset where trajectories occur in a space of two dimensions, so that geometric calculations can be followed more easily. In the last section a real example is presented.
First of all, we load ecotraj
:
library(ecotraj)
To specify ecosystem dynamics, the following data items need to be distinguished:
a. A set of ecosystem states (i.e. coordinates in a space $\Omega$) implicitly described using a distance matrix $d$. b. A character vector specifying the site (i.e. sampling unit) corresponding to each ecosystem state. Trajectory names are identified from unique values of sites. c. An integer vector specifying the survey (i.e. census) corresponding to the sampling of each ecosystem state. This vector is important for survey order. If not provided, the order will be assumed to be incremental for each repetition of site value. d. A numeric vector specifying the survey time corresponding to each ecosystem state. This is needed to evaluate speeds.
In ETA, sampling units do not need to be surveyed synchronously nor the same number of times.
Let us first define the vectors that describe the site and the survey of each ecosystem state:
#Description of sites and surveys sites <- c("1","1","1","1","2","2","2","2","3","3","3","3") surveys <- c(1,2,3,4,1,2,3,4,1,2,3,4)
We then define a matrix whose coordinates correspond to the set of ecosystem states observed. The number of rows in this matrix has to be equal to the length of vectors sites
and surveys
. We assume that the ecosystem space $\Omega$ has two dimensions:
#Raw data table xy<-matrix(0, nrow=12, ncol=2) xy[2,2]<-1 xy[3,2]<-2 xy[4,2]<-3 xy[5:6,2] <- xy[1:2,2] xy[7,2]<-1.5 xy[8,2]<-2.0 xy[5:6,1] <- 0.25 xy[7,1]<-0.5 xy[8,1]<-1.0 xy[9:10,1] <- xy[5:6,1]+0.25 xy[11,1] <- 1.0 xy[12,1] <-1.5 xy[9:10,2] <- xy[5:6,2] xy[11:12,2]<-c(1.25,1.0) xy
The matrix of Euclidean distances $d$ between ecosystem states in $\Omega$ is then:
#Distance matrix d <- dist(xy) d
ETA is based on the analysis of information in the distance matrix $\Delta = [d]$. Therefore, it does not require explicit coordinates. This is an advantage because it allows the analysis to be conducted on arbitrary metric (or semi-metric) spaces. The choice of $d$ is left to the user and will depend on the problem at hand.
To perform ETA, we need to combine the distance matrix and the site/survey information in a single object using function defineTrajectories()
:
x <- defineTrajectories(d, sites, surveys)
Note that surveys
may be omitted, and in this case site surveys are assumed to be in order. The function returns an object (a list) of class trajectories
that contains all the information for analysis:
class(x)
This object contains two elements:
names(x)
Element d
contains the input distance matrix, whereas metadata
is a data frame including information of observations:
x$metadata
Note that columns surveys
and times
have exactly the same values. This happens because we did not supplied a vector for times
so that surveys are assumed to happen every time step (of whatever units). Moreover, the surveys
vector itself can be omitted in calls to defineTrajectories()
. If so, the function will (correctly, in this case) interpret that every repetition of a given site corresponds to a new survey:
x <- defineTrajectories(d, sites) x$metadata
Let us assume the following sampling times, in units of years:
times <- c(1.0,2.2,3.1,4.2,1.0,1.5,2.8,3.9,1.6,2.8,3.9,4.3)
The call to defineTrajectories()
using all the information would be:
x <- defineTrajectories(d, sites, surveys, times) x$metadata
At some point in the ETA, one may desire to focus on particular trajectories or surveys. Function subsetTrajectory()
allows subsetting objects of class trajectories
, For example, we can decide to work with the second and third trajectories:
x23 <- subsetTrajectories(x, site_selection = c("2", "3")) x23
We can decide to focus on the last three surveys:
x23s <- subsetTrajectories(x, site_selection = c("2", "3"), survey_selection = c(2,3,4)) x23s
You will notice that surveys
have been renumbered (but times
are not modified). This illustrates that the vector surveys
is only used to indicate the survey order within each trajectory.
To begin our analysis of the three trajectories, we display them in an ordination space, using function trajectoryPCoA()
. Since $\Omega$ has only two dimensions in this example, the Principal Coordinates Analysis (PCoA) on $d$ displays the complete space:
trajectoryPCoA(x, traj.colors = c("black","red", "blue"), lwd = 2, survey.labels = T) legend("topright", col=c("black","red", "blue"), legend=c("Trajectory 1", "Trajectory 2", "Trajectory 3"), bty="n", lty=1, lwd = 2)
While trajectory of site '1' (black arrows) is made of three segments of the same length and direction, trajectory of site '2' (red arrows) has a second and third segments that bend and are shorter than that of the second segment of site '1'. Trajectory of site '3' includes a stronger change in direction and shorter segments.
As this example has two dimensions and we used Euclidean distance, the same plot (albeit rotated) can be straightforwardly obtained using matrix xy
and function trajectoryPlot()
:
trajectoryPlot(xy, sites, surveys, traj.colors = c("black","red", "blue"), lwd = 2, survey.labels = T) legend("topright", col=c("black","red", "blue"), legend=c("Trajectory 1", "Trajectory 2", "Trajectory 3"), bty="n", lty=1, lwd = 2)
While trajectoryPCoA()
uses PCoA (also known as classical Multidimensional Scaling) to display trajectories, users can display ecosystem trajectories using other ordination techniques such as metric Multidimensional Scaling (mMDS; see function mds of package smacof) or non-metric MDS (nMDS; see function metaMDS in package vegan or function isoMDS in package MASS). Function trajectoryPlot()
will help drawing arrows between segments to represent trajectories on the ordination space given by any of these methods.
Functions trajectoryPCoA()
and trajectoryPlot()
can be used to display a subset of trajectories if we combine them with function subsetTrajectories()
:
trajectoryPCoA(subsetTrajectories(x, site_selection = c("2", "3")), traj.colors = c("red", "blue"), lwd = 2, survey.labels = T) legend("topright", col=c("red", "blue"), legend=c("Trajectory 2", "Trajectory 3"), bty="n", lty=1, lwd = 2)
One may be interested in studying the geometric properties of particular trajectories. This is illustrated in this section
Several metrics are related to distances between states. For example, one can obtain the length of trajectory segments and the total path length using function trajectoryLengths()
:
trajectoryLengths(x)
When survey times are available, it may be of interest to calculate segment or trajectory speeds. One can obtain the speed of trajectory segments and the total path speed using function trajectorySpeeds()
:
trajectorySpeeds(x)
Note that the units of lengths and speeds will depend on the definition of the $\Omega$ space and, in the latter case, on the units of times
.
Finally, one may calculate the overall (geometric) variability of states within each trajectory:
trajectoryVariability(x)
The geometric variability can be interpreted as the average squared distance to the centroid of each trajectory.
In CTA, angles are measured using triplets of time-ordered ecosystem states (a pair of consecutive segments is an example of such triplets). As matrix $\Delta$ may represent a space of multiple dimensions, angles cannot be calculated with respect to a single plane. Instead, each angle is measured on the plane defined by each triplet. Zero angles indicate that the three points (e.g. the two consecutive segments) are in a straight line. The larger the angle value, the more is trajectory changing in direction. Mean and standard deviation statistics of angles are calculated according to circular statistics. Function trajectoryAngles()
allows calculating the angles between consecutive segments:
trajectoryAngles(x)
While site '1' follows a straight path, angles are > 0 for trajectories of site '2' and '3', denoting the change in direction. In this case, the same information could be obtained by inspecting the previous plots, but in a case where $\Omega$ has many dimensions, the representation will correspond to a reduced (ordination) space and hence, angles and lengths in the plot will not correspond exactly to those of functions trajectoryLengths()
and trajectoryAngles()
, which take into account the complete $\Omega$ space.
Angles can be calculated not only for all consecutive segments but for all four triplets of ordered ecosystem states, whether of consecutive segments or not (i.e., between points 1-2-3, 1-2-4, 1-3-4 and 2-3-4). This is done by specifying all=TRUE
in trajectoryAngles()
:
trajectoryAngles(x, all=TRUE)
The mean resultant length of circular statistics (column rho
of the previous result), which takes values between 0 and 1, can be used to assess the degree of homogeneity of angle values and it will take a value of 1 if all angles are the same. This approach, however, uses only angular information and does not take into account the length of segments.
To measure the overall directionality of a ecosystem trajectory (i.e. if the path consistently follows the same direction in $\Omega$ ), we recommend using another statistic that is sensitive to both angles and segment lengths and is implemented in function trajectoryDirectionality()
:
trajectoryDirectionality(x)
As known from previous plots, trajectory of site '2' is less straight than trajectory of site '1' and that of site '3' has the lowest directionality value. By default the function only computes a descriptive statistic, i.e. it does not perform any statistical test on directionality. A permutational test can be performed, but this feature is experimental needs to be tested before recommendation.
It is possible to assess multiple trajectory metrics in one function call to trajectoryMetrics()
. This will only provide metrics that apply to the whole trajectory path:
trajectoryMetrics(x)
Another function, called trajectoryWindowMetrics()
calculates trajectory metrics on moving windows over trajectories, but will not be illustrated here.
Ecosystem states occupy a position within their trajectory that depends on the total path length of the trajectory (see Fig. 2 of De Cáceres et al. 2019). By adding the length of segments prior to a given state and dividing the sum by the total length of the trajectory we obtain the relative position of the ecosystem state. Function trajectoryProjection()
allows obtaining the relative position of each point of a trajectory. To use it for this purpose one should use as parameters the distance matrix between states and the indices that conform the trajectory, which have to be entered twice. For example for the two example trajectories we would have:
trajectoryProjection(d, 1:4, 1:4)
If we inspect the relative positions of the points in the trajectory of site '2', we find than the second and third segments have relative positions larger than 1/3 and 2/3, respectively, because the second and third segments are shorter:
trajectoryProjection(d, 5:8, 5:8)
Function trajectoryProjection()
can also be used to project arbitrary ecosystem states onto a given reference trajectory. For example we can study the projection of third state of the trajectory of site '2' (i.e. state 7) onto the trajectory of site '1' (i.e. states 1 to 4), which happens to be in the half of the trajectory:
trajectoryProjection(d, 7, 1:4)
If we project the points of the trajectory of site '3' onto the trajectory of site '1' we see how the curved path of site '3' projects its fourth point to the same relative position as its second point.
trajectoryProjection(d, 9:12, 1:4)
When trajectories have been sampled the same number of times, one can study their symmetric convergence or divergence (see Fig. 3a of De Cáceres et al. 2019). Function trajectoryConvergence()
allows performing tests of convergence based on the trend analysis of the sequences of distances between points of the two trajectories (i.e. first-first, second-second, ...):
trajectoryConvergence(x, symmetric = TRUE)
The function performs the Mann-Whitney trend test. Values of the statistic ('tau') larger than 0 correspond to trajectories that are diverging, whereas values lower than 0 correspond to trajectories that are converging. By setting symmetric = FALSE
the convergence test becomes asymmetric (see Figs. 3b and 3c of De Cáceres et al. 2019). In this case the sequence of distances between every point of one trajectory and the other:
trajectoryConvergence(x, symmetric = FALSE)
The asymmetric test is useful to determine if one trajectory is becoming closer to the other or if it is departing from the other.
To start comparing trajectories between sites (i.e. between sampling units), one important step is the calculation of distances between directed segments (see Fig. 4 of De Cáceres et al. 2019), which can be obtained by calling function segmentDistances
:
Ds <- segmentDistances(x)$Dseg Ds
Distances between segments are affected by differences in both position, size and direction. Hence, among the six segments of this example, the distance is maximum between the last segment of trajectory '1' (named 1[3-4]
) and the first segment of trajectory '3' (named 3[1-2]
).
One can display distances between segments in two dimensions using mMDS.
mMDS <- smacof::mds(Ds) mMDS xret <- mMDS$conf plot(xret, xlab="axis 1", ylab = "axis 2", asp=1, pch=21, bg=c(rep("black",3), rep("red",3), rep("blue",3)), xlim=c(-1.5,1), ylim=c(-1,1.5)) text(xret, labels=rep(paste0("s",1:3),3), pos=1) legend("topleft", pt.bg=c("black","red","blue"), pch=21, bty="n", legend=c("Trajectory 1", "Trajectory 2", "Trajectory 3"))
Distances between segments are internally calculated when comparing whole trajectories using function trajectoryDistances()
. Here we show the dissimilarity between the two trajectories as assessed using either the Hausdorff distance (equal to the maximum distance between directed segments; see eq. 8 in De Cáceres et al. 2019) or the directed segment path distance (an average of distances between segments; see eq. 13 in De Cáceres et al. 2019):
trajectoryDistances(x, distance.type = "Hausdorff") trajectoryDistances(x, distance.type = "DSPD")
DSPD is a symmetrized distance. To calculate non-symmetric distances one uses (see eq. 11 in De Cáceres et al. 2019):
trajectoryDistances(x, distance.type = "DSPD", symmetrization = NULL)
Sometimes different ecosystems follow the same or similar trajectory but with different speeds, or where sampling starts at a different point in the dynamic sequence. We can quantify those differences using function trajectoryShifts()
. To illustrate this function, we will first build a small dataset of three linear trajectories, but where the second and the third are modified:
#Description of sites and times sites2 <- c("1","1","1","1","2","2","2","2","3","3","3","3") times2 <- c(1,2,3,4,1,2,3,4,1,2,3,4) #Raw data table xy2<-matrix(0, nrow=12, ncol=2) xy2[2,2]<-1 xy2[3,2]<-2 xy2[4,2]<-3 xy2[5:8,1] <- 0.25 xy2[5:8,2] <- xy2[1:4,2] + 0.5 # States are all shifted with respect to site "1" xy2[9:12,1] <- 0.5 xy2[9:12,2] <- xy2[1:4,2]*1.25 # 1.25 times faster than site "1"
We can see the differences graphically:
trajectoryPlot(xy2, sites2, traj.colors = c("black","red", "blue"), lwd = 2) legend("topright", col=c("black","red", "blue"), legend=c("Trajectory 1", "Trajectory 2", "Trajectory 3"), bty="n", lty=1, lwd = 2)
We now build the usual trajectories
object:
x2 <- defineTrajectories(dist(xy2), sites = sites2, times = times2)
We can check that indeed the third trajectory is faster using:
trajectorySpeeds(x2)
Function trajectoryShifts()
allows comparing different observations to a reference trajectory. For example we can compare trajectory for sites "1" and "2":
trajectoryShifts(subsetTrajectories(x2, c("1","2")))
Where we see that the observations of trajectory "2" correspond to states of trajectory "1" at 0.5 time units later in time. Surveys with missing values indicate that the projection of the target state cannot be determined (because the reference trajectory is too short).
We can also compare trajectories "1" and "3":
trajectoryShifts(subsetTrajectories(x2, c("1","3")))
Here we see that shifts increase progressively, indicating the faster speed of trajectory "3".
In this example we analyze the dynamics of 8 permanent forest plots located on slopes of a valley in the New Zealand Alps. The study area is mountainous and centered on the Craigieburn Range (Southern Alps), South Island, New Zealand (see map in Fig. 8 of De Cáceres et al. 2019). Forests plots are almost monospecific, being the mountain beech (Fuscospora cliffortioides) the main dominant tree species. Previously forests consisted of largely mature stands, but some of them were affected by different disturbances during the sampling period (1972-2009) which includes 9 surveys. We begin our example by loading the data set, which includes 72 plot observations:
data("avoca")
Community data is in form of an object stratifiedvegdata
. To account for differences in tree diameter, while emphasizing regeneration, the data contains individual counts to represent tree abundance and trees are classified in 19 quadratic diameter bins (in cm): {(2.25, 4], (4, 6.25], (6.25, 9], ... (110.25, 121]}. The data set also includes vectors avoca_surveys
and avoca_sites
that indicate the survey and forest plot corresponding to each forest state.
Before starting ETA, we have to use function vegdiststruct
from package vegclust to calculate distances between forest plot states in terms of structure and composition (see De Cáceres M, Legendre P, He F (2013) Dissimilarity measurements and the size structure of ecological communities. Methods Ecol Evol 4:1167–1177. https://doi.org/10.1111/2041-210X.12116):
avoca_D_man <- vegclust::vegdiststruct(avoca_strat, method="manhattan", transform = function(x){log(x+1)})
Distances in avoca_D_man
are calculated using the Manhattan metric, after applying a logarithm transformation to abundance data.
We start by defining our trajectories, which implies combining the information about distances, sites and surveys (survey times are omitted):
avoca_x <- defineTrajectories(d = avoca_D_man, sites = avoca_sites, surveys = avoca_surveys)
The distance matrix avoca_D_man
conforms our definition of $\Omega$. We use trajectoryPCoA()
to display the relations between forest plot states in this space and to draw the trajectory of each plot:
trajectoryPCoA(avoca_x, traj.colors = RColorBrewer::brewer.pal(8,"Accent"), axes=c(1,2), length=0.1, lwd=2) legend("topright", bty="n", legend = 1:8, col = RColorBrewer::brewer.pal(8,"Accent"), lwd=2)
Note that in this case, the full $\Omega$ includes more than two dimensions, and PCoA is representing 43% of total variance (correction for negative eigenvalues is included in the call to cmdscale
from trajectoryPCoA()
), so one has to be careful when interpreting trajectories visually.
Another option is to use mMDS to represent trajectories, which in this case produces a similar result:
mMDS <- smacof::mds(avoca_D_man) mMDS trajectoryPlot(mMDS$conf, avoca_sites, avoca_surveys, traj.colors = RColorBrewer::brewer.pal(8,"Accent"), axes=c(1,2), length=0.1, lwd=2) legend("topright", bty="n", legend = 1:8, col = RColorBrewer::brewer.pal(8,"Accent"), lwd=2)
plotTrajDiamDist<-function(cli = 7) { l = colnames(avoca_strat[[1]]) ncl = 14 m197072= avoca_strat[avoca_surveys==1][[cli]]["NOTCLI",2:ncl] m197072[m197072<1] = NA m1974 = avoca_strat[avoca_surveys==2][[cli]]["NOTCLI",2:ncl] m1974[m1974<1] = NA m1978 = avoca_strat[avoca_surveys==3][[cli]]["NOTCLI",2:ncl] m1978[m1978<1] = NA m1983 = avoca_strat[avoca_surveys==4][[cli]]["NOTCLI",2:ncl] m1983[m1983<1] = NA m1987 = avoca_strat[avoca_surveys==5][[cli]]["NOTCLI",2:ncl] m1987[m1987<1] = NA m1993 = avoca_strat[avoca_surveys==6][[cli]]["NOTCLI",2:ncl] m1993[m1993<1] = NA m1999 = avoca_strat[avoca_surveys==7][[cli]]["NOTCLI",2:ncl] m1999[m1999<1] = NA m2004 = avoca_strat[avoca_surveys==8][[cli]]["NOTCLI",2:ncl] m2004[m2004<1] = NA m2009 = avoca_strat[avoca_surveys==9][[cli]]["NOTCLI",2:ncl] m2009[m2009<1] = NA plot(m197072, type="l", ylim=c(1,200), log="y", xlab="", ylab="Number of individuals (log)", main=paste0("Trajectory ",cli), axes=FALSE, col=gray(0.8), lwd=2) axis(2, las=2) axis(1, at=1:(ncl-1), labels=l[2:ncl], las=2) lines(m1974, col=gray(0.7), lwd=2) lines(m1978, col=gray(0.6), lwd=2) lines(m1983, col=gray(0.5), lwd=2) lines(m1987, col=gray(0.4), lwd=2) lines(m1993, col=gray(0.3), lwd=2) lines(m1999, col=gray(0.2), lwd=2) lines(m2004, col=gray(0.1), lwd=2) lines(m2009, col=gray(0), lwd=2) legend("topright", bty="n", lwd=2,col=gray(seq(0.8,0, by=-0.1)), legend=c("1970/72","1974","1978","1983", "1987", "1993","1999","2004","2009")) }
One can inspect specific trajectories using subsetTrajectories()
. This allows getting a better view of particular trajectories, here that of forest plot '3':
oldpar <- par(mfrow=c(1,2)) trajectoryPCoA(subsetTrajectories(avoca_x, "3"), length=0.1, lwd=2, survey.labels = T) plotTrajDiamDist(3) par(oldpar)
In the right hand, we added a representation of the change in the mountain beech tree diameter distribution through time for trajectory of forest plot '3'. The dynamics of this plot include mostly growth, which results in individuals moving from one diameter class to the other. The whole trajectory looks mostly directional. Let's now inspect the trajectory of forest plot '4':
oldpar <- par(mfrow=c(1,2)) trajectoryPCoA(subsetTrajectories(avoca_x, "4"), length=0.1, lwd=2, survey.labels = T) plotTrajDiamDist(4) par(oldpar)
This second trajectory is less straight and seems to include a turn by the end of the sampling period, corresponding to the recruitment of new saplings.
While trajectory lengths and angles can be inspected visually in ordination diagrams, it is better to calculate them using the full $\Omega$ space (i.e., from matrix avoca_D_man
). Using function trajectoryLengths()
we can see that the trajectory of forest plot '4' is lengthier than that of plot '3', mostly because includes a lengthier last segment (i.e. the recruitment of new individuals):
trajectoryLengths(avoca_x)
If we calculate the angles between consecutive segments (using function trajectoryLengths
) we see that indeed the trajectory of '3' is rather directional, but the angles of trajectory of '4' are larger, on aveerage:
avoca_ang <- trajectoryAngles(avoca_x) avoca_ang
By calling function trajectoryDirectionality()
we can confirm that the trajectory for site '4' is less straight than that of site '3':
avoca_dir <- trajectoryDirectionality(avoca_x) avoca_dir
The following code displays the relationship between the statistic in trajectoryDirectionality()
and the mean resultant vector length that uses angular information only and assesses the constancy of angle values:
avoca_rho <- trajectoryAngles(avoca_x, all=TRUE)$rho plot(avoca_rho, avoca_dir, xlab = "rho(T)", ylab = "dir(T)", type="n") text(avoca_rho, avoca_dir, as.character(1:8))
We can calculate the resemblance between forest plot trajectories using trajectoryDistances()
:
avoca_D_traj_man <- trajectoryDistances(avoca_x, distance.type="DSPD") print(round(avoca_D_traj_man,3))
The closest trajectories are those of plots '1' and '2'. They looked rather close in position in the PCoA ordination of $\Omega$ with all trajectories, so probably it is position, rather than shape which has influenced this low value. The next pair of similar trajectories are those of the '3'-'5' pair. We can again use mMDS to produce an ordination of resemblances between trajectories:
mMDS<-smacof::mds(avoca_D_traj_man) mMDS x<-mMDS$conf[,1] y<-mMDS$conf[,2] plot(x,y, type="p", asp=1, xlab=paste0("Axis 1"), ylab=paste0("Axis 2"), col="black", bg= RColorBrewer::brewer.pal(8,"Accent"), pch=21) text(x,y, labels=1:8, pos=1)
Distances between trajectories can be calculated after centering them (i.e. after bringing all trajectories to the center of the $\Omega$ space). This is done using function centerTrajectories()
, which returns a new dissimilarity matrix and is illustrated in a different vignette.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.