Segmentation of a trajectory based on Markov models

Description

These functions partition a trajectory into several segments corresponding to different behaviours of the animal.
modpartltraj is used to generate the models to which the trajectory is compared.
bestpartmod is used to compute the optimal number of segments of the partition.
partmod.ltraj is used to partition the trajectory into npart segments. plot.partltraj can be used to plot the results.

Usage

modpartltraj(tr, limod)
## S3 method for class 'modpartltraj'
print(x, ...)

bestpartmod(mods, Km = 30, plotit = TRUE,
            correction = TRUE, nrep = 100)

partmod.ltraj(tr, npart, mods, na.manage = c("prop.move","locf"))
## S3 method for class 'partltraj'
print(x, ...)
## S3 method for class 'partltraj'
plot(x, col, addpoints = TRUE, lwd = 2, ...)

Arguments

tr

an object of class ltraj containing only one trajectory (one burst of relocations)

limod

a list of syntactically correct R expressions giving the models for the trajectory, each involving one or several of the variables stored in tr (see details and examples)

x, mods

an object of class modpartltraj (for print.modpartltraj) or partltraj (for print.partltraj and plot.partltraj), returned respectively by the functions modpartltraj and partmod.ltraj

na.manage

a character string indicating what should be done with the missing values located between two segments. With "locf", the missing values are added at the end of the first segment. With "prop.move", the missing values are distributed between the end of the first segment and the beginning of the second. The proportion of missing values added at the end of the first segment corresponds to the relative proportion of "internal" missing values found within the segments predicted by the model used to predict the first segment.

npart

the number of segments of the partition

Km

the maximum number of segments to be considered for the partition

plotit

logical. Whether the results should be plotted.

correction

logical. Whether the log-likelihood should be corrected (see details).

nrep

integer. The number of Monte Carlo simulations used to correct the log-likelihood for each number of segments.

col

the colors to be used for the models

addpoints

logical. Whether the relocations should be added to the graph

lwd

the line width

...

additional arguments to be passed to other functions

Details

A trajectory is made of successive steps traveled by an organism in the geographical space. These steps (the line connecting two successive relocations) can be described by a certain number of descriptive parameters (relative angles between successive steps, length of the step, etc.). One aim of the trajectory analysis is to identify the structure of the trajectory, i.e. the parts of the trajectory where the steps have homogeneous properties. Indeed, an animal may have a wide variety of behaviours (feeding, traveling, escape from a predator, etc.). As a result, partitioning a trajectory occupies a central place in trajectory analysis.

These functions are to be used to partition a trajectory based on Markov models of animal movements. For example, one may suppose that a normal distribution generated the step lengths, with a different mean for each type of behaviour. These models and the values of their parameters are supposed a priori by the analyst. Based on these a priori models, the functions find both the number and the limits of the segments making up the trajectory (see examples). Any model can be supposed for any parameter of the steps (the distance, relative angles, etc.), provided that the model is Markovian.

The rationale behind this algorithm is the following. First, the user should propose a set of models describing the movements of the animal in the different segments of the trajectory. For example, the user may define two normal distributions for the step length, with means equal to 10 meters (i.e. a trajectory with relatively small steps) and 100 meters (i.e. a trajectory with longer steps). For a given step of the trajectory, it is possible to compute the probability density that the step has been generated by each model of the set. The function modpartltraj computes the matrix containing the probability densities associated with each step (rows), under each model of the set (columns). This matrix is of class modpartltraj.
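The construction of this matrix can be sketched in base R. The snippet below is only an illustration under stated assumptions, not the code of modpartltraj: the data frame steps, the two models and the column names "short" and "long" are hypothetical, and each model string is simply evaluated with the step variables in scope.

```r
# Hypothetical step lengths (in meters) standing in for the "dist"
# variable of a trajectory.
steps <- data.frame(dist = c(8, 12, 95, 110, 9))

# Two a priori models written as R expressions, as in 'limod'.
limod <- list("dnorm(dist, mean = 10, sd = 5)",
              "dnorm(dist, mean = 100, sd = 5)")

# Evaluate every model on every step: one row per step,
# one column per model.
dens <- sapply(limod, function(m) eval(parse(text = m), envir = steps))
colnames(dens) <- c("short", "long")  # hypothetical labels

# Model with the highest density for each step taken separately.
best <- apply(dens, 1, which.max)
print(dens)
print(best)
```

Taking the best model step by step, as in the last lines, ignores the segment structure; the partition described in the following paragraphs constrains the number of segments instead.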

Then, the user can estimate the optimal number of segments in the trajectory, given the set of a priori models, using the function bestpartmod, which takes the matrix of class modpartltraj as argument. If correction = FALSE, this function returns, for each number of segments K, the log of the probability (log-likelihood) that the trajectory is actually made of K segments, each one described by one model. The resulting graph can be used to choose an optimal number of segments for the partition. However, Gueguen (2007) noted that this algorithm tends to overestimate the number of segments in a trajectory. He proposed to correct this estimation using Monte Carlo simulations of the independence of the steps within the trajectory. At each step of the randomization process, the order of the rows of the matrix is randomized, and the curve of log-likelihood is computed for each number of segments, for the randomized trajectory. Then, the observed log-likelihood is corrected by these simulations: for a given number of segments, the corrected log-likelihood is equal to the observed log-likelihood minus the simulated log-likelihood. Because a large number of simulations is carried out, a distribution of corrected log-likelihoods is available for each number of segments. The "best" number of segments is the one for which the median of the distribution of corrected log-likelihoods is maximum.
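The correction can be sketched in base R as follows. This is a simplified illustration, not the implementation of bestpartmod: the simulated data and the helper loglik_curve are hypothetical, and the helper assumes that neighbouring segments are not forced to use different models.

```r
set.seed(1)
# Hypothetical trajectory: 20 short steps followed by 20 long steps.
dist <- c(rnorm(20, 10, 5), rnorm(20, 100, 5))
# Log-densities of the steps (rows) under two a priori models (columns).
logp <- cbind(dnorm(dist, 10, 5, log = TRUE),
              dnorm(dist, 100, 5, log = TRUE))

# Hypothetical helper: best log-likelihood of a K-partition, K = 1..Km.
loglik_curve <- function(logp, Km) {
  n <- nrow(logp)
  stopifnot(Km >= 1, Km <= n)
  M <- apply(logp, 2, cumsum)   # one segment: cumulated log-densities
  out <- numeric(Km)
  out[1] <- max(M[n, ])
  for (k in seq_len(Km)[-1]) {
    prev <- M
    M <- matrix(-Inf, n, ncol(logp))
    for (i in k:n) {
      # Step i either extends the current segment or opens a new one
      # after the best (k-1)-partition of steps 1..(i-1).
      M[i, ] <- logp[i, ] + pmax(M[i - 1, ], max(prev[i - 1, ]))
    }
    out[k] <- max(M[n, ])
  }
  out
}

Km <- 6
nrep <- 50
observed <- loglik_curve(logp, Km)
# One column per randomization of the order of the rows.
simulated <- replicate(nrep, loglik_curve(logp[sample(nrow(logp)), ], Km))
# Observed minus simulated log-likelihood, for each K (rows)
# and each simulation (columns).
corrected <- observed - simulated
best_K <- which.max(apply(corrected, 1, median))
print(best_K)
```

With these simulated data the median corrected log-likelihood peaks at two segments, which matches the way the steps were generated.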

Finally, once the optimal number of segments npart has been chosen, the function partmod.ltraj can be used to compute the partition.

The mathematical rationale underlying these two functions is the following: given an optimal k-partition of the trajectory, if the ith step of the trajectory belongs to the segment k predicted by the model d, then either the relocation (i-1) belongs to the same segment, in which case the segment containing (i-1) is predicted by d, or the relocation (i-1) belongs to another segment, and the other (k-1) segments together constitute an optimal (k-1)-partition of the trajectory from 1 to (i-1). These two probabilities are computed recursively by the functions from the matrix of class modpartltraj, observing that the probability of a 1-partition of the trajectory from 1 to i described by the model m (i.e. only one segment describing the trajectory) is simply the product of the probability densities of the steps from 1 to i under the model m. Further details can be found in Calenge et al. (in prep), and in Gueguen (2001, 2007).
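The recursion above can be written as a small dynamic program that also recovers the limits of the segments. The following base R code is a sketch under assumptions, not the code of partmod.ltraj: the function partition_steps and its output names starts and mod are hypothetical, the step lengths are made up, and neighbouring segments are not forced to use different models.

```r
# Hypothetical step lengths: short, then long, then short again.
dist <- c(10, 12, 9, 100, 95, 105, 11, 8)
logp <- cbind(dnorm(dist, 10, 5, log = TRUE),
              dnorm(dist, 100, 5, log = TRUE))

partition_steps <- function(logp, K) {
  n <- nrow(logp)
  D <- ncol(logp)
  stopifnot(K >= 1, K <= n)
  # M[k, i, d]: best log-probability of a k-partition of steps 1..i
  # whose last segment is described by model d.
  M <- array(-Inf, c(K, n, D))
  opened <- array(FALSE, c(K, n, D))     # does step i start a segment?
  from <- array(NA_integer_, c(K, n, D)) # model of the previous segment
  M[1, , ] <- apply(logp, 2, cumsum)     # 1-partition: sum of log-densities
  for (k in seq_len(K)[-1]) {
    for (i in k:n) {
      prev_best <- which.max(M[k - 1, i - 1, ])
      open <- M[k - 1, i - 1, prev_best] # new segment starts at step i
      stay <- M[k, i - 1, ]              # step i extends the segment
      opened[k, i, ] <- open > stay
      from[k, i, ] <- prev_best
      M[k, i, ] <- logp[i, ] + pmax(stay, open)
    }
  }
  # Backtrack from the last step to recover the segment limits.
  d <- which.max(M[K, n, ])
  k <- K
  model <- integer(n)
  starts <- integer(0)
  for (i in n:1) {
    model[i] <- d
    if (i > 1 && opened[k, i, d]) {
      starts <- c(i, starts)
      d <- from[k, i, d]
      k <- k - 1
    }
  }
  starts <- c(1L, starts)
  list(starts = starts, mod = model[starts], loglik = max(M[K, n, ]))
}

res <- partition_steps(logp, 3)
print(res)
```

For this toy input and K = 3, the segments start at steps 1, 4 and 7 and are described by models 1, 2 and 1 respectively.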

Value

modpartltraj returns a matrix of class modpartltraj containing the probability densities of the steps of the trajectory (rows) for each model (columns).

bestpartmod returns a list with two elements: (i) the element mk is a vector containing the values of the log-probabilities for each number of segments (varying from 1 to Km), and (ii) the element correction contains either "none" or a matrix containing the corrected log-likelihood for each number of segments (rows) and each simulation of the independence (columns).

partmod.ltraj returns a list of class partltraj with the following components: ltraj is an object of class ltraj containing the segmented trajectory (one burst of relocations per segment of the partition); stats is a list containing the following elements:

locs

The number ID of the relocations starting the segments (except the last one which ends the last segment)

Mk

The value of the cumulative log-probability for the partition (i.e. the log-probability associated with a K-partition is equal to the log-probability associated with the (K-1)-partition plus the log-probability associated with the Kth segment)

mod

The number ID of the model chosen for each segment

which.mod

the name of the model chosen for each segment

Author(s)

Clement Calenge clement.calenge@oncfs.gouv.fr

References

Calenge, C., Gueguen, L., Royer, M. and Dray, S. (in prep.) Partitioning the trajectory of an animal with Markov models.

Gueguen, L. (2001) Segmentation by maximal predictive partitioning according to composition biases. Pp 32–44 in: Gascuel, O. and Sagot, M.F. (Eds.), Computational Biology, LNCS, 2066.

Gueguen, L. (in prep.) Computing the probability of sequence segmentation under Markov models.

See Also

ltraj

Examples

## Not run: 
## Example on the porpoise
## (the functions and the dataset come from the package adehabitatLT)
library(adehabitatLT)
data(porpoise)

## Keep the first porpoise
gus <- porpoise[1]
plot(gus)

## First test the independence of the step length
indmove(gus)
## There is a lack of independence between successive distances

## plots the distance according to the date
plotltr(gus, "dist")

## One supposes that the distance has been generated
## by normal distribution, with different means for the
## different behaviours
## The means of the normal distribution range from 0 to
## 130000. We suppose a standard deviation equal to 5000:

tested.means <- round(seq(0, 130000, length = 10), 0)
(limod <- as.list(paste("dnorm(dist, mean =",
                  tested.means,
                  ", sd = 5000)")))

## Build the probability matrix
mod <- modpartltraj(gus, limod)

## computes the corrected log-likelihood for each
## number of segments
bestpartmod(mod)

## The best number of segments is 4. Compute the partition:
(pm <- partmod.ltraj(gus, 4, mod))
plot(pm)


## Shows the partition on the distances:
plotltr(gus, "dist")

lapply(1:length(pm$ltraj), function(i) {
   lines(pm$ltraj[[i]]$date, rep(tested.means[pm$stats$mod[i]],
         nrow(pm$ltraj[[i]])),
         col=c("red","green","blue")[as.numeric(factor(pm$stats$mod))[i]],
         lwd=2)
})


## Computes the residuals of the partition
res <- unlist(lapply(1:length(pm$ltraj), function(i) {
   pm$ltraj[[i]]$dist - rep(tested.means[pm$stats$mod[i]],
         nrow(pm$ltraj[[i]]))
}))

plot(res, type = "l")

## Test of independence of the residuals of the partition:
wawotest(res)


## End(Not run)