Segmentation of a trajectory based on Markov models
Description
These functions partition a trajectory into several segments
corresponding to different behaviours of the animal.
modpartltraj is used to generate the models to which the trajectory
is compared. bestpartmod is used to compute the optimal number of
segments of the partition. partmod.ltraj is used to partition the
trajectory into npart segments. plot.partltraj can be used to plot
the results.
Usage
modpartltraj(tr, limod)
## S3 method for class 'modpartltraj'
print(x, ...)
bestpartmod(mods, Km = 30, plotit = TRUE,
correction = TRUE, nrep = 100)
partmod.ltraj(tr, npart, mods, na.manage = c("prop.move","locf"))
## S3 method for class 'partltraj'
print(x, ...)
## S3 method for class 'partltraj'
plot(x, col, addpoints = TRUE, lwd = 2, ...)

Arguments
tr 
an object of class 
limod 
a list of syntactically correct R expressions giving the
models for the trajectory, implying one or several elements in

x, mods 
an object of class 
na.manage 
a character string indicating what should be done
with the missing values located between two segments. With

npart 
the number of segments of the partition 
Km 
the maximum number of segments of the partition 
plotit 
logical. Whether the results should be plotted. 
correction 
logical. Whether the loglikelihood should be corrected (see details). 
nrep 
integer. The number of Monte Carlo simulations used to correct the loglikelihood for each number of segments. 
col 
the colors to be used for the models 
addpoints 
logical. Whether the relocations should be added to the graph 
lwd 
the line width 
... 
additional arguments to be passed to other functions 
Details
A trajectory is made of successive steps traveled by an organism in the
geographical space. These steps (the line connecting two successive
relocations) can be described by a certain number of descriptive
parameters (relative angles between successive steps, length of the
step, etc.). One aim of the trajectory analysis is to identify
the structure of the trajectory, i.e. the parts of the trajectory where the
steps have homogeneous properties. Indeed, an animal may have a wide
variety of behaviours (feeding, traveling, escape from a predator,
etc.). As a result, partitioning a trajectory occupies a central place in
trajectory analysis.
These functions are to be used to partition a trajectory based on Markov
models of animal movements. For example, one may suppose that a
normal distribution generated the step lengths, with a different mean
for each type of behaviour. The analyst specifies these models and the
values of their parameters a priori. Based on these a priori models, the
functions find both the number and the limits of the segments making up
the trajectory (see examples). A model can be specified for any
parameter of the steps (the distance, relative angles, etc.), provided
that the model is Markovian.
The rationale behind this algorithm is the following. First, the user
should propose a set of models describing the movements of the animal
in the different segments of the trajectory. For example, the user may
define two models of normal distribution for the step length, with
means equal to 10 meters (i.e. a trajectory with relatively small steps)
and 100 meters (i.e. a trajectory with longer step lengths). For a
given step of the trajectory, it is possible to compute the probability
density that the step has been generated by each model of the set.
The function modpartltraj computes the matrix containing the
probability densities associated with each step (rows), under each model
of the set (columns). This matrix is of class modpartltraj.
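As a rough illustration of what such a matrix contains, the following base R sketch evaluates two model expressions on simulated step lengths. It does not call the package: the data frame steps, the two models and the column names are assumptions made for the example, and evaluating the expressions inside a data frame is only a guess at how the models are applied to the trajectory.

```r
## Simulated step lengths (illustrative data, not a real trajectory)
set.seed(1)
steps <- data.frame(dist = c(rnorm(20, mean = 10, sd = 3),
                             rnorm(20, mean = 100, sd = 3)))

## Two a priori models, written as R expressions in the style of limod
limod <- list("dnorm(dist, mean = 10, sd = 3)",
              "dnorm(dist, mean = 100, sd = 3)")

## Evaluate each expression on the steps: one row per step,
## one column per model
dens <- sapply(limod, function(m) eval(parse(text = m), envir = steps))
colnames(dens) <- c("short", "long")

dim(dens)
head(dens)
```

Each row of dens shows how plausible a given step is under each candidate model; the segmentation then works on this matrix only.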
Then, the user can estimate the optimal number of segments in the
trajectory, given the set of a priori models, using the function
bestpartmod, which takes the matrix of class modpartltraj as its
argument. If correction = FALSE, this function returns the log of the
probability (loglikelihood) that the trajectory is actually made of K
segments, with each one described by one model. The resulting graph can
be used to choose an optimal number of segments for the partition.
However, Gueguen (2007) noted that this
algorithm tends to overestimate the number of segments in a
trajectory. He proposed to correct this estimation using Monte Carlo
simulations of the independence of the steps within the trajectory. At
each step of the randomization process, the order of the rows of the
matrix is randomized, and the curve of loglikelihood is computed for
each number of segments, for the randomized trajectory. Then, the
observed loglikelihood is corrected by these simulations: for a given
number of segments, the corrected loglikelihood is equal to the
observed loglikelihood minus the simulated loglikelihood. Because
there is a large number of simulations of the independence, a
distribution of corrected loglikelihoods is available for each number
of segments. The "best" number of segments is the one for which the
median of the distribution of corrected loglikelihood is maximum.
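The selection rule can be sketched in base R with toy numbers. The values below are invented for illustration (they are not output of bestpartmod), and the object names obs, sim and corrected are hypothetical. Note that for one segment the simulated loglikelihood equals the observed one, since reordering the steps does not change the sum over the whole trajectory.

```r
## Observed loglikelihood for K = 1..5 segments (toy values)
obs <- c(-120, -60, -40, -35, -33)

## Simulated loglikelihoods: one row per number of segments,
## one column per randomization of the step order (toy values)
sim <- rbind(c(-120, -120, -120),
             c(-100, -102,  -98),
             c( -85,  -88,  -83),
             c( -70,  -72,  -69),
             c( -60,  -58,  -62))

## Corrected loglikelihood: observed minus simulated, row by row
corrected <- obs - sim

## Median of the corrected values for each number of segments
med <- apply(corrected, 1, median)

## The "best" number of segments maximizes this median
best <- which.max(med)
best
```

With these toy values the uncorrected curve keeps increasing with K, whereas the median of the corrected values peaks at an intermediate number of segments.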
Finally, once the optimal number of segments npart has been
chosen, the function partmod.ltraj can be used to compute the
partition.
The mathematical rationale underlying these two functions is the
following: given an optimal k-partition of the trajectory, if the i-th
step of the trajectory belongs to the segment k predicted by the model d,
then either the relocation (i-1) belongs to the same segment, in which
case the segment containing (i-1) is predicted by d, or the relocation
(i-1) belongs to another segment, and the other (k-1) segments
together constitute an optimal (k-1)-partition of the trajectory
from 1 to (i-1). These two probabilities are computed recursively by the
functions from the matrix of class modpartltraj, observing that
the probability of a 1-partition of the trajectory from 1 to i described
by the model m (i.e. only one segment describing the trajectory) is
simply the product of the probability densities of the steps from 1 to
i under the model m. Further details can be found in Calenge et
al. (in prep), and in Gueguen (2001, 2007).
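The recursion described above can be sketched in base R as a small dynamic program. This is an illustration only, not the package's implementation: the helper name best_partition_loglik and the toy data are hypothetical, log-densities are summed instead of multiplying densities (to avoid numerical underflow), and the sketch assumes that two consecutive segments are described by different models and that at least two models are supplied.

```r
## Hypothetical helper: best log-probability of partitioning steps 1..n
## into K = 1..Kmax segments, given a matrix ld of log-densities
## (steps in rows, models in columns; at least two models assumed).
best_partition_loglik <- function(ld, Kmax) {
  n <- nrow(ld)
  D <- ncol(ld)
  ## M[i, d]: best log-probability of a k-partition of steps 1..i
  ## whose last segment is described by model d.
  ## For k = 1 this is the cumulative sum of the log-densities.
  M <- apply(ld, 2, cumsum)
  out <- numeric(Kmax)
  out[1] <- max(M[n, ])
  for (k in seq_len(Kmax)[-1]) {
    Mprev <- M
    M <- matrix(-Inf, n, D)
    for (i in k:n) {
      for (d in seq_len(D)) {
        ## either step i-1 lies in the same segment (same model d) ...
        stay <- M[i - 1, d]
        ## ... or step i-1 closes an optimal (k-1)-partition
        ## ending with another model (assumption of this sketch)
        change <- max(Mprev[i - 1, -d])
        M[i, d] <- ld[i, d] + max(stay, change)
      }
    }
    out[k] <- max(M[n, ])
  }
  out
}

## Toy step lengths: three short steps followed by three long ones
step_len <- c(1, 1, 1, 10, 10, 10)
ld <- cbind(dnorm(step_len, mean = 1,  sd = 1, log = TRUE),
            dnorm(step_len, mean = 10, sd = 1, log = TRUE))

res <- best_partition_loglik(ld, Kmax = 3)
which.max(res)   # 2: two segments describe these toy data best
```

Keeping only the previous and the current table (Mprev and M) is enough to obtain the log-probabilities; recovering the segment limits would additionally require storing, for each cell, which of the two cases was chosen.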
Value
modpartltraj returns a matrix of class modpartltraj
containing the probability densities of the steps of the trajectory
(rows) for each model (columns).
bestpartmod returns a list with two elements: (i) the element mk
is a vector containing the values of the log-probabilities for each
number of segments (varying from 1 to Km), and (ii) the element
correction contains either "none" or a matrix containing the corrected
loglikelihood for each number of segments (rows) and each simulation of
the independence (columns).
partmod.ltraj returns a list of class partltraj with the following
components: ltraj is an object of class ltraj containing the segmented
trajectory (one burst of relocations per segment of the partition);
stats is a list containing the following elements:
locs 
The number ID of the relocations starting the segments (except the last one which ends the last segment) 
Mk 
The value of the cumulative log-probability for the partition (i.e. the log-probability associated with a K-partition is equal to the log-probability associated with the (K-1)-partition plus the log-probability associated with the K-th segment) 
mod 
The number ID of the model chosen for each segment 
which.mod 
the name of the model chosen for each segment 
Author(s)
Clement Calenge clement.calenge@oncfs.gouv.fr
References
Calenge, C., Gueguen, L., Royer, M. and Dray, S. (in prep.) Partitioning the trajectory of an animal with Markov models.
Gueguen, L. (2001) Segmentation by maximal predictive partitioning according to composition biases. Pp 32–44 in: Gascuel, O. and Sagot, M.F. (Eds.), Computational Biology, LNCS, 2066.
Gueguen, L. (in prep.) Computing the probability of sequence segmentation under Markov models.
See Also
ltraj
Examples
## Not run:
## Example on the porpoise
data(porpoise)
## Keep the first porpoise
gus <- porpoise[1]
plot(gus)
## First test the independence of the step length
indmove(gus)
## There is a lack of independence between successive distances
## plots the distance according to the date
plotltr(gus, "dist")
## One supposes that the distance has been generated
## by normal distribution, with different means for the
## different behaviours
## The means of the normal distribution range from 0 to
## 130000. We suppose a standard deviation equal to 5000:
tested.means <- round(seq(0, 130000, length = 10), 0)
(limod <- as.list(paste("dnorm(dist, mean =",
                        tested.means,
                        ", sd = 5000)")))
## Build the probability matrix
mod <- modpartltraj(gus, limod)
## computes the corrected loglikelihood for each
## number of segments
bestpartmod(mod)
## The best number of segments is 4. Compute the partition:
(pm <- partmod.ltraj(gus, 4, mod))
plot(pm)
## Shows the partition on the distances:
plotltr(gus, "dist")
lapply(1:length(pm$ltraj), function(i) {
lines(pm$ltraj[[i]]$date, rep(tested.means[pm$stats$mod[i]],
nrow(pm$ltraj[[i]])),
col=c("red","green","blue")[as.numeric(factor(pm$stats$mod))[i]],
lwd=2)
})
## Computes the residuals of the partition
res <- unlist(lapply(1:length(pm$ltraj), function(i) {
    pm$ltraj[[i]]$dist - rep(tested.means[pm$stats$mod[i]],
                             nrow(pm$ltraj[[i]]))
}))
plot(res, ty = "l")
## Test of independence of the residuals of the partition:
wawotest(res)
## End(Not run)
