lavielle: Segmentation of a time series using the method of Lavielle... In adehabitatLT: Analysis of Animal Movements

 lavielle R Documentation

Segmentation of a time series using the method of Lavielle (1999, 2005)

Description

These functions perform a non-parametric segmentation of a time series using the penalized contrast method of Lavielle (1999, 2005). The function `lavielle` computes the contrast matrix (i.e., the matrix used to segment the series) either from a series of observations or from an animal trajectory. The function `chooseseg` can be used to estimate the number of segments making up the series. The function `findpath` can be used to find the limits of the segments (see Details).

Usage

``````
lavielle(x, ...)

## Default S3 method:
lavielle(x, Lmin, Kmax, ld = 1,
type = c("mean", "var", "meanvar"), ...)

## S3 method for class 'ltraj'
lavielle(x, Lmin, Kmax, ld = 1, which = "dist",
type = c("mean", "var", "meanvar"), ...)

## S3 method for class 'lavielle'
print(x, ...)

chooseseg(lav, S = 0.75, output = c("full","opt"),
draw = TRUE)

findpath(lav, K, plotit = TRUE)

``````

Arguments

| Argument | Description |
| --- | --- |
| `x` | for `lavielle.default`, a vector containing the successive observations making up the series. For `lavielle.ltraj`, an object of class `ltraj`. |
| `Lmin` | an integer value indicating the minimum number of observations in each segment. Should be a multiple of `ld`. |
| `Kmax` | an integer value indicating the maximum number of segments expected in the series. |
| `ld` | an integer value indicating the resolution for the calculation of the contrast function. The contrast function will be evaluated for segments containing the observations `c(1:ld)`, `c(1:(2*ld))`, `c(1:(3*ld))`, and all segments will necessarily contain a multiple of `ld` observations. Note that `ld` should be set to a value greater than 1 if memory problems occur. |
| `type` | the type of contrast function to be used to segment the series (see Details). |
| `which` | a character string giving any syntactically correct R expression involving the descriptive elements in `x` or the variables in the optional attribute `infolocs`. |
| `lav` | an object of class `"lavielle"`. |
| `S` | a value indicating the threshold in the second derivative of the contrast function. |
| `output` | type of output expected (see the section Value). |
| `draw` | a logical value indicating whether the decrease in the contrast function should be plotted. |
| `K` | the number of segments. |
| `plotit` | a logical value indicating whether the segmentation should be plotted. |
| `...` | additional arguments to be passed from or to other functions. |

Details

The method of Lavielle (1999, 2005) per se finds the best segmentation of a time series, given that it is made up of `K` segments. It searches for the segmentation that minimizes a contrast function (measuring the contrast between the actual series and the segmented series). Different contrast functions are available, measuring different aspects of the variation of the series from one segment to the next: when `type = "mean"`, we suppose that only the mean varies between segments; when `type = "var"`, we suppose that only the variance varies between segments; when `type = "meanvar"`, we suppose that both the mean and the variance vary between segments. The user must specify the minimum number of observations `Lmin` in a segment, as well as the maximum number of segments `Kmax` in the series.
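
To make the idea of a contrast function concrete, here is a minimal base R sketch for the `type = "mean"` case. It assumes that the contrast of a segmentation is the sum, over segments, of the squared deviations from each segment mean; the helper `mean_contrast` is hypothetical and is not part of adehabitatLT, whose C implementation may differ in detail.

```r
## Hypothetical helper (not part of adehabitatLT): contrast of a given
## segmentation under the assumption that only the mean changes,
## i.e. the within-segment sum of squared deviations from the segment mean.
mean_contrast <- function(x, starts, ends) {
  sum(mapply(function(a, b) {
    seg <- x[a:b]
    sum((seg - mean(seg))^2)
  }, starts, ends))
}

## A series with one change in mean after observation 50
set.seed(1)
x <- c(rnorm(50, mean = 0), rnorm(50, mean = 3))

## Contrast without any break, with the true break, and with a misplaced break
one_segment <- mean_contrast(x, starts = 1, ends = 100)
true_split  <- mean_contrast(x, starts = c(1, 51), ends = c(50, 100))
wrong_split <- mean_contrast(x, starts = c(1, 21), ends = c(20, 100))

c(one_segment = one_segment, true_split = true_split, wrong_split = wrong_split)
```

The segmentation placed at the true change point gives the smallest contrast of the three, which is the quantity the method minimizes for a fixed `K`.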

There are several approaches to estimate the best number of segments `K` to partition the time series. One possible approach is the graphical examination of the decrease of the contrast function with the number of segments. In theory, there should be a clear "break" in the decrease of this function after the optimal value of `K`. Lavielle (2005) suggested an alternative way to estimate the optimal number of segments automatically, also relying on the presence of a "break" in the decrease of the contrast function. He proposed to choose the last value of `K` for which the second derivative of a standardized contrast function is greater than a threshold `S` (see Lavielle, 2005 for details). Based on numerical experiments, he proposed to choose the value `S = 0.75`. Note, however, that for short time series (i.e. fewer than 500 observations) some simulations indicated that this value may not be optimal and may depend on the value of `Kmax`, so that the graphical method may be more appropriate.
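
The automatic rule can be sketched in base R as follows. This is an illustration only: it assumes the standardization of Lavielle (2005), in which the contrast is rescaled so that it runs from `Kmax` at `K = 1` down to 1 at `K = Kmax`, and the second derivative is taken as a discrete second difference. The function `choose_K` and the vector `J` of contrast values are hypothetical; `chooseseg` should be used on real `"lavielle"` objects.

```r
## Hypothetical helper (not chooseseg): pick the last K whose second
## difference of the standardized contrast exceeds the threshold S.
choose_K <- function(J, S = 0.75) {
  Kmax <- length(J)
  ## Standardized contrast: equals Kmax for K = 1 and 1 for K = Kmax
  Jtild <- (J[Kmax] - J) / (J[Kmax] - J[1]) * (Kmax - 1) + 1
  ## Discrete second derivative, defined for K = 2, ..., Kmax - 1
  K <- 2:(Kmax - 1)
  D <- Jtild[K - 1] - 2 * Jtild[K] + Jtild[K + 1]
  above <- K[D > S]
  list(D = setNames(D, K),
       K_opt = if (length(above) > 0) max(above) else 1L)
}

## Made-up contrast values for K = 1, ..., 8, with a break after K = 3
J <- c(100, 60, 30, 28, 27, 26.5, 26.2, 26)
res <- choose_K(J)
res$D
res$K_opt
```

With these made-up values the contrast drops sharply up to `K = 3` and flattens afterwards, so the second difference falls below `S` from `K = 4` onwards and the rule selects three segments.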

Value

The function `lavielle.default` returns a list of class `lavielle`, with an attribute `"typeseg"` set to `"default"`. This list contains the following elements:

| Element | Description |
| --- | --- |
| `contmat` | The contrast matrix |
| `sumcont` | The optimal contrast |
| `matpath` | The matrix of the paths from the first to the last observation |
| `Kmax` | The maximum number of segments |
| `Lmin` | The minimum number of observations in a segment |
| `ld` | The value of the resolution `ld` |
| `series` | The time series |

The function `lavielle.ltraj` also returns a list of class `lavielle`, with an attribute `"typeseg"` set to `"ltraj"`.

The function `chooseseg` returns the optimal number of segments when `output = "opt"`. When `output = "full"`, it returns a data frame containing, for each possible value of `K`, the value of the contrast function `Jk` and of the second derivative `D` of the standardized contrast function.

The function `findpath` returns a list of vectors giving the index of the first and last observations in each segment when the object of class `"lavielle"` passed as argument has an attribute `"typeseg"` set to `"default"`. When the attribute `"typeseg"` is set to `"ltraj"`, this function returns an object of class `ltraj` where each burst corresponds to a segment.
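
As an illustration of how the list of segment limits can be used, the sketch below builds such a list by hand (the object `fp` here is a stand-in with the structure described above, not actual `findpath` output) and summarizes each segment of a simulated series.

```r
## Hand-built stand-in for the list returned by findpath() in the
## "default" case: one c(first, last) vector of indices per segment.
fp <- list(c(1, 100), c(101, 200), c(201, 300))

## A simulated series with a shifted mean in the middle segment
set.seed(42)
seri <- c(rnorm(100), rnorm(100, mean = 2), rnorm(100))

## Extract the observations of each segment
segments <- lapply(fp, function(lim) seri[lim[1]:lim[2]])

## One row per segment: limits, number of observations, segment mean
seg_summary <- data.frame(
  first = sapply(fp, `[`, 1),
  last  = sapply(fp, `[`, 2),
  n     = lengths(segments),
  mean  = sapply(segments, mean)
)
seg_summary
```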

Note

The contrast matrix is a matrix of size `n*n` (with `n` the number of observations in the series). If `n` is large, memory problems may occur. In this case, setting `ld` to a value greater than one reduces the size of this matrix (i.e. it will be of size `k*k`, where `k = floor(n/ld)`). However, this also reduces the resolution of the segmentation, so that the segment limits will be estimated less precisely.
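
The effect of `ld` on memory can be estimated with a short calculation. The sketch below assumes that each matrix entry is stored as an 8-byte double and counts the contrast matrix only, so it should be read as a rough lower bound; the helper `contrast_matrix_size` is hypothetical.

```r
## Hypothetical helper: dimension and approximate size of the k*k
## contrast matrix, assuming 8 bytes per entry (double precision).
contrast_matrix_size <- function(n, ld = 1, bytes_per_value = 8) {
  k <- floor(n / ld)
  c(k = k, megabytes = k * k * bytes_per_value / 1e6)
}

## A series of 10000 observations at full and at reduced resolution
full    <- contrast_matrix_size(10000, ld = 1)
reduced <- contrast_matrix_size(10000, ld = 5)
rbind(full, reduced)
```

Under these assumptions, going from `ld = 1` to `ld = 5` divides the number of entries by 25, at the cost of segment limits that can only fall on multiples of 5 observations.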

Author(s)

Clement Calenge <clement.calenge@ofb.gouv.fr>. The code is a C translation based on the Matlab code of M. Lavielle.

References

Lavielle, M. (1999) Detection of multiple changes in a sequence of dependent variables. Stochastic Processes and their Applications, 83: 79–102.

Lavielle, M. (2005) Using penalized contrasts for the change-point problem. Report number 5339, Institut national de recherche en informatique et en automatique.

Examples

``````
#################################################
##
## A simulated series

suppressWarnings(RNGversion("3.5.0"))
set.seed(129)
seri <- c(rnorm(100), rnorm(100, mean=2),
rnorm(100), rnorm(100, mean=-3),
rnorm(100), rnorm(100, mean=2))
plot(seri, type="l", xlab="time", ylab="Series")

## Segmentation:
(l <- lavielle(seri, Lmin=10, Kmax=20))

## choose the number of segments
chooseseg(l)

## There is a clear break in the
## decrease of the contrast function after K = 6
## Moreover, D(6) >> 0.75 and D(7) << 0.75
## We choose 6 segments:
fp <- findpath(l, 6)
fp

## This list gives the limits of the segments
## for example, to get the first segment:
seg <- 1
firstseg <- seri[fp[[seg]][1]:fp[[seg]][2]]

####################################################
##
## Now, changes of variance

## A simulated series
suppressWarnings(RNGversion("3.5.0"))
set.seed(129)
seri <- c(rnorm(100), rnorm(100, sd=2),
rnorm(100), rnorm(100, sd=3),
rnorm(100), rnorm(100, sd=2))
plot(seri, type="l", xlab="time", ylab="Series")

## Segmentation:
(l <- lavielle(seri, Lmin=10, Kmax=20, type="var"))

## choose the number of segments
chooseseg(l)

## There is a clear break in the
## decrease of the contrast function after K = 6
## Moreover, D(6) >> 0.75 and D(7) << 0.75
## We choose 6 segments:
fp <- findpath(l, 6)
fp

## This list gives the limits of the segments
## for example, to get the first segment:
seg <- 1
firstseg <- seri[fp[[seg]][1]:fp[[seg]][2]]

#################################################
##
## Example of segmentation of a trajectory

## Show the trajectory
data(porpoise)
gus <- porpoise[1]
plot(gus)

## Show the changes in the distance between
## successive relocations with the time
plotltr(gus, "dist")

## Segmentation of the trajectory based on these distances
lav <- lavielle(gus, Lmin=2, Kmax=20)

## Choose the number of segments
chooseseg(lav)
## 4 segments seem a good choice

## Show the partition
kk <- findpath(lav, 4)
plot(kk)

``````

adehabitatLT documentation built on Sept. 11, 2024, 7:15 p.m.