lavielle    R Documentation

Segmentation of a time series using the method of Lavielle (1999, 2005)

Description

These functions perform a non-parametric segmentation of a time series using the penalized contrast method of Lavielle (1999, 2005). The function lavielle computes the contrast matrix (i.e., the matrix used to segment the series) either from a series of observations or from an animal trajectory. The function chooseseg can be used to estimate the number of segments making up the trajectory. The function findpath can be used to find the limits of the segments (see Details).

Usage

lavielle(x, ...)

## Default S3 method:
lavielle(x, Lmin, Kmax, ld = 1,
                           type = c("mean", "var", "meanvar"), ...) 

## S3 method for class 'ltraj'
lavielle(x, Lmin, Kmax, ld = 1, which = "dist",
                         type = c("mean", "var", "meanvar"), ...) 

## S3 method for class 'lavielle'
print(x, ...)

chooseseg(lav, S = 0.75, output = c("full","opt"),
          draw = TRUE)

findpath(lav, K, plotit = TRUE)

Arguments

x

for lavielle.default, a vector containing the successive observations making up the series. For lavielle.ltraj, an object of class ltraj.

Lmin

an integer value indicating the minimum number of observations in each segment. Should be a multiple of ld.

Kmax

an integer value indicating the maximum number of segments expected in the series

ld

an integer value indicating the resolution for the calculation of the contrast function. The contrast function will be evaluated for segments containing the observations c(1:ld), c(1:(2*ld)), c(1:(3*ld)), etc., and all segments will necessarily contain a multiple of ld observations. Note that ld should be set to a value greater than 1 if memory problems occur.

type

the type of contrast function to be used to segment the series (see Details)

which

a character string giving any syntactically correct R expression involving the descriptive elements in x or the variables in the optional attribute infolocs.

lav

an object of class "lavielle"

S

a value indicating the threshold in the second derivative of the contrast function

output

type of output expected (see the section Value)

draw

a logical value indicating whether the decrease in the contrast function should be plotted

K

The number of segments

plotit

a logical value indicating whether the segmentation should be plotted

...

additional arguments to be passed from or to other functions

Details

The method of Lavielle (1999, 2005) per se finds the best segmentation of a time series, given that it is made up of K segments. It searches for the segmentation that minimizes a contrast function (measuring the contrast between the actual series and the segmented series). Different contrast functions are available, measuring different aspects of the variation of the series from one segment to the next: when type = "mean", we suppose that only the mean varies between segments; when type = "var", we suppose that only the variance varies between segments; when type = "meanvar", we suppose that both the mean and the variance vary between segments. A value must be specified for the minimum number of observations Lmin in a segment, as well as for the maximum number of segments Kmax in the series.
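
The search described above can be sketched in plain R. This is an illustration only, not the package's C implementation: it assumes that, for type = "mean", the contrast of a segment is the sum of squared deviations from the segment mean, and it finds the partition into exactly K segments by dynamic programming. The names seg_cost and best_segmentation are hypothetical and are not part of the package.

```r
## Illustrative sketch, NOT the package's implementation.
## Assumption: the cost of a segment is the sum of squared deviations
## from its mean (a "mean"-type contrast).
seg_cost <- function(x, i, j) {
  s <- x[i:j]
  sum((s - mean(s))^2)
}

## Hypothetical helper: best partition of x into exactly K segments,
## each containing at least Lmin observations.
best_segmentation <- function(x, K, Lmin = 1) {
  n <- length(x)
  ## cost[i, j] = cost of a single segment covering observations i..j
  cost <- matrix(Inf, n, n)
  for (i in seq_len(n)) {
    for (j in i:n) {
      if (j - i + 1 >= Lmin) cost[i, j] <- seg_cost(x, i, j)
    }
  }
  ## best[k, j] = minimal total cost of splitting 1..j into k segments
  ## prev[k, j] = last observation of segment k - 1 in that split
  best <- matrix(Inf, K, n)
  prev <- matrix(NA_integer_, K, n)
  best[1, ] <- cost[1, ]
  for (k in seq_len(K)[-1]) {
    for (j in seq_len(n)[-1]) {
      ## candidate t = 1..(j - 1): segment k covers (t + 1)..j
      cand <- best[k - 1, 1:(j - 1)] + cost[2:j, j]
      prev[k, j] <- which.min(cand)
      best[k, j] <- min(cand)
    }
  }
  ## Backtrack from the last observation to recover the segment limits
  ends <- integer(K)
  ends[K] <- n
  for (k in rev(seq_len(K)[-1])) ends[k - 1] <- prev[k, ends[k]]
  list(first = c(1L, head(ends, -1) + 1L),
       last = ends,
       contrast = best[K, n])
}

## Usage on a small toy series with two changes in the mean
x <- c(rep(0, 5), rep(10, 5), rep(0, 5))
res <- best_segmentation(x, K = 3, Lmin = 2)
res$first
res$last
```

The sketch stores an n*n cost matrix, which is the same quantity whose size motivates the ld argument for long series.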

There are several approaches to estimate the best number of segments K to partition the time series. One possible approach is the graphical examination of the decrease of the contrast function with the number of segments. In theory, there should be a clear "break" in the decrease of this function after the optimal value of K. Lavielle (2005) suggested an alternative way to estimate the optimal number of segments automatically, also relying on the presence of a "break" in the decrease of the contrast function. He proposed to choose the last value of K for which the second derivative of a standardized contrast function is greater than a threshold S (see Lavielle, 2005 for details). Based on numerical experiments, he proposed the value S = 0.75. Note, however, that for short time series (i.e. fewer than 500 observations), some simulations indicated that this value may not be optimal and may depend on the value of Kmax, so that the graphical method may be more appropriate.
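
The second-derivative criterion can be illustrated with a few lines of base R. This is a sketch under stated assumptions, not the code of chooseseg: it assumes the contrast is standardized so that it equals Kmax for one segment and 1 for Kmax segments, and that the second derivative is the usual discrete one. The function choose_K and the contrast values J are hypothetical.

```r
## Hypothetical helper illustrating the second-derivative criterion.
## J: optimal contrasts J[K] for K = 1, ..., Kmax segments (Kmax >= 3).
choose_K <- function(J, S = 0.75) {
  Kmax <- length(J)
  ## Assumed standardization: Jt[1] = Kmax and Jt[Kmax] = 1
  Jt <- (J[Kmax] - J) / (J[Kmax] - J[1]) * (Kmax - 1) + 1
  ## Discrete second derivative, defined for K = 2, ..., Kmax - 1
  K <- 2:(Kmax - 1)
  D <- Jt[K - 1] - 2 * Jt[K] + Jt[K + 1]
  above <- K[D > S]
  ## Fall back to a single segment if no value exceeds the threshold
  if (length(above) == 0) 1L else max(above)
}

## Made-up contrast values with a sharp break after K = 3
J <- c(100, 60, 20, 18, 16, 14)
k_opt <- choose_K(J)
k_opt
```

Plotting J against K alongside this calculation remains useful, since the automatic choice may be unreliable for short series.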

Value

The function lavielle.default returns a list of class lavielle, with an attribute "typeseg" set to "default". This list contains the following elements:

contmat

The contrast matrix

sumcont

The optimal contrast

matpath

The matrix of the paths from the first to the last observation

Kmax

The maximum number of segments

Lmin

The minimum number of observations in a segment

ld

the value of the resolution ld

series

The time series

The function lavielle.ltraj also returns a list of class lavielle, with an attribute "typeseg" set to "ltraj".

The function chooseseg returns the optimal number of segments when output = "opt", and a data frame containing the value of the contrast function Jk and of the second derivative D of the standardized contrast function for each possible value of K when output = "full".

The function findpath returns a list containing vectors giving the index of the first and last observations in each segment, when the object of class "lavielle" passed as argument is characterized by an attribute "typeseg" set to "default". When the attribute "typeseg" is set to "ltraj", this function returns an object of class ltraj where each burst corresponds to a segment.
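
As an illustration of how the list returned for "typeseg" set to "default" can be used, the sketch below summarizes every segment at once. The objects seri and fp are hypothetical stand-ins built by hand, with fp shaped like the list described above (one vector of first and last index per segment); they are not output from the package.

```r
## Hypothetical stand-ins: a toy series and a list shaped like the
## findpath() result (first and last observation of each segment)
seri <- c(1, 2, 3, 10, 11, 12, 13, 5, 6)
fp <- list(c(1, 3), c(4, 7), c(8, 9))

## Extract the observations belonging to each segment
segments <- lapply(fp, function(lim) seri[lim[1]:lim[2]])

## One row per segment: limits, length and mean
seg_summary <- data.frame(
  first = sapply(fp, `[`, 1),
  last  = sapply(fp, `[`, 2),
  n     = lengths(segments),
  mean  = sapply(segments, mean)
)
seg_summary
```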

Note

The contrast matrix is a matrix of size n*n (with n the number of observations in the series). If n is large, memory problems may occur. In this case, setting ld to a value greater than one reduces the size of this matrix (i.e. it will be of size k*k, where k = floor(n/ld)). However, this also reduces the resolution of the segmentation, so that the segment limits will be estimated less precisely.
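
A rough order of magnitude for the size of the contrast matrix can be obtained as follows. This is a back-of-the-envelope sketch that assumes each cell is stored as an 8-byte double and ignores any other memory used during the computation; the function contrast_matrix_mb is hypothetical.

```r
## Assumption: one 8-byte double per cell of the k*k contrast matrix
contrast_matrix_mb <- function(n, ld = 1, bytes_per_cell = 8) {
  k <- floor(n / ld)
  k^2 * bytes_per_cell / 1e6
}

contrast_matrix_mb(20000)           # 3200 MB
contrast_matrix_mb(20000, ld = 10)  # 32 MB
```

In this example, multiplying ld by 10 divides the matrix size by 100, at the price of segment limits that can only fall on multiples of 10 observations.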

Author(s)

Clement Calenge <clement.calenge@ofb.gouv.fr>. The code is a C translation based on the Matlab code of M. Lavielle.

References

Lavielle, M. (1999) Detection of multiple changes in a sequence of dependent variables. Stochastic Processes and their Applications, 83: 79–102.

Lavielle, M. (2005) Using penalized contrasts for the change-point problem. Report number 5339, Institut national de recherche en informatique et en automatique.

Examples


#################################################
##
## A simulated series

library(adehabitatLT)
suppressWarnings(RNGversion("3.5.0"))
set.seed(129)
seri <- c(rnorm(100), rnorm(100, mean=2),
          rnorm(100), rnorm(100, mean=-3),
          rnorm(100), rnorm(100, mean=2))
plot(seri, type="l", xlab="time", ylab="Series")

## Segmentation:
(l <- lavielle(seri, Lmin=10, Kmax=20))

## choose the number of segments
chooseseg(l)

## There is a clear break in the
## decrease of the contrast function after K = 6
## Moreover, D(6) >> 0.75 and D(7) << 0.75
## We choose 6 segments:
fp <- findpath(l, 6)
fp

## This list gives the limits of the segments
## for example, to get the first segment:
seg <- 1
firstseg <- seri[fp[[seg]][1]:fp[[seg]][2]]

####################################################
##
## Now, changes of variance

## A simulated series
suppressWarnings(RNGversion("3.5.0"))
set.seed(129)
seri <- c(rnorm(100), rnorm(100, sd=2),
          rnorm(100), rnorm(100, sd=3),
          rnorm(100), rnorm(100, sd=2))
plot(seri, type="l", xlab="time", ylab="Series")

## Segmentation:
(l <- lavielle(seri, Lmin=10, Kmax=20, type="var"))

## choose the number of segments
chooseseg(l)

## There is a clear break in the
## decrease of the contrast function after K = 6
## Moreover, D(6) >> 0.75 and D(7) << 0.75
## We choose 6 segments:
fp <- findpath(l, 6)
fp

## This list gives the limits of the segments
## for example, to get the first segment:
seg <- 1
firstseg <- seri[fp[[seg]][1]:fp[[seg]][2]]

#################################################
##
## Example of segmentation of a trajectory

## Show the trajectory
data(porpoise)
gus <- porpoise[1]
plot(gus)

## Show the changes in the distance between
## successive relocations with the time
plotltr(gus, "dist")

## Segmentation of the trajectory based on these distances
lav <- lavielle(gus, Lmin=2, Kmax=20)

## Choose the number of segments
chooseseg(lav)
## 4 segments seem a good choice

## Show the partition
kk <- findpath(lav, 4)
plot(kk)


adehabitatLT documentation built on Sept. 11, 2024, 7:15 p.m.