# jointSeg: Joint segmentation of multivariate signals In jointseg: Joint segmentation of multivariate (copy number) signals

## Description

Joint segmentation of multivariate signals in two steps:

1. first-pass segmentation. By default, a fast, greedy approach is used (see `method`).

2. pruning of the candidate change points obtained by dynamic programming

## Usage

 ```1 2 3 4 5 6``` ```jointSeg(Y, method = "RBS", stat = NULL, dpStat = stat, segFUN = NULL, jitter = NULL, modelSelectionMethod = ifelse(match(method, c("DynamicProgramming", "RBS", "GFLars"), nomatch = 0) > 0, "Lebarbier", "none"), modelSelectionOnDP = (match(method, c("DynamicProgramming", "RBS", "GFLars"), nomatch = 0) > 0), ..., profile = FALSE, verbose = FALSE) ```

## Arguments

 `Y` The signal to be segmented (a matrix or a numeric vector) `method` A `character` value, the type of segmentation method used. May be one of: "RBS"Recursive Binary Segmentation (the default), see `segmentByRBS` as described in Gey and Lebarbier (2005) "GFLars"Group fused LARS as described in Bleakley and Vert (2011). "DP"Dynamic Programming (Bellman, 1956). For univariate signals the pruned DP of Rigaill et al (2010) is used. "other"The segmentation method is passed as a function using argument `segFUN` (see examples in directory `otherMethods` of the `jointseg` package). `stat` A vector containing the names or indices of the columns of `Y` to be segmented `dpStat` A vector containing the names or indices of the columns of `Y` to be segmented at the second round of segmentation. Defaults to the value of argument `stat`. `segFUN` The segmentation function to be used when `method` is set to `other`. Not used otherwise. `jitter` Uncertainty on breakpoint position after initial segmentation. Defaults to `NULL`. See Details. `modelSelectionMethod` Which method is used to perform model selection. `modelSelectionOnDP` If `TRUE` (the default), model selection is performed on segmentation after dynamic programming; else model selection is performed on initial segmentation. Only applies to methods "DP", "RBS" and "GFLars". `...` Further arguments to be passed to the lower-level segmentation method determined by argument `method`. `profile` Trace time and memory usage ? `verbose` A `logical` value: should extra information be output ? Defaults to `FALSE`.

## Details

If the return value of the initial segmentation has an element named `dpseg`, then initial segmentation results are not pruned by dynamic programming.

If `jitter` is not `NULL`, it should be a vector of integer indices. The set of candidate breakpoints passed on to dynamic programming is augmented by all indices distant from an element of `jitter` from one of the candidates. For example, if `jitter==c(-1, 0, 1)` and the initial set of breakpoints is `c(1,5)` then dynamic programming is run on `c(1,2,4,5,6)`.

If `modelSelectionOnDP` is set to `FALSE`, then model selection is run on the sets of the form `bkp[1:k]` for 1 ≤q k ≤q length(bkp), where `bkp` is the set of breakpoints identified by the initial segmentation. In particular, this implies that the candidate breakpoints in `bkp` are sorted by order of appearance and not by position.

## Value

A list with elements:

 `bestBkp` Best set of breakpoints after dynamic programming `initBkp` Results of the initial segmentation, using 'doNnn', where 'Nnn' corresponds to argument `method` `dpBkpList` Results of dynamic programming, a list of vectors of breakpoint positions for the best model with k breakpoints for k=1, 2, ... K where `K=length(initBkp)` `prof` a `matrix` providing time usage (in seconds) and memory usage (in Mb) for the main steps of the program. Only defined if argument `profile` is set to `TRUE`

## Author(s)

Morgane Pierre-Jean and Pierre Neuvial

## References

Bleakley, K., & Vert, J. P. (2011). The group fused lasso for multiple change-point detection. arXiv preprint arXiv:1106.4199.

Vert, J. P., & Bleakley, K. (2010). Fast detection of multiple change-points shared by many signals using group LARS. Advances in Neural Information Processing Systems, 23, 2343-2351.

Gey, S., & Lebarbier, E. (2008). Using CART to Detect Multiple Change Points in the Mean for Large Sample. http://hal.archives-ouvertes.fr/hal-00327146/

Rigaill, G. (2010). Pruned dynamic programming for optimal multiple change-point detection. arXiv preprint arXiv:1004.0887.

Bellman, R. (1956). Dynamic programming and Lagrange multipliers. Proceedings of the National Academy of Sciences of the United States of America, 42(10), 767.

`pruneByDP`
 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19``` ```## A two-dimensional signal p <- 2 trueK <- 10 len <- 1e4 sim <- randomProfile(len, trueK, 1, p) Y <- sim\$profile K <- 2*trueK res <- jointSeg(Y, method="RBS", K=K) bkp <- res\$bestBkp getTpFp(bkp, sim\$bkp, tol=5, relax = -1) ## true and false positives plotSeg(Y, list(sim\$bkp, res\$bestBkp), col=1) ## Now we add some NA:s in one dimension jj <- sim\$bkp[1] Y[jj-seq(-10,10), p] <- NA res2 <- jointSeg(Y, method="RBS", K=K, verbose=TRUE) bkp <- res2\$bestBkp getTpFp(res2\$bestBkp, sim\$bkp, tol=5, relax = -1) ## true and false positives ```