A model selection procedure is applied after CBS segmentation. In other words, we assess which of the over-detected change points called by CBS are really necessary. More specifically, we use the $K$ change points as $K$ predictors for the input $X_i$, $i = 0, \ldots, n$, fit a linear model, and select variables by stepwise regression as implemented in lars() (from the R package lars). The optimal change points can then be selected from the LARS solution path using different criteria.
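The idea of treating candidate change points as predictors can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the candidate positions are hypothetical, each candidate is encoded as a step-function column, and base R's step() (forward selection with a BIC penalty) stands in for the lars() solution path.

```r
set.seed(1)

# Simulated signal whose mean changes after positions 100 and 200
y <- c(rnorm(100, 0.2, 0.05), rnorm(100, 0.4, 0.05), rnorm(100, 0.3, 0.05))
n <- length(y)

# Hypothetical over-detected candidate change points (e.g. as CBS might return)
candidates <- c(50, 100, 150, 200, 250)

# One step-function predictor per candidate: 0 up to the change point, 1 after it
X <- sapply(candidates, function(cp) as.numeric(seq_len(n) > cp))
colnames(X) <- paste0("cp", candidates)
dat <- data.frame(y = y, X)

# Forward stepwise selection with a BIC penalty (k = log(n));
# this is a stand-in for the lars() path, not the same algorithm
null_fit <- lm(y ~ 1, data = dat)
full_formula <- reformulate(colnames(X), response = "y")
sel <- step(null_fit, scope = full_formula, direction = "forward",
            k = log(n), trace = 0)

# Recover the positions of the candidates that were kept
selected <- sort(as.integer(sub("cp", "", names(coef(sel))[-1])))
print(selected)
```

With the lars package installed, the same design matrix could instead be passed to lars::lars(X, y, type = "stepwise") to obtain a full solution path.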
data | A GRanges object, output of SomatiCAFormat().
selection | Model selection parameters.
collapse.k | Number of data points collapsed.
ncores | Number of cores used.
verbose | Whether progress messages are shown.
variation.control | A logical value, whether pseudo points are used to smooth the segment. Default is TRUE.
rss | A logical value, whether a cutoff based on the residual sum of squares is used. Default is FALSE.
S | The cutoff based on the residual sum of squares. Default is 0.1.
k | The window size used to smooth the outliers.
... | Arguments passed to segment() in the DNAcopy package.
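The documentation does not spell out how the cutoff S is applied, so the following is only one plausible reading, written as a sketch in base R. It assumes the change points arrive in the order a selection path added them (the `path` vector is hypothetical) and that a change point is kept only while it reduces the residual sum of squares by at least a fraction S of the current value. The helper `segment_rss` is likewise illustrative and not part of the package.

```r
set.seed(2)

# Simulated signal whose mean changes after positions 100 and 200
y <- c(rnorm(100, 0.2, 0.05), rnorm(100, 0.4, 0.05), rnorm(100, 0.3, 0.05))

# Hypothetical order in which a selection path added candidate change points
path <- c(100, 200, 150, 50)

# RSS of a piecewise-constant fit for a given set of change points
segment_rss <- function(y, cps) {
  # Segment label = number of change points strictly before each position
  seg_id <- vapply(seq_along(y), function(i) sum(i > cps), numeric(1))
  sum((y - ave(y, seg_id))^2)
}

# RSS after adding 0, 1, 2, ... change points along the path
rss_path <- sapply(0:length(path), function(j) segment_rss(y, path[seq_len(j)]))

# Relative drop in RSS contributed by each successive change point
rel_drop <- -diff(rss_path) / rss_path[-length(rss_path)]

# Assumed rule: stop at the first change point whose relative drop is below S
S <- 0.1
n_keep <- if (all(rel_drop >= S)) length(path) else which(rel_drop < S)[1] - 1
kept <- sort(path[seq_len(n_keep)])
print(kept)
```

Under this reading, a larger S would yield fewer change points, which fits the advice in the example section to consider the cutoff for noisier sequencing data.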
segment | An S4 object of class "Segmented".
hetsites | Heterozygous sites used in segmentation, unsmoothed.
Mengjie Chen
Efron, Hastie, Johnstone and Tibshirani (2003) "Least Angle Regression" (with discussion). Annals of Statistics.
Olshen, A. B., Venkatraman, E. S., Lucito, R., Wigler, M. (2004). Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics 5: 557-572.
Venkatraman, E. S., Olshen, A. B. (2007). A faster circular binary segmentation algorithm for the analysis of array CGH data. Bioinformatics 23: 657-63.
See Also: SomatiCAFormat, lars, segment.
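# Assumes the SomatiCA package, which provides SomatiCAFormat() and larsCBSsegment()
library(SomatiCA)
# Simulate LAF values for six consecutive segments with different means
rawLAF <- c(rnorm(300, 0.2, 0.05), rnorm(300, 0.4, 0.05), rnorm(200, 0.3, 0.05),
            rnorm(200, 0.2, 0.05), rnorm(200, 0.3, 0.05), rnorm(250, 0.4, 0.05))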
rawLAF <- ifelse(rawLAF>0.5, 1-rawLAF, rawLAF)
germLAF <- c(rnorm(800+650, 0.4, 0.05))
germLAF <- ifelse(germLAF>0.5, 1-germLAF, germLAF)
reads1 <- c(rpois(300, 25), rpois(300, 50), rpois(200, 60), rpois(200, 25), rpois(200, 40), rpois(250, 50))
reads2 <- rpois(800+650,50)
chr <- c(rep("chr1", 800), rep("chr2", 650))
position <- c(c(1:800), c(1:650))
zygo <- rep("het", 800+650)
x <- data.frame(chr, as.integer(position), as.character(zygo), as.integer(reads1), rawLAF, as.integer(reads2), germLAF)
colnames(x) <- c("seqnames", "start", "zygosity", "tCount", "LAF", "tCountN", "germLAF")
data <- SomatiCAFormat(x)
### This is an easy example without much noise.
### Consider using rss = TRUE to select change points from sequencing data.
seg <- larsCBSsegment(data, rss = FALSE)
plotSegment(seg$segment, data, k = 1, smooth = FALSE)
plotSegment(seg$segment, data, k = 2, smooth = FALSE)