fastseg: Detection of breakpoints using a fast segmentation algorithm...
In fastseg: fastseg - a fast segmentation algorithm

Detection of breakpoints using a fast segmentation algorithm based on the cyber t-test.

  fastseg(x, type = 1, alpha = 0.05, segMedianT, minSeg = 4,
    eps = 0, delta = 5, maxInt = 40, squashing = 0,
    cyberWeight = 10)

`x`	Values to be segmented either in the format of a sorted GRanges object, ExpressionSet object, matrix or vector.
`type`	Parameter that sets the type of test. If set to 1, the left window is tested against the right window. If set to 2, the segment is also tested against the global mean. (Default = 1).
`alpha`	A value between 0 and 1 is interpreted as the ratio of initial breakpoints. An integer greater than one is interpreted as the number of desired breakpoints. Increasing this parameter leads to more segments. (Default = 0.05, as given in the usage signature.)
`segMedianT`	A numeric vector of length two with the thresholds on segment median values that are considered significant. Only segments with a median above the first or below the second value are kept in a final merging step. If missing, the algorithm will try to find a reasonable value by using z-scores. (Default "missing".)
`minSeg`	The minimal segment length. (Default = 4).
`eps`	Minimal distance between consecutive values. Only consecutive values with a minimum distance of "eps" are tested, which makes the segmentation algorithm even faster. If all values should be tested, "eps" can be set to zero. If missing, the algorithm will try to find a reasonable value by using quantiles. (Default = 0.)
`delta`	Segment extension parameter. If delta consecutive extensions of the left and the right segment do not lead to a better p-value the testing is stopped. (Default = 5).
`maxInt`	Maximal length of the left and the right segment. (Default = 40).
`squashing`	The degree of squashing of the input values. If set to zero no squashing is performed. (Default = 0).
`cyberWeight`	The nu parameter of the cyber t-test. Can be interpreted as the weight of the global variance. The higher the value the more small segments with high variance will be significant. (Default = 10).
`...`	Further arguments passed to the plot function.
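
The effect of `cyberWeight` can be illustrated with a small base-R sketch of a regularized t-statistic in the style of the cyber t-test. This is not the code used inside fastseg: the helper names `regularized_var` and `cyber_t`, the variance formula, and the way the global variance is estimated are all assumptions made for illustration. The sketch only shows how a weight `nu` pulls the variance of a short window toward a global variance before the left and right windows are compared.

```r
# Hypothetical helpers for illustration; they are NOT part of the fastseg API.

# Variance of window `w`, pulled toward `global_var`.
# The larger `nu`, the stronger the pull (assumed cyber-t style formula).
# Requires length(w) >= 2 and nu + length(w) > 2.
regularized_var <- function(w, global_var, nu) {
  n <- length(w)
  (nu * global_var + (n - 1) * var(w)) / (nu + n - 2)
}

# t-like statistic comparing the means of a left and a right window,
# using the regularized variances instead of the plain sample variances.
cyber_t <- function(left, right, global_var, nu) {
  v_left <- regularized_var(left, global_var, nu)
  v_right <- regularized_var(right, global_var, nu)
  (mean(left) - mean(right)) /
    sqrt(v_left / length(left) + v_right / length(right))
}

set.seed(1)
# Synthetic signal with one level shift between positions 50 and 51.
x <- c(rnorm(50, mean = 0, sd = 0.2), rnorm(50, mean = 1, sd = 0.2))
left <- x[41:50]
right <- x[51:60]

# Rough noise estimate from first differences (an assumption of this sketch).
global_var <- var(diff(x)) / 2

t_small_nu <- cyber_t(left, right, global_var, nu = 1)
t_large_nu <- cyber_t(left, right, global_var, nu = 10)
print(c(nu_1 = t_small_nu, nu_10 = t_large_nu))
```

With a large `nu`, the regularized variance of each window approaches `global_var`, so a short window with a noisy local variance estimate is judged mostly against the global noise level.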

A data frame containing the segments.
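
As a quick way to inspect the returned segments, the following sketch runs `fastseg` on a synthetic vector, once with the defaults from the usage signature and once with a few arguments changed. The data and the chosen values (`minSeg = 10`, `alpha = 0.1`) are assumptions for illustration, not recommended settings. Because the class of the result may differ between package versions, the sketch prints the class and counts segments with `NROW()` rather than relying on a specific structure.

```r
library(fastseg)

set.seed(42)
# Synthetic profile: baseline noise with one elevated stretch (positions 201-260).
x <- rnorm(500, mean = 0, sd = 0.2)
x[201:260] <- x[201:260] + 1

# Segmentation with the default arguments.
res_default <- fastseg(x)

# Illustrative tuning: longer minimal segments and a higher ratio of
# initial breakpoints (values chosen arbitrarily for this sketch).
res_tuned <- fastseg(x, minSeg = 10, alpha = 0.1)

# Inspect the result without assuming a particular class.
print(class(res_default))
print(res_default)

n_default <- NROW(res_default)
n_tuned <- NROW(res_tuned)
print(c(default = n_default, tuned = n_tuned))
```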

Guenter Klambauer klambauer@bioinf.jku.at

library(fastseg)

#####################################################################
### the data
#####################################################################
data(coriell)
head(coriell)

samplenames <- colnames(coriell)[4:5]
data <- as.matrix(coriell[4:5])
data[is.na(data)] <- median(data, na.rm=TRUE)
chrom <- coriell$Chromosome
maploc <- coriell$Position


###########################################################
## GRanges
###########################################################

library("GenomicRanges")

## with both individuals
gr <- GRanges(seqnames=chrom,
        ranges=IRanges(maploc, end=maploc))
mcols(gr) <- data
colnames(mcols(gr)) <- samplenames
res <- fastseg(gr)

## with one individual
gr2 <- gr
data2 <- as.matrix(data[, 1])
colnames(data2) <- "sample1"
mcols(gr2) <- data2
res <- fastseg(gr2)


###########################################################
## vector
###########################################################
data2 <- data[, 1]
res <- fastseg(data2)



###########################################################
## matrix
###########################################################
data2 <- data[1:400, ]
res <- fastseg(data2)

        Clone Chromosome Position Coriell.05296 Coriell.13330
1  GS1-232B23          1        1      0.000359      0.207470
2  RP11-82d16          1      469      0.008824      0.063076
3  RP11-62m23          1     2242     -0.000890      0.123881
4  RP11-60j11          1     4505      0.075875      0.154343
5 RP11-111O05          1     5441      0.017303     -0.043890
6  RP11-51b04          1     7001     -0.006770      0.094144