Multi-platform circular binary segmentation with model selection using modified BIC.

Description

This function performs multi-platform circular binary segmentation for detecting DNA copy number changes. It determines the number of change-points in the segmentation using the modified BIC.

Usage

mpcbs.mbic(y, pos, anchor, MIN.SNPs = 2, MAX.CHPTS = 30, platform.names, plots = TRUE)

Arguments

y

A list of intensity vectors, one for each platform. y[[k]] should hold the intensity levels corresponding to pos[[k]] for platform k.

pos

A list of sorted integer vectors, one for each platform. pos[[k]] should give the positions, in increasing order, of the probes of the k-th platform.

anchor

The anchor set, returned by a call to merge.pos(...)

MIN.SNPs

The minimum number of SNPs for each platform in a segment.

MIN.BP.LEN

The minimum base pair length of a segment.

rratio

An array containing the signal response ratio of each platform.

MAX.CHPTS

The maximum number of change-points to try.

platform.names

The names of the platforms.

plots

A logical indicating whether to make progress plots.

plotspdf

An optional pdf file where the progress plots will be recorded.

use.filtered.scan

A logical indicating whether to use the faster filtered scan statistic (highly recommended).
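The shape requirements on y and pos described above can be checked before calling the function. The following is a minimal sketch in base R; check_inputs is a hypothetical helper written for illustration, not part of the package, and the toy data are made up.

```r
# Hypothetical helper (not part of the package): check the shape of y and pos.
check_inputs <- function(y, pos) {
  # Both arguments must be lists with one element per platform.
  stopifnot(is.list(y), is.list(pos), length(y) == length(pos))
  for (k in seq_along(pos)) {
    # Each platform needs exactly one intensity per probe position.
    stopifnot(length(y[[k]]) == length(pos[[k]]))
    # Positions must be in increasing order.
    stopifnot(!is.unsorted(pos[[k]]))
  }
  invisible(TRUE)
}

# Toy data for two platforms (illustrative values only).
pos <- list(c(100L, 250L, 400L), c(120L, 300L))
y   <- list(c(0.1, -0.2, 0.05), c(0.3, 0.25))
ok  <- check_inputs(y, pos)
```

The helper stops with an error on the first violated condition, so a silent return means the inputs have the documented shape.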

Details

MIN.SNPs is the minimum number of SNPs that a platform needs to have in a window to contribute to the combined scan statistic. If a platform does not have enough SNPs in the window, it contributes 0 to the overall statistic; the window may still be called if there is enough evidence from the other platforms.
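To illustrate the MIN.SNPs rule, the sketch below counts each platform's probes inside a candidate window and flags which platforms would contribute. It assumes a half-open window (start, end], which is a choice made here for illustration; probes_in_window is a hypothetical helper, and the positions are made up. It does not compute the scan statistic itself.

```r
# Toy probe positions for three platforms (illustrative values only).
pos <- list(c(10, 20, 30, 40, 50), c(15, 45), c(5, 25, 35, 55))
MIN.SNPs <- 2

# Hypothetical helper: count each platform's probes inside (start, end].
probes_in_window <- function(pos, start, end) {
  vapply(pos, function(p) sum(p > start & p <= end), integer(1))
}

counts <- probes_in_window(pos, start = 18, end = 42)

# A platform contributes to the combined statistic only if it has
# at least MIN.SNPs probes in the window; otherwise it contributes 0.
contributes <- counts >= MIN.SNPs
```

Here the second platform has no probes in the window, so only the first and third platforms would contribute.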

MIN.BP.LEN is the minimum base pair length of any call.

The modified BIC is computed for the best segmentation containing each of 1 to MAX.CHPTS change-points. The returned segmentation is the one that maximizes the modified BIC.
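The selection rule can be sketched as an argmax over the modified BIC scores. The scores below are made up, and the sketch assumes that the j-th entry corresponds to the best segmentation with j change-points; the package's internal indexing may differ.

```r
# Made-up modified BIC scores for segmentations with 1..5 change-points.
mbic <- c(-12.4, 3.1, 8.7, 6.2, 1.9)

# Assumed convention: mbic[j] belongs to the segmentation with j change-points,
# so the index of the maximum is the selected number of change-points.
best.n.chpts <- which.max(mbic)
```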

If rratio is not specified, it is assumed to be 1 for all platforms.
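The default described above amounts to a vector of ones with one entry per platform. A minimal sketch, where default_rratio is a hypothetical helper and not the package's own code:

```r
# Hypothetical helper: fall back to a response ratio of 1 per platform.
default_rratio <- function(rratio = NULL, K) {
  if (is.null(rratio)) rep(1, K) else rratio
}

r1 <- default_rratio(K = 3)                  # nothing supplied: all ones
r2 <- default_rratio(c(1, 0.8, 1.2), K = 3)  # supplied values are kept
```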

If use.filtered.scan is TRUE, the scan takes roughly O(n log n) time. If it is FALSE, the scan takes O(n^2) time.
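To get a feel for the difference between the two growth rates, the sketch below tabulates n log n and n^2 for a few probe counts. It ignores constant factors and is not a timing of the package itself.

```r
# Probe counts to compare (illustrative values only).
n <- c(1e3, 1e4, 1e5)

# Operation counts up to constant factors, and how many times larger
# the quadratic cost is than the n log n cost.
cost <- data.frame(
  n         = n,
  n.log.n   = n * log2(n),
  n.squared = n^2,
  ratio     = n / log2(n)
)
print(cost)
```

The ratio column grows with n, which is why the filtered scan is recommended for dense platforms.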

Value

yhat

A list of fitted y values; yhat[[k]] contains the fitted values for y[[k]].

chpts

A list of change-points; these are shared across platforms.

segmat

A two-column array containing the start and end points of each segment.

chpt.hist

A list, where chpt.hist[[k]] records the best segmentation at the k-th split, the next split to be made, and its Z-score.

mbic

The modified BIC computed for the best segmentation containing 1 to MAX.CHPTS change-points.

...

Author(s)

Nancy R. Zhang

References

Zhang, N.R., Senbabaoglu, Y. and Li, J.Z. (2009) Joint Estimation of DNA Copy Number from Multiple Platforms. Under review; the manuscript can be downloaded from http://www-stat.stanford.edu/~nzhang/web_multiplatform/

See Also

mpcbs, plot.crossplatform, merge.pos

Examples

data(mpcbs.example)

# Three platforms are represented in this example data set: Illumina, Affymetrix, and Agilent.
names(mpcbs.example)

# K is the number of platforms.
K=3

# Store the chromosome positions in the list pos,
# and the intensities in the list y:

pos=vector("list",K)
pos[[1]] = mpcbs.example$illu[,1]
pos[[2]] = mpcbs.example$affy[,1]
pos[[3]] = mpcbs.example$agil[,1]

y = vector("list",K)
y[[1]] = mpcbs.example$illu[,2]
y[[2]] = mpcbs.example$affy[,2]
y[[3]] = mpcbs.example$agil[,2]

# Names of the platforms:
platform.names=c("Illumina","Affymetrix","Agilent")

# Get the anchor set.
anchor = merge.pos(pos)

# Perform the segmentation.
seg<-mpcbs.mbic(y,pos,anchor, MAX.CHPTS=10, platform.names=platform.names,plots=TRUE)


# Plot the data and segmentation, giving it yhat and change-points.
plot.crossplatform(pos,y, yhat=seg$yhat, chpts=seg$chpts, anchor=anchor, platform.names=platform.names, col="darkgray")