SBL: Sparse Bayesian Learning (SBL) Segmentation Algorithm


Description

Fits a sparse Bayesian learning (SBL) model to a single-sample 'setupGADA' object.

Usage

SBL(x, sigma2, aAlpha = 0.2, estim.sigma2 = FALSE, maxit = 10000, tol = 1e-08, verbose = FALSE, saveInfo = TRUE)

Arguments

x

an object of class 'setupGADA' prepared using 'setupGADAgeneral', 'setupGADAaffy', or 'setupGADAIllumina'

sigma2

the array noise level. See details

aAlpha

sparseness hyperparameter. See details

estim.sigma2

should 'sigma2' be estimated from the data? The default is FALSE

maxit

maximum number of iterations in the SBL algorithm

tol

tolerance criterion used to stop the SBL algorithm

verbose

if TRUE, prints verbose information, which is useful for debugging errors. The default is FALSE

saveInfo

TRUE if the annotation data is transferred to the returned SBL object. The default is TRUE

Details

This function fits an SBL model to a single DNA array observation. The underlying copy number is assumed to be piecewise constant (PWC) with a sparse number of breakpoints, but the observed array is degraded by noise.
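
The signal model described above can be illustrated with a small simulation. This is only a sketch: the probe count, breakpoint positions, segment levels, and noise variance are invented for illustration and do not come from the gada package.

```r
# Simulate a piecewise constant (PWC) log2-ratio profile degraded by noise.
# All numeric values here are illustrative assumptions.
set.seed(1)
n_probes    <- 500
breakpoints <- c(150, 220, 400)        # index of the last probe in segments 1 to 3
levels      <- c(0, 0.6, 0, -0.8)      # mean log2 ratio of each of the 4 segments
seg_lengths <- diff(c(0, breakpoints, n_probes))
pwc         <- rep(levels, times = seg_lengths)

# Additive Gaussian noise with variance sigma2 (the "array noise level").
sigma2   <- 0.04
observed <- pwc + rnorm(n_probes, mean = 0, sd = sqrt(sigma2))

# Sparseness: only a handful of successive differences of the PWC vector
# are nonzero, one per breakpoint.
n_jumps <- sum(diff(pwc) != 0)
```

A segmentation method then has to recover 'pwc' (and hence the few breakpoints) from 'observed' alone.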

The array noise level can be provided through the 'sigma2' parameter. If it is unknown, use 'estim.sigma2=TRUE' to estimate the hybridization noise level from the data.
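
To give a feel for what a noise level of this kind looks like, the sketch below computes a rough robust estimate from first differences of a simulated profile. This is a generic heuristic chosen for illustration, and it is not claimed to be the estimator that 'estim.sigma2=TRUE' uses inside 'SBL'. The simulated data and the variable names are assumptions.

```r
# Rough, generic noise-variance estimate for a PWC signal plus Gaussian noise.
# NOT the gada estimator; an illustrative heuristic only.
set.seed(42)
true_sigma2 <- 0.04
pwc <- rep(c(0, 0.5, 0), times = c(2000, 1000, 2000))
y   <- pwc + rnorm(length(pwc), sd = sqrt(true_sigma2))

# First differences cancel the PWC level everywhere except at breakpoints,
# leaving noise with variance 2 * sigma2. mad() is robust to the few large
# jumps at breakpoints and, with its default constant, estimates the
# standard deviation under normality.
d          <- diff(y)
sigma2_hat <- mad(d)^2 / 2
```

A value obtained this way could, in principle, be passed as 'sigma2', but letting 'SBL' estimate it is the documented route.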

The SBL model places a hierarchical Bayesian prior over the breakpoints that delimit the segments of the piecewise constant (PWC) copy number vector. This prior is uninformative about the magnitude and position of the breakpoints but enforces sparseness (i.e., it assumes that only very few breakpoints are true positives). The hyperparameter 'aAlpha' controls the sparseness level. Instead of adjusting 'aAlpha', we recommend using a high-sensitivity value (e.g., 'aAlpha=0.2', the default) and adjusting the False Discovery Rate (FDR) with the 'BackwardElimination' procedure. Increasing 'aAlpha' above 0.2 gives a faster result, which may be useful for very high density arrays, but some segments that a high-sensitivity setting followed by 'BackwardElimination' would recover may be lost.
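
The recommended two-step workflow might be wrapped as follows. This is a hedged sketch: the helper name 'segment_high_sensitivity' is hypothetical, the 'T' and 'MinSegLen' arguments of 'BackwardElimination' are assumed from the package's own help page for that function, and the default values chosen here are illustrative rather than recommendations.

```r
# Hypothetical helper: high-sensitivity SBL fit followed by backward elimination.
# Requires the 'gada' package and a 'setupGADA' object; nothing is run on load.
segment_high_sensitivity <- function(x, T = 4.5, MinSegLen = 3) {
  if (!requireNamespace("gada", quietly = TRUE)) {
    stop("the 'gada' package is required for this sketch")
  }
  # Step 1: keep aAlpha at its high-sensitivity default and estimate the noise.
  step1 <- gada::SBL(x, estim.sigma2 = TRUE, aAlpha = 0.2)
  # Step 2: prune breakpoints to control the FDR. T and MinSegLen are assumed
  # argument names; check ?BackwardElimination before relying on them.
  gada::BackwardElimination(step1, T = T, MinSegLen = MinSegLen)
}
```

Keeping 'aAlpha' fixed and tuning only the elimination step means the comparatively expensive SBL fit does not have to be repeated when the FDR target changes.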

The SBL model is fitted using an expectation-maximization (EM) algorithm. The 'tol' parameter sets the maximum change in the model parameters below which the algorithm is considered to have converged. The 'maxit' parameter sets the maximum number of iterations. If the algorithm does not converge within 'maxit' iterations, a warning is issued reporting the magnitude of the last change.
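
The roles of 'tol' and 'maxit' can be pictured with a generic fixed-point loop. The sketch below is not the EM implementation in gada: the function 'iterate_until_converged', its 'update' argument, and the use of the largest absolute parameter change as the convergence measure are all assumptions made for illustration.

```r
# Generic iterate-until-converged loop mirroring the tol / maxit stopping rule.
# Illustrative only; not the EM update used by SBL.
iterate_until_converged <- function(update, theta0, tol = 1e-08, maxit = 10000) {
  theta     <- theta0
  change    <- Inf
  converged <- FALSE
  it        <- 0L
  for (it in seq_len(maxit)) {
    theta_new <- update(theta)
    change    <- max(abs(theta_new - theta))  # size of the last parameter change
    theta     <- theta_new
    if (change < tol) {
      converged <- TRUE
      break
    }
  }
  if (!converged) {
    warning(sprintf("no convergence after %d iterations; last change = %g",
                    maxit, change))
  }
  list(theta = theta, iterations = it, change = change, converged = converged)
}

# Toy update with a known fixed point: theta = cos(theta).
fit <- iterate_until_converged(function(t) cos(t), theta0 = 1)
```

Calling the same function with a very small 'maxit' triggers the warning branch, which is the situation the paragraph above describes.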

Value

An object of class 'SBL' to be used with the 'BackwardElimination' algorithm.

'print' returns the number of discontinuities or segments by chromosome and convergence information about the SBL algorithm.

References

Pique-Regi R, Caceres A, Gonzalez JR. "R-Gada: a package for fast detection and visualization of copy number alterations on multiple samples", BMC Bioinformatics, submitted Nov 2009

Pique-Regi R, Monso-Varona J, Ortega A, Seeger RC, Triche TJ, Asgharzadeh S. "Sparse representation and Bayesian detection of the genome copy number alterations from microarray data", Bioinformatics, Feb 2008

See Also

setupGADAIllumina, setupGADAaffy, setupGADAgeneral, BackwardElimination, parSBL, parBE

Examples

## Not run: 
# import data
download.file("http://www.creal.cat/jrgonzalez/GADA/dataIllumina.txt","dataIllumina.txt")

# creating object of class setupGADA
dataIllumina<-setupGADAIllumina(file="dataIllumina.txt", log2ratioCol=5, NumCols=6)

# Segmentation procedure
step1<-SBL(dataIllumina, estim.sigma2=TRUE)

# print
step1


## End(Not run)

gada documentation built on May 2, 2019, 6:10 p.m.