SBL: Sparse Bayesian Learning (SBL) Segmentation Algorithm
In gada: Genome Alteration Detection Algorithm (GADA)

Description Usage Arguments Details Value References See Also Examples

Fits the a sparse Bayesian learning (SBL) model on a single sample setupGADA object

1	SBL(x, sigma2, aAlpha = 0.2, estim.sigma2 = FALSE, maxit = 10000, tol = 1e-08, verbose = FALSE, saveInfo = TRUE)

`x`	an object of class 'setupGADA' prepared using 'setupGADAgeneral', 'setupGADAaffy', or 'setupGADAIllumina'
`sigma2`	the array noise level. See details
`aAlpha`	sparseness hyperparameter. See details
`estim.sigma2`	Should the 'sigma2' be estimated? the default is FALSE
`maxit`	maximum number of iterations in the SBL algorithm
`tol`	tolerance criteria to stop the SBL algorithm
`verbose`	print verbose information, the default is FALSE (useful to debug errors)
`saveInfo`	TRUE if the annotation data is transfered to the retuned SBL object. The default is TRUE

This function fits a SBL model on a single DNA array observation. The underlying copy number is assumed assumed to be piecewise constant (PWC) with a sparse number of breakpoints but the observed array is degraded by noise.

The array noise level can be provided by the 'sigma2' parameter or if it is unkown we can use 'estim.sigma2=TRUE' to estimate the hybridization noise level.

The SBL model places an hierarchical Bayesian prior over the breakpoints delimiting the probes that fall into a copy number a piece-wise constant (PWC) vector. This Bayesian prior is uninformative about the magnitude and position of the breakpoints but enforces sparseness (i.e., assumes that only very few breakpoints are true positives). The hyperparameter 'aAlpha' is used to control the sparseness level. Instead of adjusting we recomment to use a high sensitivity value (e.g. 'aAlpha=0.2', default value) and adjust the False Discovery Rate (FDR) using the 'BackwardElimination' procedure. The 'aAlpha=0.2' can be increased to obtain a faster result which may be interesting in case of very high density arrays, but we may not be able to recover some of the segments that would be otherwise obtained with a high sensitivity setting with the BackwardElimination procedure.

The SBL model is fit using an expectation maximization (EM) algorithm. The 'tol' parameter sets the maximum allowed change on the model parameters to consider that the algorithm has converged. The 'maxit' parameter establishes a maximum number of iterations. If the algorithm does not converges before 'maxit' number of iterations have been computed a warning is returned with the last change in magnitude.

An object of class 'SBL' to be used with the 'BackwardElimination' algorithm.

'print' returns the number of discontinuities or segments by chromosome and convergence information about the SBL algorithm.

Pique-Regi R, Caceres A, Gonzalez JR. "R-Gada: a package for fast detection and visualization of copy number alterations on multiple samples", BMC Bioinformatics , Submitted Nov 2009

Pique-Regi R, Monso-Varona J,Ortega A, Seeger RC, Triche TJ, Asgharzadeh S. "Sparse representation and Bayesian detection of the genome copy number alterations from microarray data", Bioinformatics , Feb 2008

setupGADAIllumina, setupGADAaffy, setupGADAgeneral, BackwardElimination, parSBL, parBE

## Not run: 
# import data
download.file("http://www.creal.cat/jrgonzalez/GADA/dataIllumina.txt","dataIllumina.txt")

# creating object of class setupGADA
dataIllumina<-setupGADAIllumina(file="dataIllumina.txt", log2ratioCol=5, NumCols=6)

# Segmentation procedure
step1<-SBL(dataIllumina, estim.sigma2=TRUE)

# print
step1


## End(Not run)