Description Usage Arguments Details Value Note Author(s) References See Also Examples
This function allows the detection of breakpoints in genomic profiles obtained by array CGH technology and affects a status (gain, normal or lost) to each clone.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | ## S3 method for class 'profileCGH'
daglad(profileCGH, mediancenter=FALSE,
normalrefcenter=FALSE, genomestep=FALSE,
OnlySmoothing = FALSE, OnlyOptimCall = FALSE,
smoothfunc="lawsglad", lkern="Exponential",
model="Gaussian", qlambda=0.999, bandwidth=10,
sigma=NULL, base=FALSE, round=2,
lambdabreak=8, lambdaclusterGen=40, param=c(d=6),
alpha=0.001, msize=2, method="centroid", nmin=1, nmax=8, region.size=2,
amplicon=1, deletion=-5, deltaN=0.10, forceGL=c(-0.15,0.15),
nbsigma=3, MinBkpWeight=0.35, DelBkpInAmp=TRUE, DelBkpInDel=TRUE,
CheckBkpPos=TRUE, assignGNLOut=TRUE,
breaksFdrQ = 0.0001, haarStartLevel = 1,
haarEndLevel = 5, weights.name = NULL,
verbose=FALSE, ...)
|
profileCGH |
Object of class |
mediancenter |
If |
genomestep |
If |
normalrefcenter |
If |
OnlySmoothing |
If |
OnlyOptimCall |
If |
smoothfunc |
Type of algorithm used to smooth |
lkern |
lkern determines the location kernel to be used
(see |
model |
model determines the distribution type of LogRatio
(see |
qlambda |
qlambda determines the scale parameter qlambda for the
stochastic penalty (see |
base |
If TRUE, the position of clone is the physical position onto the chromosome, otherwise the rank position is used. |
sigma |
Value to be passed to either argument |
bandwidth |
Set the maximal bandwidth |
round |
The smoothing results of either |
lambdabreak |
Penalty term (λ') used during the "Optimization of the number of breakpoints" step. |
lambdaclusterGen |
Penalty term (λ*) used during the "clustering throughout the genome" step. |
param |
Parameter of kernel used in the penalty term. |
alpha |
Risk alpha used for the "Outlier detection" step. |
msize |
The outliers MAD are calculated on regions with a cardinality greater or equal to msize. |
method |
The agglomeration method to be used during the "clustering throughout the genome" steps. |
nmin |
Minimum number of clusters (N*max) allowed during the "clustering throughout the genome" clustering step. |
nmax |
Maximum number of clusters (N*max) allowed during the "clustering throughout the genome" clustering step. |
region.size |
The breakpoints which define regions with a number of probes lower or equal to this value are discared. |
amplicon |
Level (and outliers) with a smoothing value (log-ratio value) greater than this threshold are consider as amplicon. Note that first, the data are centered on the normal reference value computed during the "clustering throughout the genome" step. |
deletion |
Level (and outliers) with a smoothing value (log-ratio value) lower than this threshold are consider as deletion. Note that first, the data are centered on the normal reference value computed during the "clustering throughout the genome" step. |
deltaN |
Region with smoothing values in between the interval [-deltaN,+deltaN] are supposed to be normal. |
forceGL |
Level with smoothing value greater (lower) than
|
nbsigma |
For each breakpoints, a weight is calculated which is a function of absolute value of the Gap between the smoothing values of the two consecutive regions. Weight = 1- kernelpen(abs(Gap),param=c(d=nbsigma*Sigma)) where Sigma is the standard deviation of the LogRatio. |
MinBkpWeight |
Breakpoints which |
DelBkpInAmp |
If TRUE, the breakpoints identified inside amplicon regions are deleted. For amplicon, the log-ratio values are highly variable which lead to identification of false positive breakpoints. |
DelBkpInDel |
If TRUE, the breakpoints identified inside deletion regions are deleted. For deletion, the log-ratio values are highly variable which lead to identification of false positive breakpoints. |
CheckBkpPos |
If |
assignGNLOut |
If |
breaksFdrQ |
breaksFdrQ for HaarSeg algorithm. |
haarStartLevel |
haarStartLevel for HaarSeg algorithm. |
haarEndLevel |
haarEndLevel for HaarSeg algorithm. |
weights.name |
The name of the fields which contains the weights used for the haarseg algorithm. By default, the value is set to NULL meaning that all the observations have the same weights. If provided, the field must contain positive values. |
verbose |
If |
... |
... |
The function daglad
implements a slightly modified
version of the methodology described in the article : Analysis of array CGH data: from signal
ratio to gain and loss of DNA regions (Hupé et al., Bioinformatics,
2004). For smoothing, it is possible
to use either the AWS algorithm (Polzehl and Spokoiny, 2002) or the HaarSeg algorithm (Ben-Yaacov and Eldar, Bioinformatics, 2008).
The daglad
function allows to choose some threshold to help the algorithm to identify the status of the genomic regions. The threshodls are given in the following parameters:
deltaN
forceGL
deletion
amplicon
|
An object of class "profileCGH" with the following attributes: |
profileValues |
is a data.frame with the following information:
|
BkpInfo |
is a data.frame sum up the information for each breakpoint:
|
NormalRef |
If |
People interested in tools dealing with array CGH analysis can visit our web-page http://bioinfo.curie.fr.
Philippe Hupé, glad@curie.fr.
Hupé et al. (Bioinformatics, 2004): Analysis of array CGH data: from signal ratio to gain and loss of DNA regions.
Polzehl and Spokoiny (WIAS-Preprint 787, 2002): Local likelihood modelling by adaptive weights smoothing.
Ben-Yaacov and Eldar (Bioinformatics, 2008): A fast and flexible method for the segmentation of aCGH data.
glad
.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 | data(snijders)
gm13330$Clone <- gm13330$BAC
profileCGH <- as.profileCGH(gm13330)
###########################################################
###
### daglad function
###
###########################################################
res <- daglad(profileCGH, mediancenter=FALSE, normalrefcenter=FALSE, genomestep=FALSE,
smoothfunc="lawsglad", lkern="Exponential", model="Gaussian",
qlambda=0.999, bandwidth=10, base=FALSE, round=1.5,
lambdabreak=8, lambdaclusterGen=40, param=c(d=6), alpha=0.001, msize=2,
method="centroid", nmin=1, nmax=8,
amplicon=1, deletion=-5, deltaN=0.10, forceGL=c(-0.15,0.15), nbsigma=3,
MinBkpWeight=0.35, CheckBkpPos=TRUE)
### data for cytoband
data(cytoband)
### Genomic profile on the whole genome
plotProfile(res, unit=3, Bkp=TRUE, labels=FALSE, Smoothing="Smoothing",
main="Breakpoints detection: DAGLAD analysis", cytoband = cytoband)
###Genomic profile for chromosome 1
plotProfile(res, unit=3, Bkp=TRUE, labels=TRUE, Chromosome=1,
Smoothing="Smoothing", main="Chromosome 1: DAGLAD analysis", cytoband = cytoband)
### The standard-deviation of LogRatio are:
res$SigmaC
### The list of breakpoints is:
res$BkpInfo
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.