CGHcall: Calling aberrations for array CGH tumor profiles.



Description

Calls aberrations for array CGH data using a six-state mixture model.

Usage

CGHcall(inputSegmented, prior = "auto", nclass = 5, organism = "human", cellularity = 1, robustsig = "yes", nsegfit = 3000, maxnumseg = 100, minlsforfit = 0.5, build = "GRCh37", ncpus = 1)

Arguments

inputSegmented

An object of class cghSeg

prior

Options are all, not all, or auto. See details for more information.

nclass

The number of levels to be used for calling. Either 3 (loss, normal, gain), 4 (including amplifications), or 5 (including double deletions).

organism

Either human or other. This is only used for chromosome arm information when prior is set to all or auto (and samplesize > 20).

cellularity

A vector of cellularities ranging from 0 to 1 to define the contamination of your sample with healthy cells (1 = no contamination). See details for more information.

robustsig

Options are yes or no. yes enforces a lower bound on the standard deviation of the normal segments.

nsegfit

Maximum number of segments used for fitting the mixture model. Posterior probabilities are still computed for all segments.

maxnumseg

Maximum number of segments per profile used for fitting the model

minlsforfit

Minimum length of the segment (in Mb) to be used for fitting the model

build

Build of the human genome. Either GRCh37, GRCh36, GRCh35 or GRCh34.

ncpus

Number of CPUs used for parallel calling. Has a large effect on computing time. Setting ncpus larger than 1 requires the package snowfall.
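The value ranges listed above can be summarised as a small pre-flight check. This is only a sketch in base R: check_cghcall_args is a hypothetical helper that is not part of the CGHcall package, and CGHcall itself may validate its arguments differently or not at all.

```r
# Hypothetical helper (not part of CGHcall): checks argument values against
# the options listed in the Arguments section.
check_cghcall_args <- function(prior = "auto", nclass = 5, organism = "human",
                               cellularity = 1, robustsig = "yes",
                               build = "GRCh37", ncpus = 1) {
  stopifnot(
    prior %in% c("all", "not all", "auto"),
    nclass %in% c(3, 4, 5),
    organism %in% c("human", "other"),
    is.numeric(cellularity),
    all(cellularity >= 0 & cellularity <= 1),
    robustsig %in% c("yes", "no"),
    build %in% c("GRCh37", "GRCh36", "GRCh35", "GRCh34"),
    ncpus >= 1
  )
  invisible(TRUE)
}

# Three samples, each assumed to contain 75% tumor cells
check_cghcall_args(nclass = 4, cellularity = rep(0.75, 3))
```

Running such a check before a long calling job may catch a mistyped option early; the defaults mirror those shown under Usage.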

Details

Please read the article and the supplementary information for detailed information on the algorithm.

The parameter prior states how the data are used to determine the prior probabilities. When set to all, the probabilities are determined using the entire genome of each sample. When set to not all, probabilities are determined per chromosome for each sample when organism is set to other, or per chromosome arm when organism is human. The chromosome arm information is taken from the March 2006 version of the UCSC database. When prior is set to auto, the way probabilities are determined depends on the sample size: the entire genome is used when the sample size is smaller than 20, otherwise chromosome (arm) information is used.

Please note that CGHcall uses information from all input data to determine the aberration probabilities. When, for example, triploid or tetraploid tumors are observed, we advise running CGHcall separately on those (groups of) samples.

Note that robustsig = yes enforces the sd corresponding to the normal segments to be at least half the pooled gain/loss sd. Use of nsegfit significantly lowers computing time with respect to previous CGHcall versions without much loss of accuracy. Moreover, maxnumseg decreases the impact on the results of profiles with inferior segmentation results. Finally, minlsforfit decreases the impact of very small aberrations (potentially CNVs rather than CNAs) on the fit of the model. Note that a result is always produced for all segments.

IN MOST CASES, CGHcall SHOULD BE FOLLOWED BY THE FUNCTION ExpandCGHcall.
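The rule for prior and the lower bound applied by robustsig can be written out as a minimal sketch in base R. Both helpers, prior_scope and robust_normal_sd, are hypothetical names introduced for illustration; they restate the description above and are not taken from the package source.

```r
# Hypothetical helper: which part of the data is used to estimate the
# prior probabilities, following the description of `prior` above.
prior_scope <- function(prior, organism, n_samples) {
  if (prior == "auto") {
    # Whole genome when the sample size is smaller than 20
    prior <- if (n_samples < 20) "all" else "not all"
  }
  if (prior == "all") {
    "genome"
  } else if (organism == "human") {
    "chromosome arm"
  } else {
    "chromosome"
  }
}

# Hypothetical helper: with robustsig = "yes", the sd of the normal segments
# is at least half the pooled gain/loss sd.
robust_normal_sd <- function(sd_normal, sd_pooled_gain_loss) {
  max(sd_normal, 0.5 * sd_pooled_gain_loss)
}

prior_scope("auto", "human", n_samples = 5)    # "genome"
prior_scope("auto", "other", n_samples = 40)   # "chromosome"
robust_normal_sd(0.05, 0.20)                   # 0.1
```

The behaviour at a sample size of exactly 20 is an assumption here: the sketch follows the "smaller than 20" wording above.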

Value

This function returns a list with six components:

posteriorfin2

Matrix containing call probabilities for each segment. First column denotes profile number, followed by k columns with aberration probabilities for each sample, where k is the number of levels used for calling (nclass).

nclone

Number of clones or probes

nc

Number of samples

nclass

Number of classes used

regionsprof

Matrix containing information about the segments, with 4 columns: profile, start probe, end probe, segmented value

params

Vector containing the parameter values of the mixture model
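As an illustration of how the posteriorfin2 and regionsprof components might be read together, the sketch below builds a mock list with the documented column layout for nclass = 3 and derives the most probable class per segment. All values and column names are invented, the remaining components (nclone, nc, params) are left out, and the sketch assumes that the rows of the two matrices refer to the same segments in the same order.

```r
# Mock object following the documented layout; all numbers are invented.
result <- list(
  posteriorfin2 = cbind(
    profile = c(1, 1, 2),
    loss    = c(0.90, 0.05, 0.10),
    normal  = c(0.08, 0.15, 0.85),
    gain    = c(0.02, 0.80, 0.05)
  ),
  regionsprof = cbind(
    profile   = c(1, 1, 2),
    start     = c(1, 51, 1),
    end       = c(50, 100, 100),
    segmented = c(-0.60, 0.45, 0.02)
  ),
  nclass = 3
)

# Drop the profile column, then pick the class with the highest probability
labels <- c("loss", "normal", "gain")
probs  <- result$posteriorfin2[, -1, drop = FALSE]
calls  <- labels[max.col(probs, ties.method = "first")]

# Combine segment information with the call and the segment length in probes
segments <- data.frame(
  result$regionsprof,
  call     = calls,
  n_probes = result$regionsprof[, "end"] - result$regionsprof[, "start"] + 1
)
print(segments)
```

For real analyses, ExpandCGHcall is the documented way to turn this list into a full call object; the sketch is only meant to make the matrix layout concrete.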

Author(s)

Sjoerd Vosse, Mark van de Wiel, Ilari Scheinin

References

Mark A. van de Wiel, Kyung In Kim, Sjoerd J. Vosse, Wessel N. van Wieringen, Saskia M. Wilting and Bauke Ylstra. CGHcall: calling aberrations for array CGH tumor profiles. Bioinformatics, 23, 892-894.

See Also

ExpandCGHcall

Examples

  data(Wilting)
  ## Convert to cghRaw object
  cgh <- make_cghRaw(Wilting)
  print(cgh)
  ## First preprocess the data
  raw.data <- preprocess(cgh)
  ## Simple global median normalization for samples with 75% tumor cells
  normalized.data <- normalize(raw.data)  
  ## Segmentation with slightly relaxed significance level to accept change-points.
  ## Note that segmentation can take a long time.
  ## Not run: segmented.data <- segmentData(normalized.data, alpha=0.02)
  ## Not run: postsegnormalized.data <- postsegnormalize(segmented.data)
  ## Call aberrations
  perc.tumor <- rep(0.75, 3)
  ## Not run: result <- CGHcall(postsegnormalized.data,cellularity=perc.tumor)
  
  ## Expand to CGHcall object
  ## Not run: result <- ExpandCGHcall(result,postsegnormalized.data)

Example output

Loading required package: impute
Loading required package: DNAcopy
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    anyDuplicated, append, as.data.frame, basename, cbind, colnames,
    dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
    grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
    rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
    union, unique, unsplit, which.max, which.min

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: CGHbase
Loading required package: marray
Loading required package: limma

Attaching package: 'limma'

The following object is masked from 'package:BiocGenerics':

    plotMA

Loading required package: snowfall
Loading required package: snow

Attaching package: 'snow'

The following objects are masked from 'package:BiocGenerics':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, clusterSplit, parApply, parCapply,
    parLapply, parRapply, parSapply

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, clusterSplit, makeCluster, parApply,
    parCapply, parLapply, parRapply, parSapply, splitIndices,
    stopCluster


Attaching package: 'CGHcall'

The following object is masked from 'package:BiocGenerics':

    normalize

cghRaw (storageMode: lockedEnvironment)
assayData: 4127 features, 5 samples 
  element names: copynumber 
protocolData: none
phenoData: none
featureData
  featureNames: CTB-14E10 RP11-465B22 ... CTB-99K24 (4127 total)
  fvarLabels: Chromosome Start End
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation:  
Changing impute.knn parameter k from 10 to 4 due to small sample size.
Applying median normalization ... 
Smoothing outliers ... 
Warning message:
In DNAcopy::CNA(copynumber(input), chromosomes(input), bpstart(input),  :
  array has repeated maploc positions

CGHcall documentation built on Nov. 8, 2020, 11:12 p.m.