squaremodel: Calculate potential fits for a single sample using ploidy as...
In ACE: Absolute Copy Number Estimation from Low-coverage Whole Genome Sequencing


squaremodel performs a "two-dimensional" fitting algorithm on a single sample. It calculates the error of the fit at each cellularity over a range of "ploidies". Input can be either a template or a QDNAseq-object with the index of the sample specified. Returns a list with input parameters (method, penalty, and penploidy) and model characteristics (an error matrix, a logical matrix specifying minima, a data frame with all information, a data frame with only minima, and a graphical representation of the error matrix).


squaremodel(template, QDNAseqobjectsample = FALSE, prows=100, 
  ptop=5, pbottom=1, method = 'RMSE', exclude = c(), 
  penalty = 0, penploidy = 0, highlightminima = TRUE, standard)

`template`	Object. Either a data frame as created by `objectsampletotemplate`, or a QDNAseq-object
`QDNAseqobjectsample`	Integer. Specifies which sample to analyze from the QDNAseq-object. Required when using a QDNAseq-object as template. Default = FALSE
`prows`	Integer. Sets the resolution of the ploidy-axis. Determines how many decrements are used to get from ptop to pbottom (see below). The number of rows is therefore prows + 1. Default = 100
`ptop`	Numeric. Sets the highest ploidy at which to start testing fits. Default = 5
`pbottom`	Numeric. Sets the lowest ploidy to be tested. Default = 1
`method`	Character string specifying which error method to use. For more documentation, consult the vignette. Can be "RMSE", "SMRE", or "MAE". Default = "RMSE"
`exclude`	Integer or character vector. Specifies which chromosomes to exclude for model fitting
`penalty`	Numeric. Penalizes fits at lower cellularities. Suggested values between 0 and 1. Default = 0 (no penalty)
`penploidy`	Numeric. Penalizes fits that diverge from 2N with the formula (1+abs(ploidy-2))^penploidy. Default = 0
`highlightminima`	Logical. Minima are highlighted in the matrixplot by a black dot. Default = TRUE
`standard`	Numeric. Force the ploidy to represent this raw value. When omitted, the standard will be calculated from the data
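
The relation between `ptop`, `pbottom`, and `prows` can be illustrated with a minimal sketch. It assumes evenly sized decrements; the variable names `step` and `ploidies` are hypothetical and not taken from the ACE source.

```r
# Hypothetical reconstruction of the ploidy axis, assuming equal decrements.
ptop <- 5
pbottom <- 1
prows <- 100

step <- (ptop - pbottom) / prows       # size of one decrement
ploidies <- ptop - (0:prows) * step    # ptop, ptop - step, ..., down to pbottom

length(ploidies)   # prows + 1 values, one per row of the error matrix
```

With prows decrements there are prows + 1 tested ploidies, because both end points are included.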

Unlike other ACE functions, squaremodel does not use the "standard"; instead, it fits all tested ploidies to "standard = 1". Segment values must therefore be normalized to 1 (which they are by default when coming from QDNAseq). The penalty parameter is the same as in singlemodel. Additionally, fits at ploidies diverging from 2N can be penalized with the penploidy parameter. For other details on the fitting algorithm, see singlemodel. The range of ploidies is set by the parameters ptop and pbottom, and the resolution is determined by prows. To create good contrast in the matrixplot, the color scale derives from the inverse of the error, and the opacity of the dots marking the minima is calculated as min(error)/error.
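
The two formulas mentioned here can be tried out in isolation. The sketch below only evaluates the expressions as documented; the input values are invented, and how squaremodel combines the penploidy factor with the error internally is not shown.

```r
# penploidy factor, as given in the argument description:
# (1 + abs(ploidy - 2))^penploidy
ploidy <- c(1, 2, 3, 4)
penploidy <- 0.5
ploidyfactor <- (1 + abs(ploidy - 2))^penploidy
ploidyfactor   # equals 1 at ploidy 2 and grows with distance from 2N

# Opacity of the dots marking minima: min(error)/error.
# These error values are made up for illustration.
error <- c(0.02, 0.05, 0.10)
opacity <- min(error) / error
opacity        # the lowest error gets opacity 1, larger errors fade
```

With penploidy = 0 the factor is 1 for every ploidy, which matches the default of no ploidy penalty.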

Returns a list, containing

`method`	Applied error method
`penalty`	Applied penalty factor for low cellularities
`penploidy`	Applied penalty factor for diverging ploidies
`errormatrix`	Numeric matrix with errors of all combinations of ploidy and cellularity
`minimatrix`	Logical matrix indicating whether the combination of ploidy and cellularity represents a minimum
`errordf`	Data frame with columns ploidy, cellularity, error, and minimum
`minimadf`	Same as errordf, but only containing minima and sorted by error
`matrixplot`	ggplot2-graph of the relative errors calculated at each combination of ploidy and cellularity
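
As an illustration of how `minimadf` relates to `errordf`, the sketch below applies the documented description (only minima, sorted by error) to a mock data frame. The values are invented, and this is not the code squaremodel itself uses.

```r
# Mock errordf with the documented columns; all values are invented.
errordf <- data.frame(
  ploidy      = c(2, 2, 3, 3),
  cellularity = c(0.4, 0.8, 0.4, 0.8),
  error       = c(0.05, 0.02, 0.07, 0.03),
  minimum     = c(FALSE, TRUE, FALSE, TRUE)
)

# Keep only the rows flagged as minima, then sort by error (lowest first).
minimadf <- errordf[errordf$minimum, ]
minimadf <- minimadf[order(minimadf$error), ]
minimadf
```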

squaremodel() only needs a data frame with columns named chr and segments. Every row should contain an individual genomic feature, i.e. a bin or a probe. If you have data with each row representing a segment, and the size of the segment given in a column (e.g. NumBins or NumProbes), you can create the data frame as follows (giving the correct variable names of course):

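# repeat each segment's chromosome and mean once per probe in that segment
chr <- rep(Chromosome, NumProbes)
segments <- rep(SegmentMean, NumProbes)
# data.frame() keeps chr and segments as separate, correctly typed columns
template <- data.frame(chr = chr, segments = segments)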
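
A worked toy example of this expansion, using an invented segment-level table (`segs` and its values are hypothetical). The result is assembled with `data.frame()` so that it is a data frame with the required column names.

```r
# Invented segment-level table: one row per segment.
segs <- data.frame(
  Chromosome  = c(1, 1, 2),
  SegmentMean = c(1.0, 0.8, 1.2),
  NumProbes   = c(3, 2, 4)
)

# Expand to one row per probe by repeating each segment NumProbes times.
chr <- rep(segs$Chromosome, segs$NumProbes)
segments <- rep(segs$SegmentMean, segs$NumProbes)
template <- data.frame(chr = chr, segments = segments)

nrow(template)   # equals sum(segs$NumProbes)
```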

Jos B. Poell

objectsampletotemplate, squaremodel, singleplot

## toy data assuming each chromosome comprises 100 bins
s <- jitter(c(1, 1, 0.8, 1.2, rep(1, 5), 1.4, rep(1, 13)), amount = 0)
n <- c(100, 100, 40, 60, rep(100, 5), 100, rep(100, 13))
df <- data.frame(chr = rep(1:22, each = 100), segments = rep(s, n))
squaremodel(df)$matrixplot
sm <- squaremodel(df, method = 'MAE', penalty = 0.5, penploidy = 0.5)
sm$matrixplot
mdf <- sm$minimadf
head(mdf[order(mdf$error,-mdf$cellularity),])

## using segmented data from a QDNAseq-object
data("copyNumbersSegmented")
sqm <- squaremodel(copyNumbersSegmented, QDNAseqobjectsample = 2, 
  penalty = 0.5, penploidy = 0.5, 
  ptop = 4.3, pbottom = 1.8, prows = 250)
sqm$matrixplot
mdf <- sqm$minimadf
head(mdf[order(mdf$error,-mdf$cellularity),])