runACE: Absolute Copy number Estimation


runACE    R Documentation

Absolute Copy number Estimation

Description

ACE scales copy number data to fit integer copy numbers, providing an estimate of the tumor cell percentage in the process. runACE uses segmented data from the QDNAseq package. A folder with either bam-files (aligned sequencing data) or rds-files of QDNAseq-objects can be used as input. Model fitting and production of all output files (except "parameters.tsv") are handled by ploidyplotloop, which processes one QDNAseq-object at a time.
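To make the idea of "fitting integer copy numbers" concrete, here is a minimal sketch in base R. It is not taken from the ACE source. It assumes a common mixture model in which a sample is a blend of tumor cells (fraction `cellularity`, with an integer copy number per segment) and normal cells carrying two copies. The function name `expected_ratio` and its arguments are hypothetical.

```r
# Hypothetical helper, not part of ACE: expected relative segment value
# under an assumed tumor/normal mixture model.
#   copies      - integer copy number of the segment in tumor cells
#   cellularity - fraction of tumor cells in the sample (0 to 1)
#   ploidy      - general ploidy (N) of the tumor cells
# Normal cells are assumed to carry 2 copies everywhere.
expected_ratio <- function(copies, cellularity, ploidy = 2) {
  (cellularity * copies + (1 - cellularity) * 2) /
    (cellularity * ploidy + (1 - cellularity) * 2)
}

# A pure tumor sample separates the copy number levels most clearly ...
pure  <- expected_ratio(copies = 1:3, cellularity = 1)
# ... while admixed normal cells pull every level towards 1.
mixed <- expected_ratio(copies = 1:3, cellularity = 0.5)
print(pure)
print(mixed)
```

Under this assumed model, the spacing between adjacent copy number levels shrinks as cellularity drops, which is why the observed spacing carries information about tumor cell percentage.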

Usage

runACE(inputdir = "./", outputdir, filetype = 'rds', genome = 'hg19',
  binsizes, ploidies = 2, imagetype = 'pdf', method = 'RMSE', penalty = 0, 
  cap = 12, bottom = 0, trncname = FALSE, printsummaries = TRUE, 
  savereadcounts = FALSE, autopick = FALSE)
  
ploidyplotloop(copyNumbersSegmented, currentdir, ploidies = 2, 
  imagetype = 'pdf', method = 'RMSE', penalty = 0, cap = 12, 
  bottom = 0, trncname = FALSE, printsummaries = TRUE, 
  autopick = FALSE)

Arguments

inputdir

Character string specifying the directory containing the files you want analyzed. Note: ACE will analyze ALL rds-files or bam-files in the given directory. Default = "./"

copyNumbersSegmented

QDNAseq-object with segmented data

outputdir, currentdir

Character string specifying the directory to which ACE should write the output. When missing, ACE will try to write to inputdir. For ploidyplotloop, currentdir is required.

filetype

Character string specifying the file type of your input, either "bam" or "rds". Default = "rds"

genome

Character string specifying genome and version. Availability depends on QDNAseq. Default = "hg19"

binsizes

Numeric vector, specifying which binsizes (in kbp) to analyze. Possible values are 1, 5, 10, 15, 30, 50, 100, 500, and 1000. When omitted, defaults to c(100,500,1000)

ploidies

Numeric vector, specifying which ploidies (N) to analyze. Use positive integers. Default = 2

imagetype

Character string specifying the graphics device (image type) used for the plots. Default = "pdf"

method

Character string specifying what error method to use. See also section "Error methods". Default = "RMSE"

penalty

Numeric value. Penalizes fits at lower cellularities. Suggested values between 0 and 1. Default = 0 (no penalty)

cap, bottom

Integer. Sets the upper (cap) and lower (bottom) limits of the y-axis in the output copy number plots. Bins and segments above the cap or below the bottom are represented by a special mark. Default = 12 and 0, respectively

trncname

Logical or character string. Convenience functionality. If all your sample names share a certain suffix, you can use this argument to strip it and be left with the actual sample name. When TRUE, the regular expression is "_.*", which removes everything from the first underscore onward. Instead of a logical, you can supply a character string containing a regular expression of your choice. You can test whether it will work with the gsub function, since this is what ACE uses to truncate names. Default = FALSE

printsummaries

Logical, or the number 2. If you do not want the big summary files, set this argument to FALSE. If you still want the summary files containing only error plots, set it to 2. Default = TRUE

savereadcounts

Logical. When set to TRUE, readCounts-objects will be saved. Default = FALSE

autopick

Logical. When set to TRUE, ACE will fill in the cellularity of the best fit in the column likely_fit of the fitpicker file(s). Default = FALSE
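As the description of trncname suggests, a truncation pattern can be tried out with gsub before running ACE. The sketch below uses made-up sample names. It assumes the match is replaced by an empty string, which is what "truncate" implies but is not spelled out above.

```r
# Made-up sample names for illustration only
samplenames <- c("sampleA_S1_L001", "sampleB_S2_L001", "sampleC")

# trncname = TRUE corresponds to the pattern "_.*":
# everything from the first underscore onward is removed
default_trunc <- gsub("_.*", "", samplenames)
print(default_trunc)

# A custom pattern, e.g. to remove only a trailing lane suffix
custom_trunc <- gsub("_L[0-9]+$", "", samplenames)
print(custom_trunc)
```

Names that do not match the pattern (here "sampleC") are left unchanged.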

Details

Since this is the core functionality of ACE, extensive documentation is available in the vignette.
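For orientation only, the following base R sketch illustrates what an RMSE error and a low-cellularity penalty could look like. The RMSE definition is the standard one. The penalty formula (dividing the error by cellularity raised to the power of the penalty) is an assumption made for illustration; the vignette is the authoritative source for what ACE actually computes. The function names `rmse` and `penalized_error` are hypothetical.

```r
# Standard root mean squared error between observed and expected values
rmse <- function(observed, expected) {
  sqrt(mean((observed - expected)^2))
}

# ASSUMPTION: one plausible way to penalize fits at low cellularity.
# With penalty = 0 the error is unchanged; with penalty > 0 the error
# grows as cellularity gets smaller.
penalized_error <- function(error, cellularity, penalty = 0) {
  error / cellularity^penalty
}

err <- rmse(observed = c(1, 2, 3), expected = c(1, 2, 5))
no_penalty   <- penalized_error(0.1, cellularity = 0.25, penalty = 0)
with_penalty <- penalized_error(0.1, cellularity = 0.25, penalty = 0.5)
print(c(err, no_penalty, with_penalty))
```

In this sketch a penalty of 0.5 doubles the error of a fit at 25% cellularity, while a fit at 100% cellularity is unaffected.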

Value

runACE and ploidyplotloop do not return any values; they write all their output to the indicated location. The output comprises:

  • the file "parameters.tsv" which simply reports the used parameters

  • rds-files (only in case you had bam-files as input)

  • for each ploidy a "fitpicker.tsv" file which can be used for selecting the most likely fits

  • a summary file of likely fits and error plot of each sample (if printsummaries is set to TRUE)

  • a summary file of all error plots (if printsummaries is set to TRUE or 2)

  • a directory with copy number plots of the likely fits of all samples

  • a directory for each sample, containing the error plot, a summary file with all fits of the sample, and individual copy number plots of all fits in a subdirectory

Note

You can use the example data for testing: see Examples

Author(s)

Jos B. Poell

See Also

singlemodel, squaremodel, singleplot

Examples

## Not run: 
  runACE("./bam/", outputdir = "./results", penalty = 0.5, 
    binsizes = c(100, 1000), imagetype = 'png')
  data("copyNumbersSegmented")
  ploidyplotloop(copyNumbersSegmented, ".", ploidies = c(2,3))
  
## End(Not run)

tgac-vumc/ACE documentation built on Nov. 29, 2022, 12:15 a.m.