init: Creates a CNAqc object.

View source: R/init.R

init R Documentation

Creates a CNAqc object.

Description

Creates a CNAqc object from a set of mutations (SNVs or indels), allele-specific copy numbers and a tumour purity value. The resulting object retains the input mutations that map on top of the copy number segments, and allows for the computation of the QC metrics available in the CNAqc package.

Genomic coordinates in relative (per-chromosome) format are transformed into absolute coordinates by means of a reference genome providing the length of each chromosome. CNAqc supports 'hg19'/'GRCh37' and 'hg38'/'GRCh38' references, which are embedded into the package as 'CNAqc::chr_coordinates_hg19' and 'CNAqc::chr_coordinates_GRCh38'. An arbitrary reference can also be provided if it is stored in an equivalent format.
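The relative-to-absolute conversion described above can be sketched in base R as follows. This is only an illustration of the idea, not the package's implementation: the chromosome lengths are toy values, and the names 'chr_lengths', 'offsets' and 'to_absolute' are hypothetical.

```r
# Toy reference: chromosome lengths in base pairs (made-up values, not hg19/GRCh38)
chr_lengths <- c(chr1 = 1000, chr2 = 800, chr3 = 600)

# Offset of each chromosome = total length of all chromosomes that precede it
offsets <- c(0, cumsum(chr_lengths)[-length(chr_lengths)])
names(offsets) <- names(chr_lengths)

# Hypothetical helper: shift per-chromosome positions onto one genome-wide axis
to_absolute <- function(chr, pos) {
  unname(offsets[chr]) + pos
}

# The same relative position on three chromosomes maps to three absolute positions
abs_pos <- to_absolute(c("chr1", "chr2", "chr3"), c(10, 10, 10))
print(abs_pos)
```

With this layout, a position on the first chromosome is unchanged, while positions on later chromosomes are shifted by the cumulative length of the chromosomes before them.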

Usage

init(mutations, snvs = NULL, cna, purity, sample = "MySample", ref = "GRCh38")

Arguments

mutations

A dataframe of mutations with the following fields:

* 'chr' chromosome name, e.g., "chr3", "chr8", "chrX", ...;
* 'from' where the mutation starts, an integer number;
* 'to' where the mutation ends, an integer number;
* 'ref' reference allele, e.g., "A", "ACC", "AGA", ...;
* 'alt' alternative allele, e.g., "A", "ACC", "AGA", ...;
* 'DP' sequencing depth at the locus, an integer number;
* 'NV' number of reads with the variant at the locus, an integer number;
* 'VAF' variant allele frequency (VAF), defined as 'NV/DP', at the locus, a real number in [0,1].

Optionally, driver mutations can be annotated. In this case the input dataframe needs to report:

* 'is_driver' a boolean flag for the driver status;
* 'driver_label' the driver label that will appear in each plot, e.g., 'BRAF V600E'.
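As a rough illustration of the layout described above, a mutations dataframe could be assembled in base R as shown here. All values are invented for the example, 'TOY_DRIVER' is a placeholder label, and using 'NA' as the label of non-driver rows is an assumption rather than something this page specifies.

```r
# Toy mutation calls (illustrative values only, not real data)
mutations <- data.frame(
  chr  = c("chr3", "chr8", "chrX"),
  from = c(1000L, 2500L, 400L),
  to   = c(1001L, 2503L, 401L),
  ref  = c("A", "ACC", "G"),
  alt  = c("T", "A", "C"),
  DP   = c(100L, 80L, 50L),   # sequencing depth at the locus
  NV   = c(25L, 40L, 10L),    # reads carrying the variant
  stringsAsFactors = FALSE
)

# VAF is defined as NV / DP, so it can be derived from the read counts
mutations$VAF <- mutations$NV / mutations$DP

# Optional driver annotation (NA for non-drivers is an assumption)
mutations$is_driver    <- c(FALSE, TRUE, FALSE)
mutations$driver_label <- c(NA, "TOY_DRIVER", NA)

print(mutations)
```

Deriving 'VAF' from 'NV' and 'DP' keeps the three columns consistent with the definition given for the 'VAF' field.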

snvs

Deprecated parameter.

cna

A dataframe of allele-specific copy number with the following fields:

* 'chr' chromosome name, e.g., "chr3", "chr8", "chrX", ...;
* 'from' where the segment starts, an integer number;
* 'to' where the segment ends, an integer number;
* 'Major' for the number of copies of the major allele (or A-allele), an integer number;
* 'minor' for the number of copies of the minor allele (or B-allele), an integer number;
* 'CCF' an optional cancer cell fraction (CCF) column distinguishing clonal and subclonal segments, a real number in [0,1];
* 'Major_2' optional, for the number of copies of the major allele (or A-allele) in the second clone if present, an integer number;
* 'minor_2' optional, for the number of copies of the minor allele (or B-allele) in the second clone if present, an integer number.

If the 'CCF' value is present and equal to 1, a segment is considered clonal, otherwise subclonal. If a segment is subclonal:

* the columns 'Major' and 'minor' are interpreted as those for a subclone with proportion equal to the 'CCF' value;
* the columns 'Major_2' and 'minor_2' are interpreted as those for a second subclone with proportion equal to the '1 - CCF' value.
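The clonal/subclonal rule above can be illustrated with a small base-R sketch. The segments are invented, and the helper columns 'is_clonal', 'prop_clone_1' and 'prop_clone_2' are hypothetical names used only for this example; they are not columns that CNAqc is documented to expect or produce.

```r
# Toy copy number segments (illustrative values only, not real data)
cna <- data.frame(
  chr     = c("chr3", "chr8"),
  from    = c(1L, 1L),
  to      = c(5000L, 9000L),
  Major   = c(2L, 2L),
  minor   = c(1L, 1L),
  CCF     = c(1, 0.7),        # first segment clonal, second subclonal
  Major_2 = c(NA, 1L),        # second subclone, only for the subclonal segment
  minor_2 = c(NA, 1L),
  stringsAsFactors = FALSE
)

# A segment is considered clonal when CCF equals 1, subclonal otherwise
cna$is_clonal <- cna$CCF == 1

# For a subclonal segment: Major/minor describe a subclone with proportion CCF,
# Major_2/minor_2 a second subclone with proportion 1 - CCF
cna$prop_clone_1 <- cna$CCF
cna$prop_clone_2 <- ifelse(cna$is_clonal, NA, 1 - cna$CCF)

print(cna)
```

In the second segment, 70% of the tumour cells carry the 2:1 state and the remaining 30% carry the 1:1 state.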

purity

A value between '0' and '1' representing the proportion of tumour cells in the sample (sometimes called "cellularity").

sample

Sample name (a string).

ref

A keyword for the reference coordinate system in use. CNAqc supports 'hg19'/'GRCh37' and 'hg38'/'GRCh38' references, which are embedded into the package as 'CNAqc::chr_coordinates_hg19' and 'CNAqc::chr_coordinates_GRCh38'. An arbitrary reference can also be provided if 'ref' is a dataframe in the same format as 'CNAqc::chr_coordinates_hg19' or 'CNAqc::chr_coordinates_GRCh38'. The default reference is 'GRCh38'.

Value

A CNAqc object of class 'cnaqc', with S3 methods for printing, plotting and analyzing data.

Examples

# Example input data released with the package
data('example_dataset_CNAqc', package = 'CNAqc')
print(example_dataset_CNAqc)

# Note the outputs to screen
x = init(
  mutations = example_dataset_CNAqc$mutations,
  cna = example_dataset_CNAqc$cna,
  purity = example_dataset_CNAqc$purity
)

# An S3 method can be used to report to screen what is in the object
print(x)

caravagnalab/CNAqc documentation built on Oct. 31, 2024, 3:54 a.m.