cnv.break.annot: Identification of recurrently altered genes using CNV data...


Description

Identify recurrently altered genes using CNV data. The function identifies overlaps between genomic features (e.g. genes) and CNV breakpoints. Unlike the 'gene.cnv' function, which returns the overall CNV of each gene, this function can identify sub-genic events and may help detect other rearrangements.

Usage

cnv.break.annot(
  cnv,
  fc.pct = 0.2,
  genome.v = "hg19",
  genesgr = NULL,
  upstr = 150000,
  dnstr = 150000,
  break.width = 10000,
  min.cnv.size = NULL,
  min.num.probes = NULL,
  low.cov = NULL,
  clean.brk = NULL,
  verbose = TRUE
)

Arguments

cnv

(S4) an object of class svcnvio containing data type 'cnv' validated by validate.cnv

fc.pct

(numeric) minimum copy number change between 2 consecutive segments, expressed as a fraction; e.g. the default cutoff of 0.2 represents a fold change of 0.8 or 1.2.

genome.v

(character) reference genome version used to retrieve gene annotations, including genomic coordinates and strand; either 'hg19' or 'hg38' is accepted

genesgr

(S4) a GenomicRanges object containing gene annotations (if not NULL, overrides genome.v). It is crucial that the genome version of 'genesgr' and of the input 'cnv' are the same. The GRanges object must contain 'strand' and a metadata field 'gene_id' with unique values. Seqnames are expected in the format (chr1, chr2, ...).

upstr

(numeric) size in base pairs to define gene upstream region onto which breakpoint overlaps will be identified. The strand value, start and stop positions defined in genesgr will be used to create a GRanges object of upstream regions.

dnstr

(numeric) size in base pairs to define gene downstream region onto which breakpoint overlaps will be identified. The strand value, start and stop positions defined in genesgr will be used to create a GRanges object of downstream regions.

break.width

(numeric) maximum breakpoint size to be considered

min.cnv.size

(numeric) The minimum segment size (in base pairs) to include in the analysis

min.num.probes

(numeric) The minimum number of probes per segment to include in the analysis

low.cov

(data.frame) a data.frame (chr, start, end) indicating low coverage regions to exclude from the analysis

clean.brk

(numeric) Remove identical segments when they are present in more than the given number of samples. Identical CNV segments across multiple samples may represent artifacts of common germline variants; this is particularly relevant when the segmentation data was generated with a non-paired reference. For paired datasets (e.g. tumor vs. normal) it is better to leave this as NULL.

verbose

(logical) whether to print internal messages
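As an illustration of the fc.pct cutoff described above, the following is a minimal sketch in base R. It assumes segment values are expressed as linear copy number ratios; the package may work with log-ratios internally, so this is not its actual implementation. The helper is_cn_change and the ratios are hypothetical and not part of svpluscnv.

```r
# Hypothetical helper, not part of svpluscnv: flags a copy number change
# between two consecutive segments given as linear copy number ratios.
is_cn_change <- function(seg_a, seg_b, fc.pct = 0.2) {
  fold_change <- seg_b / seg_a
  # With fc.pct = 0.2, a change is a fold change <= 0.8 or >= 1.2
  fold_change <= 1 - fc.pct | fold_change >= 1 + fc.pct
}

# Made-up ratios for four consecutive segments of one sample
ratios <- c(1.00, 1.05, 1.50, 1.45)

# Compare each segment with the next one
changes <- is_cn_change(head(ratios, -1), tail(ratios, -1))
print(changes)
```

Only the second pair of segments (1.05 to 1.50) exceeds the cutoff here, so it is the only transition that would count as a breakpoint under this assumption.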

Value

an instance of the class 'break.annot' containing breakpoint mapping onto genes

Examples

library(svpluscnv)

# Initialize CNV data
cnv <- validate.cnv(segdat_lung_ccle)

# Map CNV breakpoints onto genes using the default settings
cnv.break.annot(cnv)
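The call can also be made with non-default arguments from the Usage section. The sketch below is one possible configuration, not a recommended one: the low coverage coordinates and the chosen values for fc.pct, upstr and dnstr are made up for illustration, and the call is guarded so that it only runs when svpluscnv is installed.

```r
# Hypothetical low coverage regions (chr, start, end); coordinates are made up
low_cov <- data.frame(
  chr   = c("chr1", "chr9"),
  start = c(1, 40000000),
  end   = c(500000, 47000000),
  stringsAsFactors = FALSE
)

# Run the annotation only when svpluscnv is available
if (requireNamespace("svpluscnv", quietly = TRUE)) {
  library(svpluscnv)
  cnv <- validate.cnv(segdat_lung_ccle)
  results <- cnv.break.annot(
    cnv,
    fc.pct = 0.3,       # illustrative: stricter cutoff than the default 0.2
    genome.v = "hg19",
    upstr = 50000,      # illustrative: narrower upstream region
    dnstr = 50000,      # illustrative: narrower downstream region
    low.cov = low_cov,  # exclude the hypothetical regions above
    verbose = FALSE
  )
}
```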

ccbiolab/svpluscnv documentation built on Sept. 9, 2020, 4:52 a.m.