calcChromArmPloidies: Calculate overall chrom arm copy numbers


View source: R/calcChromArmPloidies.R

Description

This function first rounds copy numbers (CN) to integers so that CN segments can be grouped into CN categories. For each chrom arm, the coverage of each CN category is calculated (i.e. the cumulative segment size). The chrom arm CN is (roughly) defined as the CN category with the highest cumulative segment size.
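The steps described above can be sketched in a few lines of base R. This is only an illustration, not the package's implementation: the `segments` data frame and its column names are made up for the example, and coverage is expressed relative to the total size of the listed segments, which may differ from how the function measures arm length.

```r
# Hypothetical segments for a single chrom arm (not real PURPLE output)
segments <- data.frame(
  start = c(1, 1001, 3001, 6001),
  end = c(1000, 3000, 6000, 10000),
  copy_number = c(1.9, 3.1, 2.2, 2.8)
)

# Round CN to integers so that segments fall into discrete CN categories
segments$cn_rounded <- round(segments$copy_number)
segments$size <- segments$end - segments$start + 1

# Cumulative segment size per CN category, as a fraction of the covered arm
cum_size <- tapply(segments$size, segments$cn_rounded, sum)
rel_cum_size <- cum_size / sum(cum_size)

# The arm CN is (roughly) the CN category with the largest coverage
arm_cn <- as.integer(names(rel_cum_size)[which.max(rel_cum_size)])
```

In this toy input, CN 2 covers 0.4 of the arm and CN 3 covers 0.6, so `arm_cn` is 3.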

Usage

calcChromArmPloidies(
  purple.cnv.file,
  out.file = NULL,
  min.rel.cum.segment.size = 0.5,
  max.rel.cum.segment.size.diff = 0.1,
  chrom.arm.split.method = "hmf",
  centromere.positions.path = CENTROMERE_POSITIONS,
  one.armed.chroms = ONE_ARMED_CHROMS,
  chrom.arm.names = "auto",
  verbose = T
)

Arguments

purple.cnv.file

Path to purple cnv file

out.file

Path to output file. If NULL, returns a named vector

min.rel.cum.segment.size

If a chrom arm has a CN category that covers more than this fraction of the arm (default 0.5, i.e. 50%), this CN is taken as the copy number of the arm

max.rel.cum.segment.size.diff

This value (default 0.1) determines which CN categories are considered to cover equal lengths of the chrom arm. For example, by default, 2 CN categories covering 0.40 and 0.31 of a chrom arm are considered to contribute equally. When other CN categories have a cumulative segment size similar to that of the highest one, and one of them has the same CN as the genome CN, the genome CN is returned. Otherwise, the CN category with the highest segment support is returned (as is done above).
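The interplay of the two thresholds can be sketched as a small decision function. `pick_arm_cn()` is a hypothetical helper written for this illustration, not part of the package. It assumes the genome CN is supplied by the caller and that "similar" means a coverage difference strictly below `max.rel.cum.segment.size.diff`; the actual function may differ on both points.

```r
# Hypothetical helper illustrating the documented rules (not the package code)
pick_arm_cn <- function(rel_cum_size, genome_cn,
                        min_rel_cum_segment_size = 0.5,
                        max_rel_cum_segment_size_diff = 0.1) {
  # rel_cum_size: named numeric vector; names are the integer CN categories
  cn <- as.integer(names(rel_cum_size))
  top <- which.max(rel_cum_size)

  # Rule 1: a single CN category covers more than the minimum fraction
  if (rel_cum_size[top] > min_rel_cum_segment_size) {
    return(cn[top])
  }

  # Rule 2: find categories whose coverage is close to the highest one
  # (assumption: strict "<" comparison against the threshold)
  similar <- (rel_cum_size[top] - rel_cum_size) < max_rel_cum_segment_size_diff
  if (genome_cn %in% cn[similar]) {
    return(genome_cn)
  }

  # Otherwise fall back to the category with the highest coverage
  cn[top]
}

# 0.40 vs 0.31 differ by less than 0.1, so the genome CN (2) is preferred
tie_result <- pick_arm_cn(c("2" = 0.31, "3" = 0.40, "4" = 0.29), genome_cn = 2)
```

With a genome CN that is not among the similar categories, the same input would return 3, the category with the highest coverage.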

chrom.arm.split.method

Method used to determine the chromosome arm coordinates. If 'hmf', uses the 'method' column from the purple cnv file to determine centromere positions (i.e. the p/q arm split point). If 'gap', uses a (processed) gap.txt.gz table from the UCSC genome browser to determine centromere positions. These 2 methods should in theory give identical results, unless the HMF pipeline code changes.
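As a rough illustration of the 'gap' approach, the sketch below derives a p/q split point from rows of type "centromere" and assigns segments to an arm. The toy table only mimics a few columns of the UCSC gap table with made-up values, and both the use of the gap midpoint as the split point and the processing the package applies to gap.txt.gz are assumptions.

```r
# Toy rows mimicking some UCSC gap table columns (values are made up)
gap <- data.frame(
  chrom = c("chr1", "chr1", "chr2"),
  chromStart = c(0, 5000, 4000),
  chromEnd = c(100, 5500, 4600),
  type = c("telomere", "centromere", "centromere"),
  stringsAsFactors = FALSE
)

# Keep only centromere gaps
centro <- gap[gap$type == "centromere", ]

# Assumption: the midpoint of the centromere gap is the p/q split point
split_point <- setNames((centro$chromStart + centro$chromEnd) / 2, centro$chrom)

# Hypothetical CN segments; a segment ending before the split point is on p
segments <- data.frame(
  chrom = c("chr1", "chr1", "chr2"),
  start = c(200, 6000, 100),
  end = c(4000, 9000, 3000),
  stringsAsFactors = FALSE
)
segments$arm <- unname(
  ifelse(segments$end <= split_point[segments$chrom], "p", "q")
)
```

Segments that span the split point would need to be cut in two before this assignment; that step is left out here for brevity.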

chrom.arm.names

A character vector in the form c('1p','1q','2p','2q', ...). The default 'auto' means that the human chromosome arm names are used. Note that chroms 13, 14, 15, 21, 22 are considered to only have the long (i.e. q) arm.
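A vector of this form can be built as in the sketch below. The chromosome set (1 to 22 plus X and Y) is an assumption made for the illustration; the exact set used by 'auto' is not stated here.

```r
# Assumption: autosomes 1-22 plus X and Y
chroms <- c(as.character(1:22), "X", "Y")

# Chroms treated as having only the long (q) arm
one_armed <- c("13", "14", "15", "21", "22")

# Build '1p', '1q', '2p', '2q', ... and skip the p arm of one-armed chroms
arm_names <- unlist(lapply(chroms, function(chrom) {
  if (chrom %in% one_armed) {
    paste0(chrom, "q")
  } else {
    paste0(chrom, c("p", "q"))
  }
}))
```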

verbose

Show progress messages?

Value

A named vector of chrom arm copy numbers. If out.file is specified, a table is written to out.file instead.

Examples

When multiple CNs have segment support similar to that of the one with the highest, and one
of them has the same CN as the genome CN, the genome CN is returned. Otherwise, the CN with
the highest segment support is returned (as is done above).
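A minimal call might look like the sketch below. The file path is a placeholder, and the call is guarded so that it only runs when the package is installed and the file exists; it also assumes that `calcChromArmPloidies()` is exported from the hmfGeneAnnotation package.

```r
# Placeholder path to a purple cnv file (hypothetical, replace with a real file)
purple_cnv_path <- "path/to/sample.purple.cnv"

# Only run when the package and the input file are actually available
can_run <- requireNamespace("hmfGeneAnnotation", quietly = TRUE) &&
  file.exists(purple_cnv_path)

if (can_run) {
  # With out.file = NULL (the default), a named vector is returned
  arm_cn <- hmfGeneAnnotation::calcChromArmPloidies(
    purple.cnv.file = purple_cnv_path,
    min.rel.cum.segment.size = 0.5,
    max.rel.cum.segment.size.diff = 0.1,
    verbose = TRUE
  )
  print(head(arm_cn))
}
```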

luannnguyen/hmfGeneAnnotation documentation built on May 6, 2020, 1:07 p.m.