hg38.30kbp.SR50: hg38 30kbp bin annotations

hg38.30kbp.SR50    R Documentation

hg38 30kbp bin annotations

Description

Bin annotations are calculated for non-overlapping 30kbp bins generated as described in Scheinin et al. (see References). The annotated data frame contains:

  • chromosome: Chromosome name,

  • start: Base pair start position,

  • end: Base pair end position,

  • bases: Percentage of non-N nucleotides (of full bin size),

  • gc: Percentage of C and G nucleotides (of non-N nucleotides),

  • mappability: Average mappability of 50mers with a maximum of 2 mismatches, as described by Derrien et al. (see References),

  • blacklist: Percent overlap with ENCODE blacklisted regions (see references),

  • residual: Median loess residual calculated from 1000 Genomes (see references),

  • use: Whether the bin should be used in subsequent analysis steps.
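
The bases and gc fields use different denominators: bases is relative to the full bin size, while gc is relative to the non-N nucleotides only. The following is a minimal sketch of that arithmetic in base R on a made-up 20-base sequence. The sequence and the variable names are assumptions for illustration; this is not the code used to build the packaged annotations.

```r
# Hypothetical toy "bin" of 20 bases (not taken from hg38)
bin_seq <- "ACGTNNNNGGCCATATNNNN"

nucleotides <- strsplit(toupper(bin_seq), "")[[1]]
bin_size <- length(nucleotides)                    # full bin size
non_n    <- sum(nucleotides != "N")                # non-N nucleotides
gc_count <- sum(nucleotides %in% c("C", "G"))      # C and G nucleotides

# bases: non-N nucleotides as a percentage of the full bin size
bases <- 100 * non_n / bin_size
# gc: C and G nucleotides as a percentage of the non-N nucleotides
gc <- 100 * gc_count / non_n

print(c(bases = bases, gc = gc))
```

Here 12 of the 20 positions are non-N, so bases is 60, and 6 of those 12 are C or G, so gc is 50.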

Value

Returns an AnnotatedDataFrame object.
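
As a rough illustration of how the columns listed in the Description might be used, the sketch below filters on the use flag in a small stand-in data frame. The data frame toy_bins and all of its values are made up for this example; with the real object, Biobase::pData() should return a comparable data frame that can be filtered the same way.

```r
# Hypothetical stand-in for the real annotations (all values are made up)
toy_bins <- data.frame(
  chromosome = c("1", "1", "1", "2"),
  start      = c(1, 30001, 60001, 1),
  end        = c(30000, 60000, 90000, 30000),
  bases      = c(0, 98.5, 100, 100),
  gc         = c(NA, 41.2, 39.8, 45.0),
  use        = c(FALSE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Keep only the bins flagged for use in subsequent analysis steps
usable <- toy_bins[toy_bins$use, ]

# Number of usable bins, overall and per chromosome
print(nrow(usable))
print(table(usable$chromosome))
```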

Author(s)

Daoud Sie

References

DNA copy number analysis of fresh and formalin-fixed specimens by shallow whole-genome sequencing with identification and exclusion of problematic regions in the genome assembly. Scheinin I, Sie D, Bengtsson H, van de Wiel M, Olshen A, van Thuijl H, van Essen H, Eijk P, Rustenburg F, Meijer G, Reijneveld J, Wesseling P, Pinkel D, Albertson D, Ylstra B 2014 Genome Research vol: 24 (12) pp: 1–11

Fast Computation and Applications of Genome Mappability. Derrien T, Estelle J, Sola S, Knowles D, Raineri E, Guigo R, Ribeca P January 19, 2012 PLOS ONE doi: 10.1371/journal.pone.0030377

An integrated map of genetic variation from 1,092 human genomes. 1000 Genomes Project Consortium, Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, Handsaker RE, Kang HM, Marth GT, McVean GA 2012 Nature Nov 1; 491(7422):56–65.

An integrated encyclopedia of DNA elements in the human genome. ENCODE Project Consortium 2012 Nature Sep 6; 489(7414):57–74.

Examples

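library(QDNAseq)       # provides binReadCounts() and getBinAnnotations()
library(QDNAseq.hg38)  # provides the hg38.30kbp.SR50 data set
data("hg38.30kbp.SR50")
bins <- get("hg38.30kbp.SR50")  # bind the loaded annotations to a shorter name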
## Not run: readCounts <- binReadCounts(bins=bins, path="./bam")

# or

bins <- getBinAnnotations(binSize=30, genome="hg38")
## Not run: readCounts <- binReadCounts(bins=bins, path="./bam")

asntech/QDNAseq.hg38 documentation built on Aug. 11, 2022, 9:07 p.m.