get_mapp: Compute mappability

View source: R/get_mapp.R

get_mapp    R Documentation

Compute mappability

Description

Compute the mappability of each bin. Because scDNA sequencing involves whole-genome amplification, the mappability score is essential for determining the variable binning method. The ENCODE mappability track for 100-mers on the GRCh37/hg19 human reference genome is pre-saved. The mappability of a bin is the mean of the scores of the mappability tracks that overlap the bin, weighted by the width of each track on the reference genome. For hg38, mappability is calculated with the liftOver utility and is pre-saved as well. For mm10, there are two workarounds: 1) set all mappability values to 1 to avoid extensive computation; 2) adopt QC procedures based on annotation results, e.g., filter out bins within blacklist regions, which generally have low mappability.
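The width-weighted mean described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy bins, the toy tracks, and the helper weighted_mapp are all hypothetical, each overlapping track is weighted by its full width, and returning NA for a bin with no overlapping track is a choice made for this sketch.

```r
# Hypothetical toy data: two bins and four mappability tracks on one chromosome
# (1-based, inclusive coordinates).
bins <- data.frame(start = c(1, 101), end = c(100, 200))
tracks <- data.frame(start = c(1, 41, 81, 150),
                     end   = c(40, 80, 120, 200),
                     score = c(1.0, 0.5, 0.25, 1.0))
tracks$width <- tracks$end - tracks$start + 1

# Hypothetical helper: mean score of the tracks overlapping one bin,
# weighted by the width of each track.
weighted_mapp <- function(bin_start, bin_end, tracks) {
  hit <- tracks$start <= bin_end & tracks$end >= bin_start
  if (!any(hit)) return(NA_real_)  # assumption: no overlapping track gives NA
  sum(tracks$score[hit] * tracks$width[hit]) / sum(tracks$width[hit])
}

# One mappability value per bin.
bin_mapp <- mapply(weighted_mapp, bins$start, bins$end,
                   MoreArgs = list(tracks = tracks))
```

For real data, the bins would come from the GRanges object passed as ref and the tracks from the pre-saved mappability data; plain data frames are used here only to keep the sketch self-contained.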

Usage

get_mapp(ref, hgref = "hg19")

Arguments

ref

GRanges object returned from get_bam_bed

hgref

Reference genome. This should be 'hg19', 'hg38' or 'mm10'. The default is the human genome hg19.

Value

mapp

Vector of mappability for each bin/target
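Because the returned vector has one entry per bin, it can be used directly to flag low-mappability bins during QC. The sketch below is a hedged illustration in base R: the mappability values and the cutoff mapp_thresh are hypothetical, chosen for demonstration rather than taken from this page.

```r
# Hypothetical mappability vector for five bins (stand-in for get_mapp() output).
mapp <- c(1.00, 0.95, 0.42, 0.88, 1.00)

# Hypothetical cutoff; choose a value appropriate for your own analysis.
mapp_thresh <- 0.9

# Logical mask and indices of the bins that pass the cutoff.
keep <- mapp >= mapp_thresh
kept_bins <- which(keep)
```

The same logical mask could then be used to subset the bin-level reference and any per-bin matrices so that they stay aligned.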

Author(s)

Rujin Wang rujin@email.unc.edu

Examples

## Not run: 
library(SCOPE)    # provides get_bam_bed() and get_mapp()
library(WGSmapp)
library(BSgenome.Hsapiens.UCSC.hg38)
# Locate the example BAM file(s) shipped with WGSmapp
bamfolder <- system.file('extdata', package = 'WGSmapp')
bamFile <- list.files(bamfolder, pattern = '\\.dedup\\.bam$')
bamdir <- file.path(bamfolder, bamFile)
# Sample names are the file names up to the first '.'
sampname_raw <- sapply(strsplit(bamFile, '.', fixed = TRUE), '[', 1)
bambedObj <- get_bam_bed(bamdir = bamdir,
                         sampname = sampname_raw,
                         hgref = "hg38")
bamdir <- bambedObj$bamdir
sampname_raw <- bambedObj$sampname
ref_raw <- bambedObj$ref

# Compute the mappability of each bin
mapp <- get_mapp(ref_raw, hgref = "hg38")

## End(Not run)


rujinwang/SCOPE documentation built on Jan. 1, 2023, 5:40 a.m.