View source: R/UMIcountR-funs.R

extract_complex_spike_dat    R Documentation

Load and extract spUMIs from sequencing reads with complex molecular spikes set (5')

Description

extract_complex_spike_dat extracts and parses molecular spike reads from BAM files.

Usage

extract_complex_spike_dat(
  bam_path,
  bc_df,
  spike_groundtruth = NULL,
  max_pattern_dist = 1,
  cores = 12,
  min_mapq_value = 255,
  fixed_start_pos = NULL
)

Arguments

bam_path

path to the input BAM file; must be an indexed zUMIs output file

bc_df

data.frame containing expected spike-in names & barcodes

spike_groundtruth

data.table containing all known spike-in molecules (not implemented yet). Default: NULL

max_pattern_dist

number of sequencing errors allowed when matching the sequence pattern used to extract barcodes & spUMIs. Default: 1

cores

number of CPU cores used. Default: 12

min_mapq_value

minimum MAPQ mapping quality; the default keeps only uniquely aligned reads. Default: 255

fixed_start_pos

require a fixed starting position of the BC/spUMI sequence in the read (given as an integer). Default: NULL
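
To make max_pattern_dist and fixed_start_pos more concrete, here is a minimal base-R sketch of mismatch-tolerant pattern search. It is not the package's implementation: the helper names mismatches_at and find_pattern are hypothetical, and the sketch assumes that errors are substitutions only (no insertions or deletions), which may differ from how extract_complex_spike_dat matches patterns internally.

```r
# Hypothetical helper: count substitutions between `pattern` and the
# window of `read` that starts at position `pos` (1-based).
mismatches_at <- function(read, pattern, pos) {
  window <- substr(read, pos, pos + nchar(pattern) - 1)
  # Window runs off the end of the read: no valid comparison
  if (nchar(window) < nchar(pattern)) return(NA_integer_)
  sum(strsplit(window, "")[[1]] != strsplit(pattern, "")[[1]])
}

# Hypothetical helper: return the first start position at which `pattern`
# matches with at most `max_dist` substitutions, or NA if there is none.
# If `fixed_start` is given, only that position is tested.
find_pattern <- function(read, pattern, max_dist = 1, fixed_start = NULL) {
  n_starts <- max(0, nchar(read) - nchar(pattern) + 1)
  starts <- if (is.null(fixed_start)) seq_len(n_starts) else fixed_start
  for (pos in starts) {
    d <- mismatches_at(read, pattern, pos)
    if (!is.na(d) && d <= max_dist) return(pos)
  }
  NA_integer_
}

read    <- "GGACGTTAGC"   # made-up read for illustration
pattern <- "ACGTAA"       # made-up pattern for illustration

find_pattern(read, pattern, max_dist = 1)                   # 3
find_pattern(read, pattern, max_dist = 0)                   # NA
find_pattern(read, pattern, max_dist = 1, fixed_start = 1)  # NA
```

In this toy case the pattern is found at position 3 with one mismatch; requiring an exact match, or requiring the pattern to start at position 1, rejects the read.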

Details

Barcodes are error-corrected, allowing a Hamming distance of 1.
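
As an illustration of correcting barcodes within a Hamming distance of 1, the following base-R sketch maps an observed barcode to an expected one. The function names hamming and correct_barcode and the example barcodes are hypothetical, and the rule of discarding ambiguous matches is an assumption; the package's own correction logic may differ.

```r
# Hamming distance: number of differing positions between two
# equal-length strings.
hamming <- function(a, b) {
  stopifnot(nchar(a) == nchar(b))
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

# Hypothetical helper: return the single expected barcode within
# `max_dist` of `observed`, or NA if there is no such barcode or the
# match is ambiguous (assumed behaviour, for illustration only).
correct_barcode <- function(observed, expected, max_dist = 1) {
  same_len <- expected[nchar(expected) == nchar(observed)]
  d <- vapply(same_len, hamming, integer(1), b = observed)
  hits <- same_len[d <= max_dist]
  if (length(hits) == 1) hits else NA_character_
}

expected_bcs <- c("ACGTAC", "TTGCAA", "GGATCC")  # made-up barcodes

correct_barcode("ACGTAA", expected_bcs)  # "ACGTAC" (one mismatch)
correct_barcode("CCCCCC", expected_bcs)  # NA (no barcode within distance 1)
```

Returning NA for ambiguous matches avoids assigning a read to the wrong spike-in when two expected barcodes are equally close.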

Value

returns a data.table with reads, their UMI and the raw & error-corrected spUMI for each spike-in sequence and barcode.

See Also

BamInput, ScanBamParam, data.table-package

Examples

## Not run: 
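# `bam` (path to an indexed zUMIs BAM file) and `spike_info` (data.frame of
# spike-in names and barcodes) must be defined beforehand.
example_dat <- extract_complex_spike_dat(
  bam_path = bam,
  bc_df = spike_info,
  max_pattern_dist = 2,
  cores = 11
)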

## End(Not run)

cziegenhain/UMIcountR documentation built on May 30, 2022, 5:38 p.m.