getBamInfo: Obtain library information from BAM files

Description

Obtain paired-end status, median aligned read length, median aligned insert size and library size from BAM files.

Usage

getBamInfo(sample_info, yieldSize = NULL, cores = 1)

Arguments

sample_info

Data frame with sample information including mandatory columns “sample_name” and “file_bam”. Column “sample_name” must be a character vector. Column “file_bam” can be a character vector or BamFileList.

yieldSize

Number of records used for obtaining library information, or NULL for all records

cores

Number of cores available for parallel processing
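
As a minimal sketch of the sample_info argument described above, the following builds a data frame with the two mandatory columns and checks their types before it is passed to getBamInfo. The sample names and BAM paths are hypothetical placeholders, not files shipped with the package.

```r
# Hypothetical sample names and BAM file paths, for illustration only
sample_info <- data.frame(
  sample_name = c("sample_A", "sample_B"),
  file_bam = c("/path/to/sample_A.bam", "/path/to/sample_B.bam"),
  stringsAsFactors = FALSE
)

# Verify the mandatory columns exist and are character vectors
stopifnot(
  all(c("sample_name", "file_bam") %in% names(sample_info)),
  is.character(sample_info$sample_name),
  is.character(sample_info$file_bam)
)
```

Setting stringsAsFactors = FALSE keeps both columns as character vectors on R versions before 4.0, where data.frame would otherwise convert them to factors.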

Details

BAM files must have been generated with a splice-aware alignment program that outputs the custom tag ‘XS’ for spliced reads, indicating the direction of transcription. BAM files must be indexed.

Library information can be inferred from a subset of BAM records by setting the number of records via argument yieldSize. Note that library size is only obtained if yieldSize is NULL.
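
The trade-off described above can be sketched as follows, using the example sample table si and the BAM files bundled with SGSeq. The value 1000 for yieldSize is an arbitrary choice for illustration, not a recommended setting.

```r
library(SGSeq)

# `si` is the example sample table provided by SGSeq;
# point its file_bam column at the BAM files bundled with the package
path <- system.file("extdata", package = "SGSeq")
si$file_bam <- file.path(path, "bams", si$file_bam)
si <- si[, c("sample_name", "file_bam")]

# Quick pass: infer library information from a subset of records per file
si_quick <- getBamInfo(si, yieldSize = 1000)

# Full pass: use all records (yieldSize = NULL is the default)
si_full <- getBamInfo(si)

# Compare which columns each call added
setdiff(names(si_quick), names(si))
setdiff(names(si_full), names(si))
```

Based on the note above, the full pass should be the only one that adds a library size column.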

Value

sample_info with additional columns “paired_end”, “read_length” and “frag_length”. Column “lib_size” is also added if yieldSize is NULL.

Author(s)

Leonard Goldstein

Examples

library(SGSeq)

## si is the example sample table provided by SGSeq;
## set file_bam to the full paths of the bundled BAM files
path <- system.file("extdata", package = "SGSeq")
si$file_bam <- file.path(path, "bams", si$file_bam)

## data.frame as sample_info and character vector as file_bam
si <- si[, c("sample_name", "file_bam")]
si_complete <- getBamInfo(si)

## DataFrame as sample_info and BamFileList as file_bam
DF <- DataFrame(si)
DF$file_bam <- BamFileList(DF$file_bam)
DF_complete <- getBamInfo(DF)

ldg21/SGSeq documentation built on Oct. 14, 2020, 9:51 p.m.