readBam: Read bam file into a curated GRanges object

View source: R/readBam.R

readBamR Documentation

Read bam file into a curated GRanges object

Description

Read bam file into a curated GRanges object

Usage

readBam(
  bamfile,
  curate_start_and_end = TRUE,
  use_names = TRUE,
  chromosome_to_keep = paste0("chr", 1:22),
  strand_mode = 1,
  genome_label = "hg19",
  outdir = NA,
  galp_flag = Rsamtools::scanBamFlag(isPaired = TRUE, isDuplicate = FALSE,
    isSecondaryAlignment = FALSE, isUnmappedQuery = FALSE, isSupplementaryAlignment =
    FALSE),
  galp_what = c("cigar", "mapq", "isize"),
  galp_tag = c("NM", "MD"),
  galp_mapqFilter = 40,
  galp_bqFilter = 20,
  mutation_file = NULL,
  mut_fragments_only = FALSE,
  ...
)

Arguments

bamfile

The bam file name.

curate_start_and_end

A logical (TRUE or FALSE) parameter for curating alignment start and end coordinates. The default value is TRUE.

use_names

Logical, assigns read names to GRanges, default TRUE.

chromosome_to_keep

Should be a character vector containing the seqnames to be kept in the GRanges object or boolean FALSE. FALSE means not filtering. Default is paste0("chr", 1:22).

strand_mode

The strand_mode = 1 means the strand of the pair is strand of its first alignment. For More Details please see GenomicAlignments docs.

genome_label

The Genome you used in the alignment. Should be "hg19" or "hg38" or "hg38-NCBI". Default is "hg19". Note: "hg19" will load BSgenome.Hsapiens.UCSC.hg19 package, which is Full genome sequences for Homo sapiens (Human) as provided by UCSC (hg19, based on GRCh37.p13) and stored in Biostrings objects; "hg38" will load BSgenome.Hsapiens.UCSC.hg38 package, which is Full genome sequences for Homo sapiens (Human) as provided by UCSC (hg38, based on GRCh38.p13) and stored in Biostrings objects. "hg38-NCBI" will load BSgenome.Hsapiens.NCBI.GRCh38 package, which is full genome sequences for Homo sapiens (Human) as provided by NCBI (GRCh38, 2013-12-17) and stored in Biostrings objects.

outdir

The path for saving rds file. Default is NA, i.e. not saving.

galp_flag

ScanBamFlag object for BAM file scanning flags.

galp_what

A character vector of fields to return from the BAM file. The specific fields returned depend on the analysis context: - For analyses with mutational files or when using 'call_mutations', 'galp_what' should include "cigar", "mapq", "isize", "seq", and "qual". - For fragmentomic analysis without mismatch mutational data, 'galp_what' should include just "cigar", "mapq", and "isize". This parameter's values and their descriptions are available on the Rsamtools::scanBamWhat() function and detailed on the Rsamtools::scanBam help page.

galp_tag

Vector specifying optional fields (tags) to retrieve.

galp_mapqFilter

Minimum mapping quality to include a read.

galp_bqFilter

Base quality threshold for filtering read pairs. When a mutation_file is provided, this parameter specifies the minimum base quality required at the mutation locus. Read pairs not meeting this threshold at the mutation site are discarded, irrespective of the base.

mutation_file

An optional file containing a list of mutations. Fragments that overlap the mutation loci are mutationally annotated.

mut_fragments_only

A logical (TRUE or FALSE) parameter that determines the type of genomic alignments to retrieve from the BAM file. If set to TRUE, the function will only retrieve alignments that overlap with specified mutation locations. If set to FALSE, it will retrieve all alignments that pass the selected filters. The default value is FALSE.

...

Further arguments passed to or from other methods.

Value

This function returns a curated GRanges object.

Author(s)

Haichao Wang

Paulius D. Mennea

Examples

## Not run: 

object <- readGALP(bamfile = "/path/to/bamfile.bam",
                   outdir = "./",
                   chromosome_to_keep = c("chr1", "chr2", "chr3"))

## End(Not run)


hw538/cfDNAPro documentation built on Feb. 17, 2025, 6:09 p.m.