read.modbam2bed: Construct BSseq objects from nanopore BED files

View source: R/read.modbam2bed.R

read.modbam2bedR Documentation

Construct BSseq objects from nanopore BED files

Description

Construct BSseq objects from nanopore BED files

Usage

read.modbam2bed(
    files,
    colData = NULL,
    rmZeroCov = FALSE,
    strandCollapse = TRUE
)

Arguments

files

vector, BED files

colData

data frame, phenotypic data with samples as rows and variables as columns

rmZeroCov

A logical (1) indicating whether methylation loci that have zero coverage in all samples be removed

strandCollapse

A logical (1) indicating whether stand-symmetric methylation loci (i.e. CpGs) should be collapsed across strands

Details

This function reads in nanopore sequencing modified BED files to Bsseq objects. Nanopore sequencing data (i.e. aggregated modified base counts) is stored in modified-base BAM files. These modified-base BAM files are converted to bedMethyl (BED) files using modbam2bed.

Details for using modbam2bed

After installing modbam2bed, a conda environment is activated. Index files for BAM files are created using samtools index. The code requires aligned reads with the Mm and Ml tags (MM and ML also supported), and the reference sequence used for alignment (<reference.fasta>).

  • -e, -- extended to output canonical, modified, and filtered bases;

  • -m, -- mod_base=BASE to output modified base of interest, one of: 5mC, 5hmC, 5fC, 5caC, 5hmU, 5fU, 5caU, 6mA, 5oxoG, Xao. (Or modA, modC, modG, modT, modU, modN for generic modified base);

  • -r, --region=chr:start-end to output chromosome or genomic region of interest;

  • -f, --threshold=THRESHOLD to output filtered bases for probability lower than threshold (default = 0.66)

modbam2bed to Bsseq object

After creating BED files using modbam2bed, the BED files are read in and the Bsseq object is constructed via read.modbam2bed() function. The function reads in BED files, extract genomic regions, methylation, coverage, ambiguous modification status data and sample information and then construct Bsseq object using BSseq function within the package.

Value

BSseq object

Examples

files <- c(system.file("extdata/modbam2bed/ctr1.chr10.chr11.bed.gz", package = "bsseq"),
    system.file("extdata/modbam2bed/ctr2.chr10.chr11.bed.gz", package = "bsseq"),
    system.file("extdata/modbam2bed/ctr3.chr10.chr11.bed.gz", package = "bsseq"),
    system.file("extdata/modbam2bed/tret1.chr10.chr11.bed.gz", package = "bsseq"),
    system.file("extdata/modbam2bed/tret2.chr10.chr11.bed.gz", package = "bsseq"),
    system.file("extdata/modbam2bed/tret3.chr10.chr11.bed.gz", package = "bsseq"))
pd <- data.frame(condition = rep(c("control", "treatment"), each = 3),
    replicate = rep(c("rep1", "rep2", "rep3"), times = 2))
bsseq_nano <- bsseq::read.modbam2bed(files,colData=pd,rmZeroCov = FALSE,
    strandCollapse=TRUE)

hansenlab/bsseq documentation built on Dec. 17, 2024, 1:33 a.m.