bam2count: bam2count

View source: R/MODULE_2_PREP.R

bam2count    R Documentation

bam2count

Description

Function to generate a read counts table from bam files

Usage

bam2count(bamfolder, annotation)

Arguments

bamfolder

Path to the folder containing bam files (one bam file per sample is expected)

annotation

Annotation data table produced by read_annotation listing transcript names and lengths of their 5'UTR, CDS and 3'UTR segments. It has five columns: transcript, l_tr, l_utr5, l_cds and l_utr3. Transcript names and segment lengths must correspond to the reference sequences to which the reads were mapped.

Details

This function is designed for bam files generated by mapping reads to a transcriptome (not a genome). For each sample, it counts the number of reads mapping to each reference sequence (the 'seqname' bam field), which in a transcriptome alignment is a transcript rather than a chromosome. The annotation must be provided so that the output lists every transcript, including those with zero counts. Low-count transcripts can be filtered out later using the min_count_filter function.

If CELP bias correction of RPF counts is not desired, RNA and RPF bam files can be placed in the same folder and imported together using this function. If CELP correction is desired, RNA and RPF bam files should be placed in separate folders: RNA bam files should be imported using this function, and RPF bam files should be imported using the bamtolist_rW function, following the CELP workflow.
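The role of the annotation in keeping zero-count transcripts can be illustrated with a minimal base R sketch. This is not the bam2count implementation: the annotation values, the sample names, the helper count_one_sample and the character vectors standing in for each sample's 'seqname' field are all hypothetical, and no bam file is read.

```r
# Hypothetical annotation with the five columns described under Arguments
annotation <- data.frame(
  transcript = c("tx1", "tx2", "tx3"),
  l_tr   = c(1000L, 1500L, 800L),
  l_utr5 = c(100L, 200L, 50L),
  l_cds  = c(600L, 900L, 600L),
  l_utr3 = c(300L, 400L, 150L),
  stringsAsFactors = FALSE
)

# Stand-in for the 'seqname' field of the reads in each sample (one entry per read)
reads_by_sample <- list(
  sampleA = c("tx1", "tx1", "tx3"),
  sampleB = c("tx3", "tx3", "tx3", "tx1")
)

# Hypothetical helper: count reads per transcript, using the annotation's
# transcript list as factor levels so that unobserved transcripts get a count of 0
count_one_sample <- function(seqnames, transcripts) {
  as.integer(table(factor(seqnames, levels = transcripts)))
}

# First column holds transcript IDs; one count column follows per sample
counts <- data.frame(
  transcript = annotation$transcript,
  lapply(reads_by_sample, count_one_sample, transcripts = annotation$transcript),
  stringsAsFactors = FALSE
)

print(counts)
```

In this sketch tx2 has no reads in either sample but still appears in the table with a count of 0, because the row set comes from the annotation rather than from the observed reads.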

Value

A data frame in which the first column contains transcript IDs and each remaining column contains the read counts for one imported sample.

Examples

rna_count_LMCN <- bam2count(bamfolder = "./Data/Bam/RNA", annotation = annotation_human_cDNA)
rpf_count_LMCN <- bam2count(bamfolder = "./Data/Bam/RPF", annotation = annotation_human_cDNA)

goodarzilab/Ribolog documentation built on Oct. 7, 2022, 10:14 p.m.