get_counts: Obtain read counts from 'bam' object.

Description

get_counts obtains read counts quickly and easily from RNA-Seq data, for use in downstream analyses.

Usage

get_counts(reads, annotation, transcript_id = "transcript_id",
  gene_id = "gene_id", mismatches = -1L, minoverlap = 1L,
  feature = c("gene_exon", "gene", "exon", "intron"), type = c("default",
  "union", "disjoin", "intersect", "longest", "shortest", "overlap"),
  library = c("unstranded", "firststrand", "secondstrand"), paired = FALSE,
  multiple_feature_overlaps = FALSE, verbose = FALSE)

Arguments

reads

A bam/bed object (see gread::read_format) or complete path to a bam/bed file.

annotation

A gtf/gff object (see gread::read_format) or complete path to a gtf/gff file.

transcript_id

Column name in annotation corresponding to the transcript id. Default value is "transcript_id".

gene_id

Column name in annotation corresponding to the gene id. Default value is "gene_id".

mismatches

Default is -1L, which ignores mismatches. If > 0, only reads with at most that many mismatches are retained.

minoverlap

Argument that's passed to GenomicRanges::findOverlaps. Default is 1L, i.e., select all overlapping reads.

feature

How to count reads?

"gene_exon" counts only exonic reads within genes.

"gene" counts any/all reads overlapping the gene.

"exon" returns read counts for each exon separately.

"intron" returns read counts for each intron separately.

See type argument for more advanced operations on extracting feature coordinates.

type

Same as gread::extract. See ?extract in gread.

library

Either "unstranded" (default), "firststrand" or "secondstrand".

paired

logical. Default is FALSE. If the library is paired-end, set it to TRUE.

multiple_feature_overlaps

logical. Should reads that overlap multiple features be counted? Default is FALSE, i.e., such reads are discarded. If TRUE, reads spanning overlapping features are counted.

verbose

logical. Default is FALSE. If TRUE, provides helpful messages to the console.
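
The rule documented for the mismatches argument can be illustrated with a small base R sketch. This is not the package's implementation: the toy read table, the column name nm, and the helper filter_by_mismatches are all hypothetical, introduced only to show the documented behaviour (-1L keeps every read, a positive value keeps reads with at most that many mismatches). The page does not say what a value of 0 does, so the sketch leaves reads unfiltered for any non-positive value.

```r
# Toy read table; the column `nm` (mismatches per read) is a hypothetical name.
reads <- data.frame(
  read_id = c("r1", "r2", "r3", "r4"),
  nm      = c(0L, 1L, 2L, 5L),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring the documented rule for `mismatches`.
# Only the two documented cases are modelled: -1L (ignore) and > 0 (filter).
filter_by_mismatches <- function(reads, mismatches = -1L) {
  if (mismatches > 0L) {
    # Keep reads whose mismatch count does not exceed the threshold.
    reads[reads$nm <= mismatches, , drop = FALSE]
  } else {
    # Non-positive values leave the reads unfiltered in this sketch.
    reads
  }
}

kept_all <- filter_by_mismatches(reads)                   # default: no filtering
kept_two <- filter_by_mismatches(reads, mismatches = 2L)  # drops the read with 5 mismatches
```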

Value

A data.table with calculated raw counts of overlapping reads for each feature.

Examples

path <- system.file("tests", package="gcount")
gtf_file <- file.path(path, "sample.gtf")
bam_file <- file.path(path, "sample.bam")
bam_counts <- get_counts(bam_file, gtf_file, feature="gene_exon", 
             type="union", paired=FALSE, library="unstranded", 
             verbose=TRUE)
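
A further sketch, using only arguments listed in the Usage section, requests per-exon counts and keeps reads that overlap more than one feature. It assumes the gcount package is installed and that the same sample files are available; the call is guarded so the snippet does nothing otherwise. The result has not been verified here, so no expected output is shown.

```r
# Runs only when gcount is installed; otherwise the block is skipped.
if (requireNamespace("gcount", quietly = TRUE)) {
  path     <- system.file("tests", package = "gcount")
  gtf_file <- file.path(path, "sample.gtf")
  bam_file <- file.path(path, "sample.bam")

  # Count reads per exon and retain reads overlapping several features.
  exon_counts <- gcount::get_counts(
    bam_file, gtf_file,
    feature = "exon",
    library = "unstranded",
    paired  = FALSE,
    multiple_feature_overlaps = TRUE
  )
}
```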

asrinivasan-oa/gcount documentation built on May 12, 2019, 5:37 a.m.