filterLowCounts: Filter low-count exons.

View source: R/filterLowCounts.R

filterLowCounts    R Documentation

Description

Filter low-count exons from RNA-seq read count data.

Usage

filterLowCounts(rs_data, filter_min_per_exon = 6, filter_min_per_sample = 3)

Arguments

rs_data

RegspliceData object.

filter_min_per_exon

Filtering parameter: minimum number of reads per exon bin, summed across all biological samples. Default is 6.

filter_min_per_sample

Filtering parameter: minimum number of reads per biological sample; i.e. for each exon bin, at least one sample must have at least this number of reads. Default is 3.

Details

Filters low-count exon bins from RNA-seq read count data.

Input data is assumed to be in the form of a RegspliceData object. See RegspliceData for details.

The arguments filter_min_per_exon and filter_min_per_sample control the amount of filtering. Exon bins that meet the filtering conditions are kept. Default values for the arguments are provided; however, these should be adjusted depending on the total number of samples and the number of samples per condition.
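The two filtering conditions can be illustrated on a toy count matrix in base R. This is a minimal sketch, not the package's implementation: the matrix values are invented, and it assumes that an exon bin is kept only when it meets both conditions.

```r
# Toy count matrix (invented values): rows are exon bins, columns are samples.
counts <- matrix(
  c(0, 1, 0, 2, 1, 0,    # bin1: total 4, largest count 2
    1, 1, 1, 1, 1, 1,    # bin2: total 6, largest count 1
    5, 0, 0, 1, 0, 0,    # bin3: total 6, largest count 5
    9, 8, 7, 9, 8, 7),   # bin4: total 48, largest count 9
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("bin", 1:4), paste0("sample", 1:6))
)

filter_min_per_exon <- 6
filter_min_per_sample <- 3

# Condition 1: reads summed across all samples reach the per-exon minimum.
keep_per_exon <- rowSums(counts) >= filter_min_per_exon

# Condition 2: at least one sample reaches the per-sample minimum.
keep_per_sample <- rowSums(counts >= filter_min_per_sample) >= 1

# Assumption: a bin is kept only if both conditions hold.
keep <- keep_per_exon & keep_per_sample
counts_filtered <- counts[keep, , drop = FALSE]
```

With the default thresholds, bin1 fails both conditions and bin2 fails the per-sample condition, so only bin3 and bin4 remain. Raising either threshold removes more bins, which is why the values should be chosen with the number of samples in mind.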

After filtering low-count exon bins, any remaining genes containing only a single exon bin are also removed (since differential splicing requires multiple exon bins).
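The removal of single-bin genes can be sketched in base R as follows. The gene labels are hypothetical, and the code only illustrates the idea; it does not reproduce how a RegspliceData object stores or updates gene information.

```r
# Hypothetical gene labels for the exon bins that survived count filtering.
gene_of_bin <- c("geneA", "geneA", "geneA", "geneB", "geneC", "geneC")

# Number of remaining exon bins per gene.
bins_per_gene <- table(gene_of_bin)

# Genes left with a single exon bin cannot show differential splicing.
single_bin_genes <- names(bins_per_gene)[as.vector(bins_per_gene) == 1]

# Keep only bins that belong to genes with two or more remaining bins.
keep <- !(gene_of_bin %in% single_bin_genes)
gene_of_bin_kept <- gene_of_bin[keep]
```

Here geneB has a single remaining bin, so that bin is dropped and only the bins of geneA and geneC are retained.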

Filtering should be skipped when using exon microarray data. (When using the regsplice wrapper function, filtering can be disabled with the argument filter = FALSE.)

Previous step: Filter zero-count exon bins with filterZeros. Next step: Calculate normalization factors with runNormalization.

Value

Returns a RegspliceData object.

See Also

filterZeros, runNormalization

Examples

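library(regsplice)

# Load the example counts file shipped with the package
file_counts <- system.file("extdata/vignette_counts.txt", package = "regsplice")
data <- read.table(file_counts, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
head(data)

# Columns 2 to 7 hold the read counts, one column per sample
counts <- data[, 2:7]
# The part of each exon ID before ":" is the gene ID; count exon bins per gene
tbl_exons <- table(sapply(strsplit(data$exon, ":"), function(s) s[[1]]))
gene_IDs <- names(tbl_exons)
n_exons <- unname(tbl_exons)
condition <- rep(c("untreated", "treated"), each = 3)

rs_data <- RegspliceData(counts, gene_IDs, n_exons, condition)

# Remove zero-count exon bins first, then low-count exon bins
rs_data <- filterZeros(rs_data)
rs_data <- filterLowCounts(rs_data)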


lmweber/regsplice documentation built on March 19, 2024, 1:45 p.m.