gather_counts: Gather counts from 'fpkm' or 'raw' object

Description

gather_counts takes an object of class fpkm or raw and returns a new object of class fpkm_counts or raw_counts, which inherits from fpkm or raw respectively. The returned object is the input object with an extra counts column added. The counts column is a list of data.tables.

Usage

gather_counts(x, ...)

## Default S3 method:
gather_counts(x, ...)

## S3 method for class 'fpkm'
gather_counts(x, by = c("gene-id", "gene-name",
  "transcript-id"), threshold = 0.1, log_base = FALSE, verbose = FALSE,
  ...)

## S3 method for class 'raw'
gather_counts(x, by = c("gene-id", "gene-name"),
  threshold = 1L, log_base = FALSE, verbose = FALSE, ...)

Arguments

x

An object of class fpkm or raw. See ?rnaseq.

...

Additional arguments to be passed to or from other methods.

by

Level at which reads should be aggregated (if necessary). There are three possible values:

"gene-id" (default): The column "gene_id" must be available in the raw or fpkm counts file. For fpkm, values from all transcripts corresponding to each gene id are summed together.

"gene-name": The column gene_short_name must be available in the raw or fpkm counts file. In some cases, it might be necessary to use gene name instead of id. For fpkm, values from all transcripts corresponding to each gene name are summed together instead.

"transcript-id": Only valid for fpkm counts. The values are retained as such, as they already contain the expression associated with each transcript. The columns tracking_id and gene_id must be available in the fpkm counts file.
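The aggregation described for by can be sketched in base R as follows. This is only an illustration and not the package's implementation: the table, the sample column names (sample_a, sample_b) and the values are made up, and base R's rowsum stands in for whatever gather_counts does internally.

```r
# Toy transcript-level FPKM table. tracking_id and gene_id follow the column
# names mentioned above; the sample columns and values are hypothetical.
fpkm <- data.frame(
  tracking_id = c("tx1", "tx2", "tx3", "tx4"),
  gene_id     = c("g1", "g1", "g2", "g2"),
  sample_a    = c(1.0, 2.0, 0.5, 0.0),
  sample_b    = c(3.0, 1.0, 0.0, 4.0)
)
sample_cols <- c("sample_a", "sample_b")

# by = "gene-id": sum the values of all transcripts that share a gene id.
# rowsum() returns one row per group, with the group labels as row names.
gene_level <- rowsum(fpkm[, sample_cols], group = fpkm$gene_id)
print(gene_level)

# by = "transcript-id": values are retained as such, one row per transcript.
transcript_level <- fpkm[, c("tracking_id", "gene_id", sample_cols)]
print(transcript_level)
```

Aggregating by "gene-name" would follow the same pattern with gene_short_name as the grouping column.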

threshold

In case of fpkm objects, only features whose row means are >= threshold are retained. In case of raw objects, only features whose rpkm values are >= threshold are retained. The default value is 0.1 for fpkm objects and 1 for raw objects.

Note that threshold is applied on fpkm or raw counts, not their log transformed values.
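A minimal sketch of the fpkm case, assuming a plain numeric matrix with made-up values: the filter is computed on the untransformed values, and any log transformation happens afterwards. The rpkm-based filter for raw objects is not shown because its exact computation is not described here.

```r
# Toy gene-level FPKM matrix (hypothetical values, genes in rows).
fpkm <- matrix(
  c(0.05, 0.10,   # g1: row mean 0.075, below the threshold
    0.10, 0.20,   # g2: row mean 0.15, retained
    5.00, 7.00),  # g3: row mean 6, retained
  ncol = 2, byrow = TRUE,
  dimnames = list(c("g1", "g2", "g3"), c("sample_a", "sample_b"))
)

threshold <- 0.1

# The threshold is applied to the untransformed values ...
keep     <- rowMeans(fpkm) >= threshold
filtered <- fpkm[keep, , drop = FALSE]

# ... and only the retained features are log transformed afterwards.
log_filtered <- log(filtered, base = 2)
print(rownames(filtered))
```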

log_base

Value to pass to the base argument of log. If FALSE (default), the values are not log transformed. TRUE defaults to base=2.

In case of raw_counts, it is recommended that the raw counts are not log transformed. Some plotting functions (e.g., spectral maps) might work better on log transformed values. The argument is exposed for those cases.

In case of fpkm_counts, the recommendation is to log transform the values before using limma_dge.
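One way to read the three kinds of log_base values is sketched below. The helper apply_log_base is hypothetical and not part of the package; it only mirrors the rules stated above (FALSE leaves values untouched, TRUE means base 2, any other value is passed to the base argument of log). How the package treats zero values is not stated, so the example uses strictly positive inputs.

```r
# Hypothetical helper illustrating how a log_base argument could be handled.
apply_log_base <- function(values, log_base = FALSE) {
  if (identical(log_base, FALSE)) {
    return(values)                         # FALSE: no log transformation
  }
  base <- if (isTRUE(log_base)) 2 else log_base  # TRUE: defaults to base 2
  log(values, base = base)
}

x <- c(1, 2, 8)
untransformed <- apply_log_base(x)         # values returned unchanged
log2_default  <- apply_log_base(x, TRUE)   # same as log(x, base = 2)
log2_explicit <- apply_log_base(x, 2L)     # explicit base, as in the Examples
log10_values  <- apply_log_base(x, 10)     # any other base is passed through
```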

verbose

Logical. Default is FALSE. If TRUE, sends useful status messages to the console.

Details

gather_counts is an S3 generic with methods implemented for both fpkm and raw objects.
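The dispatch pattern can be illustrated with a toy generic. The name toy_gather and the method bodies are assumptions made for illustration; in particular, what the package's default method does is not documented here, so the default below simply raises an error. The sketch only shows how one generic can route fpkm and raw objects to different methods and return an object whose class inherits from the input class.

```r
# Toy S3 generic (hypothetical name, not the package's gather_counts).
toy_gather <- function(x, ...) UseMethod("toy_gather")

# Assumed behaviour for unsupported classes: fail with a clear message.
toy_gather.default <- function(x, ...) {
  stop("no toy_gather method for class '", class(x)[1L], "'")
}

# Method for 'fpkm': prepend the new class so the result still inherits 'fpkm'.
toy_gather.fpkm <- function(x, ...) {
  class(x) <- c("fpkm_counts", class(x))
  x
}

# Method for 'raw': same idea with 'raw_counts'.
toy_gather.raw <- function(x, ...) {
  class(x) <- c("raw_counts", class(x))
  x
}

fpkm_out <- toy_gather(structure(list(), class = "fpkm"))
raw_out  <- toy_gather(structure(list(), class = "raw"))
class(fpkm_out)            # "fpkm_counts" "fpkm"
inherits(raw_out, "raw")   # TRUE
```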

In case of fpkm objects, the fpkm values are assumed to be generated by cufflinks. Since such output contains fpkm values for all expressed isoforms, the argument by sets the level at which differential expression is computed. See Arguments for possible values of by.

In case of raw objects, the most common analysis is differential gene expression. Transcript-level read counts are not possible (or make very little sense) with raw counts. See the get_counts function from the gcount package for more.

Value

A new object of class fpkm_counts or raw_counts corresponding to fpkm or raw objects respectively.

See Also

rnaseq, limma_dge, edger_dge, as.eset, show_counts, construct_design, construct_contrasts, write_dge, volcano_plot, density_plot

Examples

path = system.file("tests", package="ganalyse")

# ----- fpkm ----- # 
fpkm_path = file.path(path, "fpkm", "annotation.txt")
fpkm_obj = rnaseq(fpkm_path, format="fpkm", experiment="sample")
(fpkm_counts = gather_counts(fpkm_obj, by="gene-id", log_base=2L))
class(fpkm_counts)

# ----- raw ----- # 
raw_path = file.path(path, "raw", "annotation.txt")
raw_obj = rnaseq(raw_path, format="raw", experiment="sample")
(raw_counts = gather_counts(raw_obj, by="gene-id", threshold=1L))
class(raw_counts)

asrinivasan-oa/ganalyse documentation built on May 12, 2019, 5:38 a.m.