CSS_Norm: Cumulative sum scaling normalization of count data


Description

Normalizes count data by cumulative sum scaling (CSS), using a specified quantile (e.g., the 0.75 quantile).

Usage

CSS_Norm(e_data, edata_id, q = 0.75, qg = "median")

Arguments

e_data

a p x n data.frame of count data, where p is the number of features and n is the number of samples. Each row corresponds to data for a feature, with the first column giving the feature name.

edata_id

character string indicating the name of the feature identifier. Usually obtained by calling attr(omicsData, "cnames")$edata_cname.

q

a number between 0 and 1 giving the quantile to normalize with. Default is 0.75.

qg

the value used to scale all samples. Either a number (the reference uses 1000) or "median", which uses the median scaling value across all samples. The usage signature above shows "median" as the default.

Details

Each sample is normalized separately. Its counts are divided by the sum of all values in that sample up to and including the given quantile, then multiplied by the scaling value qg (either a fixed number such as 1000, or the median scaling value across all samples).
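The calculation described above can be sketched in a few lines of base R. This is only an illustration of the idea, not the pmartRseq source: the helper name css_sketch is hypothetical, it takes a plain numeric matrix rather than an e_data data.frame, and it assumes that zeros and missing values are excluded before the quantile is computed, which the text above does not state.

```r
# Hypothetical helper illustrating cumulative sum scaling; not the package code.
# counts: numeric matrix with features in rows and samples in columns.
css_sketch <- function(counts, q = 0.75, qg = "median") {
  # Per-sample scaling sum: add up all counts at or below the q-th quantile.
  scale_sums <- apply(counts, 2, function(x) {
    pos <- x[!is.na(x) & x > 0]  # assumption: only nonzero counts are used
    cutoff <- quantile(pos, probs = q, names = FALSE)
    sum(pos[pos <= cutoff])
  })

  # Common scaling value: a fixed number, or the median of the per-sample sums.
  target <- if (identical(qg, "median")) median(scale_sums) else qg

  # Divide each column by its own sum, then rescale every sample by the target.
  sweep(counts, 2, scale_sums, "/") * target
}

# Toy data: the second sample is the first sequenced twice as deeply.
toy <- matrix(c(1, 2, 3, 4,
                2, 4, 6, 8), nrow = 4)
toy_norm <- css_sketch(toy, q = 0.75, qg = "median")
```

In this toy case the two columns of toy_norm come out identical, since the samples differ only in sequencing depth. Passing qg = 1000 instead rescales every sample to the fixed value used in the reference.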

Value

List containing 3 elements: norm_data is a data.frame with the same structure as e_data that contains the normalized data, location_param is NULL, and scale_param is a numeric vector containing, for every sample, the value of the sample at the designated quantile (q) divided by the value of the global quantile (q).

Author(s)

Allison Thompson, Lisa Bramer

References

Paulson, Joseph N., O. Colin Stine, Hector Corrada Bravo, and Mihai Pop. "Differential abundance analysis for microbial marker-gene surveys." Nature Methods 10.12 (2013).

Examples

## Not run: 
library(mintJansson)
data(rRNA_data)
rRNA_CSS <- CSS_Norm(e_data = rRNA_data$e_data, edata_id = attr(rRNA_data, "cnames")$edata_cname)
# scale_param is an element of the returned list (see Value)
norm_factors <- rRNA_CSS$scale_param

## End(Not run)

pmartR/pmartRseq documentation built on May 25, 2019, 9:20 a.m.