rankProductDiffExpress: Rank Product across multiple Samples

View source: R/rankProduct.R

rankProductDiffExpress    R Documentation

Rank Product across multiple Samples

Description

Use the Rank Product method to find differentially expressed genes, or to prioritize the order of a set of genes.

Usage

rankProduct(rankM, nSimulations = 500)

rankProductDiffExpress(fnames, groupSet, targetGroup = groupSet[1], 
		geneColumn = "GENE_ID", intensityColumn = "INTENSITY", 
		productColumn = "PRODUCT", offset = 0, keepIntergenics = FALSE, 
		average.FUN = logmean, poolSet = rep(1, length(fnames)), 
		nSimulations = 500, missingGenes = c("drop", "fill"))

Arguments

rankM

numeric matrix of gene ranks, with GeneIDs as the rownames and SampleIDs as the column names

fnames

character vector of full pathnames to existing transcriptome files

groupSet

character vector of GroupIDs or conditions, to categorize the transcriptomes

targetGroup

the single GroupID that all other groups are compared against. This is the group being tested for up-regulation.

geneColumn

column name of the column of GeneIDs

intensityColumn

column name of the column of intensity values

productColumn

column name of the column that has gene product descriptions

offset

a linear offset added to all intensity values, to prevent division by zero and extreme fold change ratios

keepIntergenics

logical; explicitly keep the non-genes (intergenics), or drop them from consideration

average.FUN

the averaging function for combining gene intensities within subset groups, gene ranks, and gene RP values

poolSet

numeric vector of sample pools or tiers, used to restrict 2-sample DE tests to samples from comparable tiers. See Details.

nSimulations

number of simulations of random permutations of the data, for calculating false positive rates.

missingGenes

method for dealing with genes that are not present in every transcript file. Either drop entire gene rows, or fill in with minimum observed intensity.
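
As an illustration of why offset matters, the following minimal sketch computes a log2 fold change with and without a linear offset. The helper log2Fold and the toy intensities are assumptions made for this example; they are not part of the package, and the package's internal fold change code may differ.

```r
# Hypothetical helper (not part of DuffyTools): log2 fold change
# after adding a linear offset to both intensities.
log2Fold <- function(x, y, offset = 0) {
  log2((x + offset) / (y + offset))
}

# Toy intensities for three genes in a target sample and a comparison sample.
target <- c(geneA = 0, geneB = 8, geneC = 200)
other  <- c(geneA = 4, geneB = 0, geneC = 100)

# Without an offset, zero intensities give infinite fold changes.
noOffset <- log2Fold(target, other)

# With an offset of 1, every fold change is finite and low-intensity
# genes no longer dominate the extremes.
withOffset <- log2Fold(target, other, offset = 1)
```

Larger offsets shrink the fold changes of weakly expressed genes more strongly, so the value is a trade-off between stability and sensitivity.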

Details

This function implements the Rank Product algorithm of Breitling et al. By performing all possible 2-sample DE comparisons and ranking genes by fold change, it calculates a family of rank positions for each gene. Turning those ranks into probabilities of differential expression, the algorithm assigns each gene a measure called Rank Product (RP), which reflects how likely it is that a gene would rank that high across that many DE comparisons by chance.
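
The ranking step described above can be sketched in base R. This is an illustration only: it uses the geometric mean of ranks, one common formulation of the rank product, and whether calcRP applies exactly this normalization is an assumption. The toy matrix foldM is invented for the example.

```r
# Toy log2 fold changes: rows are genes, columns are 2-sample DE comparisons.
foldM <- matrix(c( 3.1,  2.8,  3.5,
                   0.2, -0.1,  0.4,
                  -2.0, -1.7, -2.2,
                   1.0,  1.4,  0.9),
                nrow = 4, byrow = TRUE,
                dimnames = list(paste0("gene", 1:4), paste0("cmp", 1:3)))

# Rank within each comparison so the most up-regulated gene gets rank 1.
rankM <- apply(-foldM, 2, rank)

# Rank product as the geometric mean of each gene's ranks, computed on
# the log scale to stay numerically stable with many comparisons.
rp <- exp(rowMeans(log(rankM)))

# Genes ordered from most consistently up-regulated to least.
rpOrder <- names(sort(rp))
```

A gene that ranks near the top in every comparison gets a small RP value, while a gene with inconsistent ranks is pulled toward the middle.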

For the simple case of rankProduct, given a matrix of ranks, the algorithm just measures RP and estimates the false positive rates.
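
A minimal sketch of how false positive rates can be estimated by permutation, in the spirit of the Rank Product method, is shown below. The function rpWithFalsePositives is hypothetical and is not the package's implementation; the column names simply mirror those listed under Value.

```r
# Hypothetical sketch, not the DuffyTools code.
rpWithFalsePositives <- function(rankM, nSimulations = 500) {
  geoMean <- function(m) exp(rowMeans(log(m)))
  rp <- geoMean(rankM)

  # For each gene, count simulated RP values at least as small as the observed one.
  nBetter <- numeric(nrow(rankM))
  for (i in seq_len(nSimulations)) {
    # Shuffle the ranks independently within every column.
    simRP <- sort(geoMean(apply(rankM, 2, sample)))
    nBetter <- nBetter + findInterval(rp, simRP)
  }

  # Expected number of genes with an RP value this good by chance.
  eValue <- nBetter / nSimulations
  # Share of the genes called at this RP cutoff that are expected to be false.
  fpRate <- pmin(1, eValue / rank(rp, ties.method = "max"))

  data.frame(RP_VALUE = rp, E_VALUE = eValue, FP_RATE = fpRate,
             row.names = rownames(rankM))
}

# Toy input: 50 genes, 4 columns of random ranks, with gene1 forced to rank 1.
set.seed(42)
rankM <- sapply(1:4, function(j) sample(50))
for (j in seq_len(ncol(rankM))) {
  k <- which(rankM[, j] == 1)
  rankM[k, j] <- rankM[1, j]
  rankM[1, j] <- 1
}
rownames(rankM) <- paste0("gene", 1:50)

res <- rpWithFalsePositives(rankM, nSimulations = 100)
```

The result keeps the row order of the input matrix. More simulations give smoother estimates at the cost of run time.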

By default, each sample is compared to all other samples that are not from its group. If more restrictions are warranted, the poolSet argument can be used to assign a pool or tier to each sample, so that only samples from the same pool but from different groups go forward into the 2-sample tests.
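
The effect of poolSet on which sample pairs are tested can be sketched as follows. The helper pairsToCompare is hypothetical and not part of the package; it assumes that each pair combines one sample from targetGroup with one sample from any other group.

```r
# Hypothetical helper (not part of DuffyTools): list the sample pairs
# that would go forward into the 2-sample tests.
pairsToCompare <- function(groupSet, targetGroup = groupSet[1],
                           poolSet = rep(1, length(groupSet))) {
  isTarget <- groupSet == targetGroup
  pairs <- expand.grid(target = which(isTarget), other = which(!isTarget))
  # Keep only pairs whose two samples share a pool.
  pairs[poolSet[pairs$target] == poolSet[pairs$other], , drop = FALSE]
}

groupSet <- c("treated", "treated", "control", "control")

# Default single pool: every treated sample meets every control sample.
allPairs <- pairsToCompare(groupSet)

# Two pools: samples 1 and 3 share pool 1, samples 2 and 4 share pool 2.
pooledPairs <- pairsToCompare(groupSet, poolSet = c(1, 2, 1, 2))
```

With the default pool there are four pairs; with the two pools only the pairs (1, 3) and (2, 4) remain.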

Value

For rankProduct, a data frame with RP values, average ranks, and false positive rates for each gene is returned, in the same row order as the input matrix.

For rankProductDiffExpress, a data frame of consensus gene differential expression, sorted by RP value, with columns:

GENE_ID

the genes, sorted from most up-regulated to most down-regulated

PRODUCT

the gene product descriptions

LOG_2_FOLD

the average log2 fold change for each gene

RP_VALUE

the Rank Product value. See calcRP

AVG_RANK

the average rank position over all 2-sample DE tests, for each gene

AVG_SET1

the average gene intensity over all samples in group targetGroup

AVG_SET2

the average gene intensity over all samples in the other groups

E_VALUE

the expected number of genes to have an RP value this good, by chance

FP_RATE

the rate of false positive DE genes, given this RP value

Note

Typically, this function is called once for each group, to get all possible DE comparisons between the various groups. The function explicitly measures up-regulation, but reversing the row order of the result gives the answer for down-regulation.
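
Reversing the row order needs only base R. The data frame below is a toy stand-in for a rankProductDiffExpress result, with invented values and only two of the documented columns.

```r
# Toy stand-in for a result sorted from most up-regulated to most down-regulated.
upResult <- data.frame(GENE_ID = c("g1", "g2", "g3"),
                       LOG_2_FOLD = c(2.5, 0.1, -3.0),
                       stringsAsFactors = FALSE)

# Reverse the rows so the most down-regulated gene comes first.
downResult <- upResult[rev(seq_len(nrow(upResult))), ]
rownames(downResult) <- NULL
```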

Author(s)

Bob Morrison

References

Rainer Breitling et al. FEBS Letters 573 (2004)


robertdouglasmorrison/DuffyTools documentation built on May 6, 2024, 8:26 p.m.