probeSummarization: Gene-level expression summarization


View source: R/probeSummarization.R

Description

Summarize a probe-set-level expression matrix into a gene-level expression matrix.

Usage

probeSummarization(ge, map, method="corr", threshold=0.5, gene.colname="Gene.Symbol", verbose=TRUE)
summarizeGenes(ge, map, sumfun=median)

Arguments

ge

Probe-set-level gene expression matrix with each row as a probe set and each column as a sample.

map

Path to gene symbol annotation file with rownames as probe set IDs. Must contain a column with gene symbols.

method

Method used to evaluate the association between probe sets of the same gene: corr for Pearson correlation coefficients, mi for mutual information.

threshold

Threshold below which the probe set will not be used for summarization. Default = 0.5

gene.colname

The column name in the map file that contains the gene symbols.

verbose

If TRUE, show the summarization process.

sumfun

Summarization function used for simple summarization.

Details

When running the attractor finding program, it is important to summarize probe-set-level expression into gene-level expression. This prevents genes with multiple probe sets on the microarray from dominating the ranking in the attractors. probeSummarization achieves this by taking the mean of the probe sets of the same gene while discarding any probe set with a significantly different expression pattern. These 'bad probe sets' are identified by calculating the association between each probe set and the sum of the probe sets of that gene. If the association is less than the threshold, the probe set is discarded. The remaining probe sets are summarized by their mean values.
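
The procedure described above can be sketched in a few lines of R. This is only an illustration of the idea for method="corr", not the package's implementation: the function name summarize_by_correlation, its symbols argument (one gene symbol per row, instead of a map file), and the fallback used when no probe set passes the threshold are all assumptions made for this example.

```r
# Hypothetical helper for illustration only; not part of the cafr package.
summarize_by_correlation <- function(ge, symbols, threshold = 0.5) {
  # ge:      numeric matrix, probe sets in rows, samples in columns
  # symbols: gene symbol for each row of ge
  groups <- split(seq_len(nrow(ge)), symbols)
  out <- vapply(groups, function(idx) {
    block <- ge[idx, , drop = FALSE]
    # A gene with a single probe set needs no filtering
    if (length(idx) == 1) return(block[1, ])
    # Association of each probe set with the sum of the gene's probe sets
    total <- colSums(block)
    assoc <- apply(block, 1, function(p) cor(p, total))
    keep <- !is.na(assoc) & assoc >= threshold
    # Assumption: if no probe set passes, fall back to using all of them
    if (!any(keep)) keep[] <- TRUE
    # Summarize the remaining probe sets by their mean
    colMeans(block[keep, , drop = FALSE])
  }, numeric(ncol(ge)))
  out <- t(out)
  colnames(out) <- colnames(ge)
  out
}

# Toy data: gene A has three probe sets, one of them anti-correlated
ge <- rbind(p1 = c(1, 2, 3, 4),
            p2 = c(2, 4, 6, 8),
            p3 = c(4, 3, 2, 1),   # anti-correlated with p1 and p2
            p4 = c(5, 5, 6, 7))
colnames(ge) <- paste0("s", 1:4)
res <- summarize_by_correlation(ge, symbols = c("A", "A", "A", "B"))
print(res)
```

In this toy input, probe set p3 is discarded for gene A, so the gene-level row for A is the mean of p1 and p2 only.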

summarizeGenes simply summarizes the probe sets of each gene by applying the user-defined function sumfun. The two functions are expected to be integrated in a future release.
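
For comparison, the simple summarization can be sketched as follows. Again this is a minimal illustration under assumptions, not the package code: summarize_simple and its symbols argument are hypothetical names, and sumfun is assumed to take a numeric vector and return a single number.

```r
# Hypothetical helper for illustration only; not part of the cafr package.
summarize_simple <- function(ge, symbols, sumfun = median) {
  # ge:      numeric matrix, probe sets in rows, samples in columns
  # symbols: gene symbol for each row of ge
  groups <- split(seq_len(nrow(ge)), symbols)
  out <- vapply(groups, function(idx) {
    # Apply sumfun to each sample (column) across the gene's probe sets
    apply(ge[idx, , drop = FALSE], 2, sumfun)
  }, numeric(ncol(ge)))
  out <- t(out)
  colnames(out) <- colnames(ge)
  out
}

# Toy data: gene A has three probe sets, gene B has one
ge2 <- rbind(p1 = c(1, 2, 3),
             p2 = c(3, 4, 5),
             p3 = c(100, 0, 7),
             p4 = c(9, 8, 7))
colnames(ge2) <- paste0("s", 1:3)
res_median <- summarize_simple(ge2, symbols = c("A", "A", "A", "B"))
res_mean   <- summarize_simple(ge2, symbols = c("A", "A", "A", "B"), sumfun = mean)
print(res_median)
```

Unlike the correlation-based approach, no probe set is discarded here, so a robust choice such as median limits the influence of an outlying probe set.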

Value

A gene-level expression matrix with genes in the rows and samples in the columns.

Note

The two functions are expected to be integrated in a future release, after which summarizeGenes will be obsolete.

Author(s)

Wei-Yi Cheng

References

Wei-Yi Cheng, Tai-Hsien Ou Yang and Dimitris Anastassiou, Biomolecular events in cancer revealed by attractor metagenes, PLoS Computational Biology, Vol. 9, Issue 2, February 2013.

See Also

findAttractor, createMetageneSpace

Examples

# Load the toy version of the Wang et al. breast cancer dataset (GSE2034)
data(brca.pbs)

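# Download the HGU133A 2.0 annotations
# (BiocManager replaces the retired biocLite.R installer script)
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("hgu133a2.db")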
library(hgu133a2.db)

# Create a map object in the expected format: probe set IDs as rownames, one gene symbol column
x <- hgu133a2SYMBOL
map <- cbind(unlist(as.list(x[mappedkeys(x)])))
colnames(map) <- "Gene.Symbol"

# Summarize into gene-level expression after eliminating uncorrelated probe sets
brca <- probeSummarization(brca.pbs, map)

# Summarize into gene-level expression using the median (the default sumfun)
brca <- summarizeGenes(brca.pbs, map)

weiyi-bitw/cafr documentation built on May 4, 2019, 4:18 a.m.