Transform Affymetrix data so that unique genes with multiple probes are represented by a single expression value on each array.

Description

In Affymetrix gene expression data, a single gene is often represented by multiple probe sets. Such genes then have a greater influence on the analysis, particularly if the gene is differentially expressed. To overcome this problem, the median is taken across all probe sets that represent the same gene.
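The collapsing step described above can be sketched in base R. This is only an illustration on toy data, not the package's own implementation: the helper name `collapse_median`, the probe names, and the gene labels are all hypothetical, and it assumes one identifier per row of the expression matrix, in row order.

```r
# Toy expression matrix: 5 probe sets (rows) by 3 samples (columns).
x <- matrix(c( 1,  2,  3,
               3,  4,  5,
               5,  9,  7,
              10, 20, 30,
              20, 40, 50),
            nrow = 5, byrow = TRUE,
            dimnames = list(paste0("probe", 1:5), paste0("s", 1:3)))

# Hypothetical gene identifiers, one per probe set, in row order.
ids <- c("geneA", "geneA", "geneA", "geneB", "geneB")

# Hypothetical helper: for each unique gene, take the per-sample
# median over all probe sets mapped to that gene.
collapse_median <- function(x, ids) {
  genes <- unique(ids)
  rows <- lapply(genes, function(g) {
    apply(x[ids == g, , drop = FALSE], 2, median)
  })
  out <- do.call(rbind, rows)
  rownames(out) <- genes
  out
}

newx <- collapse_median(x, ids)
print(newx)
```

After this step each gene contributes exactly one row, so a gene with many probe sets no longer carries more weight than a gene with one.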

Usage

aveProbe(x, imat = NULL, ids)

Arguments

x

A matrix with no missing values; each row represents a gene and each column represents a sample.

imat

A matrix indicating presence or absence of genes in the gene sets. The indicator matrix contains rows representing gene identifiers of genes present in the expression data and columns representing group (gene set) names.

ids

A vector of identifiers (e.g., UniGene or LocusLink identifiers) representing unique genes which match to the probe ids in the expression data.
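To make the expected shapes of these arguments concrete, here is a small sketch that builds toy inputs by hand. The 0/1 encoding of the indicator matrix, the probe names, and the gene set names are assumptions chosen for illustration; in practice `imat` would normally come from a helper such as `getImat`, as in the Examples section.

```r
# Hypothetical probe ids; these would be the row names of x.
probes <- paste0("probe", 1:5)

# Hypothetical gene sets, each listing the probe ids it contains.
sets <- list(setA = c("probe1", "probe2", "probe4"),
             setB = c("probe3", "probe5"))

# Indicator matrix (assumed 0/1 coding): rows are probe ids,
# columns are gene sets, 1 marks membership.
imat <- sapply(sets, function(s) as.integer(probes %in% s))
rownames(imat) <- probes

# One gene identifier per probe id, in the same order as the rows.
ids <- c("geneA", "geneA", "geneA", "geneB", "geneB")

# Basic consistency check before collapsing probes to genes.
stopifnot(length(ids) == nrow(imat))
print(imat)
```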

Value

newx

A data matrix with rows representing the input identifiers and columns representing samples.

newimat

A new imat (indicator matrix) with rows representing the unique gene identifiers and columns representing gene sets.

Author(s)

Sarah Song and Mik Black

See Also

pcot2, corplot, corplot2

Examples

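library(pcot2)      # provides aveProbe() and getImat()
library(multtest)   # provides the golub data set
library(hu6800.db)  # annotation for the Hu6800 array

# Load the Golub data; label rows by probe id and columns by class
data(golub)
rownames(golub) <- golub.gnames[,3]
colnames(golub) <- golub.cl

# Build the indicator matrix from KEGG pathways (minimum set size 10)
KEGG.list <- as.list(hu6800PATH)
imat <- getImat(golub, KEGG.list, ms=10)
colnames(imat) <- paste("KEGG", colnames(imat), sep="")

# Map each probe id in the data to a gene symbol
pathlist <- as.list(hu6800PATH)
pathlist <- pathlist[match(rownames(golub), names(pathlist))]
ids <- unlist(mget(names(pathlist), envir=hu6800SYMBOL))

#### transform data matrix only ####
newdat <- aveProbe(x=golub, ids=ids)$newx

#### transform both data and imat ####
output <- aveProbe(x=golub, imat=imat, ids=ids)
newdat <- output$newx
newimat <- output$newimat
# Keep only gene sets that still contain at least 10 unique genes
newimat <- newimat[,apply(newimat, 2, sum)>=10]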