View source: R/aveytoolkit_collapseDataset.R
Collapses a dataset from probes to gene symbols.
exprsVals | a matrix or data.frame of numeric values with rownames denoting the identifiers.
platform | the microarray platform the data comes from, used to extract the gene symbols.
mapVector | a uniquely named character vector whose names specify the current identifiers (probes matching the rownames of exprsVals) and whose values specify the gene symbols (or other identifier to collapse to).
oper | the operation used to choose a probe when multiple probes map to the same gene. Default is max, which will calculate the maximum of the average.
prefer | one of "none", "up", or "down"; can be abbreviated.
singleProbeset | If
returnProbes | if
deProbes | a list with named vectors "up" and "down" giving the names of up- and downregulated probes.
debug | When TRUE, information is printed to help debug errors.
This function is designed to work for microarray data but can work for any numeric matrix in which multiple rows need to be collapsed. The aggregate function would probably work better and be faster; this code is the slow, brute-force way to do it.
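As a rough illustration of the aggregate approach mentioned here, the sketch below collapses a toy matrix with base R only. It is not the package's implementation: the data, the `genes` vector, and the `collapsed` name are made up for the example, and taking the maximum within each sample corresponds to the per-sample behaviour described for singleProbeset = FALSE, not to choosing a single probe per gene.

```r
# Hypothetical toy data: 4 probes x 2 samples (matrix() fills column-wise)
exprs <- matrix(c(1, 5, 2, 8,    # sample_A values for probe_1..probe_4
                  4, 3, 9, 1),   # sample_B values for probe_1..probe_4
                nrow = 4,
                dimnames = list(paste0("probe_", 1:4),
                                c("sample_A", "sample_B")))

# Hypothetical probe-to-gene map, in the same shape as mapVector
genes <- c(probe_1 = "Gene_A", probe_2 = "Gene_A",
           probe_3 = "Gene_B", probe_4 = "Gene_B")

# Group rows by gene and take the maximum within each sample
agg <- aggregate(as.data.frame(exprs),
                 by = list(gene = unname(genes[rownames(exprs)])),
                 FUN = max)

# Convert back to a matrix with gene symbols as rownames
collapsed <- as.matrix(agg[, -1])
rownames(collapsed) <- as.character(agg$gene)
collapsed
```

Each row of `collapsed` may mix values from different probes, so this is only a drop-in replacement when a composite per-gene profile is acceptable.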
If singleProbeset is set to FALSE (the default, kept for compatibility reasons but untested and not recommended), the value for each sample is the maximum across all probes that map to that gene. This means that a gene's expression values may be a composite of values from different probes rather than from a single probe. Most users will not need the 'prefer' argument. If prefer is "up", the upregulated probe is chosen when multiple deProbes match the same gene. Similarly for "down". The default is "none", in which case the probe selected by 'oper' (default max) is chosen.
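The difference between the two modes can be sketched in base R as follows. This is a minimal illustration of the behaviour described above, not the function's actual code: the toy data and every variable name are hypothetical, and it assumes the default oper (max) is applied to each probe's mean across samples.

```r
# Hypothetical toy data: 4 probes x 2 samples (matrix() fills column-wise)
exprs <- matrix(c(1, 5, 2, 8,
                  4, 3, 9, 1),
                nrow = 4,
                dimnames = list(paste0("probe_", 1:4),
                                c("sample_A", "sample_B")))
genes <- c(probe_1 = "Gene_A", probe_2 = "Gene_A",
           probe_3 = "Gene_B", probe_4 = "Gene_B")

# Probes grouped by the gene they map to
probesByGene <- split(names(genes), genes)

# Single-probe style: keep the one probe per gene with the largest mean
probeScore <- rowMeans(exprs)
chosen <- vapply(probesByGene,
                 function(p) p[which.max(probeScore[p])],
                 character(1))
single <- exprs[chosen, , drop = FALSE]
rownames(single) <- names(chosen)

# Composite style: per-sample maximum across all probes of a gene
composite <- t(vapply(probesByGene,
                      function(p) apply(exprs[p, , drop = FALSE], 2, max),
                      numeric(ncol(exprs))))
colnames(composite) <- colnames(exprs)

single     # every row comes from exactly one probe
composite  # a row may combine values from several probes
```

In this toy case Gene_A's row in `single` comes entirely from probe_2, while its row in `composite` takes sample_A from probe_2 and sample_B from probe_1.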
Note that it is possible for multiple probes to have the same value of the operation (oper) over all conditions; in this case, the first one is chosen arbitrarily.
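A first-one-wins tie-break of this kind is what base R's which.max gives. Whether collapseDataset uses which.max internally is an assumption; the scores below are made up purely to show the behaviour.

```r
# Hypothetical per-probe scores with a tie between probe_1 and probe_2
score <- c(probe_1 = 7, probe_2 = 7, probe_3 = 3)

# which.max returns the index of the FIRST maximum, so probe_1 is picked
winner <- names(which.max(score))
winner
```

If the order of the rows in exprsVals changes, a tie may therefore resolve to a different probe.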
If returnProbes is TRUE, a list is returned containing the collapsed dataset in $exprsVals and the probes chosen in $probeSets. Otherwise, if returnProbes is FALSE, only the expression matrix is returned.
Christopher Bolen, Modified by Stefan Avey
## Trivial Example showing basic functionality
fakeExpr <- matrix(rnorm(50, mean=8, sd=1), ncol=5, nrow=10,
                   dimnames=list(probes=paste("probe", 1:10, sep='_'),
                                 samples=paste("sample", LETTERS[1:5], sep='_')))
mv <- rep(paste("Gene", LETTERS[1:5], sep='_'), each=2) # mapVector
names(mv) <- rownames(fakeExpr)
res <- collapseDataset(fakeExpr, mapVector=mv, oper=max,
                       singleProbeset=TRUE, # recommend setting singleProbeset to TRUE
                       returnProbes=TRUE)
res$probes
## between probe_1 and probe_2, probe_2 was chosen for Gene_A
## between probe_3 and probe_4, probe_4 was chosen for Gene_B
## etc.
res$exprsVals # collapsed expression values
## only difference is in rownames, numbers are identical
all.equal(res$exprsVals, fakeExpr[res$probes,])