This function will replace edd.unsupervised; it has more sensible parameters.
edd(eset, distList=eddDistList, tx=c(sort,flatQQNormY)[[1]],
    refDist=c("multiSim", "theoretical")[1],
    method=c("knn", "nnet", "test")[1], nRowPerCand=100, ...)
eset – instance of Biobase
distList – list comprised of eddDist objects
tx – transformation of data and reference prior to classification
refDist – type of reference distribution system to use
method – type of classifier to use: knn is k-nearest neighbors, nnet is neural net, test is max p-value from ks.test
nRowPerCand – number of realizations for a multiSim reference system
... – parameters to classifiers
Classifies genes according to distributional shape, by comparing observed expression distributions to a collection of references, which may be simulated or evaluated theoretically.
The distList argument is important. It enumerates the catalog of distributions for classification of gene expression vectors by distributional shape. See the HOWTO-edd vignette for information on how this list is constructed and how it can be extended.
The tx argument specifies how the data are processed for comparison to the reference catalog. This is a function on a vector returning a vector, but the input and the output need not have the same length. The default value of tx is sort, which entails that the order statistics are treated as multivariate data for classification.
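As a rough illustration of what "a function on a vector returning a vector" can mean here, the sketch below applies sort to a simulated vector and contrasts it with a length-changing transformation. The data are simulated, and quartiles_tx is a hypothetical helper introduced only for this example; it is not part of the edd package.

```r
set.seed(1)
x <- rnorm(20)   # simulated stand-in for one gene's expression vector

# Default-style transformation: the order statistics of x.
ordered <- sort(x)

# Sorting discards sample order, so a shuffled copy of x yields the same features.
same_after_shuffle <- identical(ordered, sort(sample(x)))

# Hypothetical length-changing transformation: keep only three quartiles.
quartiles_tx <- function(v) quantile(v, probs = c(0.25, 0.5, 0.75), names = FALSE)
short <- quartiles_tx(x)

length(ordered)  # same length as the input
length(short)    # shorter than the input
```

Whatever transformation is chosen, it is applied to both the data and the reference, so the two remain comparable feature vectors.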
The refDist argument selects the type of reference catalog. Options are 'multiSim', for which the reference consists of nRowPerCand realizations of each catalog entry, and 'theoretical', for which the reference consists of one vector of quantiles for each catalog entry.
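The difference between the two reference systems can be sketched in base R. This is only an illustration under stated assumptions: the two-entry catalog, the sample size n, and the object names are invented for the example, and the package's eddDist objects and any internal scaling are not reproduced.

```r
set.seed(42)
n <- 30             # assumed number of samples per gene
nRowPerCand <- 100  # realizations per catalog entry, as in the default

# Hypothetical catalog: a random generator (r) and quantile function (q) per entry.
catalog <- list(
  normal  = list(r = rnorm, q = qnorm),
  uniform = list(r = runif, q = qunif)
)

# "multiSim"-style reference: nRowPerCand sorted simulated samples per entry.
sim_blocks <- lapply(catalog, function(d) t(replicate(nRowPerCand, sort(d$r(n)))))
multiSim <- do.call(rbind, unname(sim_blocks))
multiSimLabels <- rep(names(catalog), each = nRowPerCand)

# "theoretical"-style reference: one vector of quantiles per entry.
theoretical <- t(sapply(catalog, function(d) d$q(ppoints(n))))

dim(multiSim)     # one row per simulated realization
dim(theoretical)  # one row per catalog entry
```

In this sketch the simulated reference gives a classifier many noisy training rows per label, while the theoretical reference gives a single idealized row per label.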
The method argument selects the type of classifier. It would be desirable to allow this to be a function, but classifier arguments and return values are not yet structured consistently enough to permit this; see the e1071 package for some work on handling various classifiers programmatically (e.g., tune).
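To give a feel for the "test" option, described in the arguments as the max p-value from ks.test, here is a minimal base-R sketch. It assumes a small hypothetical catalog of cumulative distribution functions and simulated data; it is not the package's implementation.

```r
set.seed(7)
x <- rexp(200)   # simulated stand-in for one gene's expression values

# Hypothetical catalog: each entry is a cumulative distribution function.
candidates <- list(
  normal      = function(q) pnorm(q),
  uniform     = function(q) punif(q),
  exponential = function(q) pexp(q)
)

# One Kolmogorov-Smirnov p-value per candidate distribution.
pvals <- vapply(candidates, function(cdf) ks.test(x, cdf)$p.value, numeric(1))

# Assign the label of the candidate with the largest p-value.
best <- names(pvals)[which.max(pvals)]
best
```

The knn and nnet options instead train a classifier on the reference rows and predict a label for each transformed expression vector.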
Returns a character vector or factor, depending on the classifier.
Vince Carey <stvjc@channing.harvard.edu>
library(Biobase)
data(sample.ExpressionSet)
# should filter to genes with reasonable variation
table( edd(sample.ExpressionSet, method="nnet", size=10, decay=.2) )

library(golubEsets)
data(Golub_Merge)
# keep genes with above-median MAD and minimum expression above 300
madvec <- apply(exprs(Golub_Merge),1,mad)
minvec <- apply(exprs(Golub_Merge),1,min)
keep <- (madvec > median(madvec)) & (minvec > 300)
gmfilt <- Golub_Merge[keep,]
# split samples into ALL and AML groups
ALL <- gmfilt$ALL.AML=="ALL"
gall <- gmfilt[,ALL]
gaml <- gmfilt[,!ALL]
# classify each gene within each group and cross-tabulate the shapes
alldists <- edd(gall, method="nnet", size=10, decay=.2)
amldists <- edd(gaml, method="nnet", size=10, decay=.2)
table(alldists,amldists)
# compare simulated and theoretical reference systems on the AML samples
amldists2 <- edd(gaml, method="nnet", refDist="theoretical", size=10, decay=.2)
table(amldists,amldists2)