Description Usage Arguments Details Value Note Author(s) References See Also Examples
This function implements the Easy Ensemble, together with the Mann-Witney-Wilcox test, to detect the genes associated with few samples (minority set) being a subset of a larger collection of samples (majority set).
1 | eeMWW(ddata, minoritySet, runs = 1000)
|
ddata |
a matrix where the samples are by rows and the features are in the columns. |
minoritySet |
a character vector of the minority set matching some row names of ddata. |
runs |
number of resampling. |
The EasyEnsemble (EE) resampling scheme is an Undersampling technique aimed to compare few samples (minority set), carrying some phenotype, to a larger collection of samples (majority set) unrelated with the phenotype. We implement the EE with the Mann-Whitney-Wilcoxon test (MWW) to compare the minority set, of dimension m, with a randomly selected collection of 2*m samples from the majority set.
a named vector of real values.
We suggest running the function in a parallel setup.
Stefano M. Pagnotta
Xu-Ying Liu, Jianxin Wu, and Zhi-Hua Zhou - Exploratory Undersampling for Class-Imbalance Learning - IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS?PART B: CYBERNETICS, VOL. 39, NO. 2, APRIL 2009
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 | require(yaGST)
nr <- 100; nc <- 1000
# generate a data-matrix with nr samples, and nc features
exprData <- matrix(rpois(nc * nr, 100), nrow = nr, ncol = nc)
colnames(exprData) <- paste0("feat", 1:nc)
rownames(exprData) <- paste0("sam", 1:nr)
# increase the first 3 samples (minority set) of 10% of the original intensity
# of the first 30 features (later the gene-set)
exprData[1, 1:30] <- exprData[1, 1:30]* runif(30, min = 1, max = 1.10)
exprData[2, 1:30] <- exprData[1, 1:30]* runif(30, min = 1, max = 1.10)
exprData[3, 1:30] <- exprData[1, 1:30]* runif(30, min = 1, max = 1.10)
samples_of_interest <- rownames(exprData)[1:3] # minority set
# running in parallel
library(doParallel)
# adjust the number of CPUs as needed
cl <- makePSOCKcluster(3)
clusterApply(cl, floor(runif(length(cl), max = 10000000)), set.seed)
registerDoParallel(cl)
ans_eeMWW <- eeMWW(exprData, samples_of_interest)
stopCluster(cl)
# set the gene-set and run the enrichment analysis
geneSet <- colnames(exprData)[1:30]
(tmp <- mwwGST(ans_eeMWW, geneSet))
plot(tmp, rankedList = ans_eeMWW)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.