Statistical tests for intergenic distance data
GeneSetDistances | A
Universe | Character string indicating which set should be considered as the universe (or control set).
MedianResample | Logical. Should the resampling test of the median be performed? Defaults to TRUE.
R | Integer giving the number of resamplings to perform for the resampling test. Defaults to 1e4.
A tibble with the following columns:
GeneSet. Name of the gene set.
Orientation. Orientation of the neighbor (same or opposite strand).
Side. Upstream or Downstream.
KS.pvalue. p-value of the Kolmogorov-Smirnov test.
Wilcox.pvalue. p-value of the Wilcoxon rank sum test (or Mann-Whitney U test).
Independ.pvalue. p-value of the independence test.
If MedianResample is TRUE, the tibble will also contain this additional column:
Resample.pvalue. p-value from the resampling test.
The following tests are possible:
Kolmogorov-Smirnov test. See ks.test. Although not adapted to integer values, it gives conservative p-values for large enough gene sets.
Wilcoxon rank sum test (or Mann-Whitney U test). See wilcox.test.
Independence test. See independence_test in the coin package.
Resample. A test based on random resampling of the universe distances.
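To make the tests listed above more concrete, here is a minimal sketch in base R on simulated data. It is not the package's implementation: the simulated distances, the helper name resample_median_pvalue, and the resampling scheme (drawing without replacement from the universe and comparing medians two-sidedly) are all assumptions made for illustration.

```r
## Hypothetical data: integer intergenic distances (in bp) for a universe
## and for a gene set simulated to lie closer to its neighbors.
set.seed(42)
universe_dist <- rgeom(5000, prob = 1 / 2000)
geneset_dist  <- rgeom(100,  prob = 1 / 500)

## Distribution-level comparisons using the stats package.
## Integer distances contain ties, so ks.test() may warn that the
## p-value is approximate; the warning is silenced here.
ks_p     <- suppressWarnings(ks.test(geneset_dist, universe_dist)$p.value)
wilcox_p <- wilcox.test(geneset_dist, universe_dist)$p.value

## Resampling test on the median (assumed scheme, for illustration only):
## draw R random sets of the same size from the universe and count how
## often their median deviates from the universe median at least as much
## as the observed gene set median does.
resample_median_pvalue <- function(set_dist, universe_dist, R = 1000) {
  observed  <- median(set_dist)
  center    <- median(universe_dist)
  n         <- length(set_dist)
  resampled <- replicate(R, median(sample(universe_dist, n)))
  extreme   <- sum(abs(resampled - center) >= abs(observed - center))
  ## Add 1 to numerator and denominator so the p-value is never exactly 0
  (extreme + 1) / (R + 1)
}

resample_p <- resample_median_pvalue(geneset_dist, universe_dist, R = 1000)
```

With this scheme the smallest attainable p-value is 1 / (R + 1), which is one reason to prefer a large number of resamplings when small p-values matter.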
Pascal GP Martin
ks.test, wilcox.test, independence_test
## Obtain gene neighborhood information:
GeneNeighbors <- getGeneNeighborhood(Genegr)
## Get a (random) set of (100) genes:
set.seed(123)
randGenes <- sample(names(Genegr), 100)
## Create a set enriched for close upstream genes:
GenePool <- GeneNeighbors[!is.na(GeneNeighbors$UpstreamDistance),]
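## Sampling weights that decrease linearly with the upstream distance:
Proba <- (max(GenePool$UpstreamDistance) - GenePool$UpstreamDistance) /
    sum(max(GenePool$UpstreamDistance) - GenePool$UpstreamDistance)
## Inverse-distance weights (this assignment overrides the one above):
Proba <- (1 / (GenePool$UpstreamDistance + 1)) / sum(1 / (GenePool$UpstreamDistance + 1))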
CloseUpstream <- sample(GenePool$GeneName, size = 100, prob = Proba)
## Extract distances for this set of genes and for all genes:
myGeneSets <- list("RandomGenes" = randGenes,
                   "CloseUpstream" = CloseUpstream,
                   "AllGenes" = GeneNeighbors$GeneName)
distForGeneSets <- dist2Neighbors(GeneNeighbors,
                                  myGeneSets)
## Compare distances for gene sets to a control set (here "AllGenes")
## (using only 1K resamplings for speed here; prefer R >= 1e4 in practice)
distTests(distForGeneSets,
          Universe = "AllGenes",
          MedianResample = TRUE,
          R = 1e3)