In kangpingy/rank.test: What the Package Does (One Line, Title Case)

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>")

library(rank.tests)
library(bench)
library(graphics)

Discription

The package "rank.tests" contain two functions, one of which is to mimic rank() function in the base package, which would be useful in the following rank test. Rank test, specifically the Wilcoxon signed-rank test and the Mann–Whitney U test are nonparametric tests of the null hypothesis that samples from the two groups are different from each other or if one group of the sample median is different from a specific number (could be assigned, by default is zero). This test do not need to assume any distribution of the interested groups which would be quite easy to use especially in small sample tests. In large samples, the rank tests statistics is approximately follows normal distribution.

My_rank(x) example

x <- c(4,2,6,4,9,1,4,6)
my_rank(x)
rank(x)

As we could see from above, my_rank function perfectly mimic rank function in base package. For numbers are the same, the rank of these numbers are all the same at their average rank.

Mann_Whitney_U(x,y=NULL,median_test = 0, paired = F)

Another one is "Mann_Whitney_U". It can carry out both Wilcoxon signed-rank test (for one numeric vector or two comparing numeric vectors) and Mann–Whitney U test (for two different unpaired samples).

Here is one example of Mann-Whitney U test with small samples

  x <- c(1:5)
  y <- rnorm(5)*2.5+2.5
  Mann_Whitney_U(x,y)
  wilcox.test(x,y,correct = F)  ##R function carry out the test in 'stats' package

Another example is Wilcoxon signed rank test for one sample to test if the sample median (not the mean) is significantly different from zero:

  x <- rnorm(200)+0.01
  Mann_Whitney_U(x)
  wilcox.test(x,correct = F)

It could also test whether there is a difference between two paired samples (i.e. A group of patients' blood pressure before and after treatment) by using 'paried = T' argument(Note: the sample size of X and Y should be equal in this case):

  x <- rchisq(108,2)
  y <- rchisq(108,2)+0.001
  Mann_Whitney_U(x,y,paired = T)
  wilcox.test(x,y,correct = F,paired = T)

You can also change the compared median to number other than 0, for both wilcox signed rank test and paired two sample rank test using 'median_test =' argument: (By default, median_test = 0 if not specified)

  x <- rchisq(108,2)
  y <- rchisq(108,2)+0.01
  Mann_Whitney_U(x,median_test = 0.01)
  wilcox.test(x,correct = F, mu = 0.01)
  Mann_Whitney_U(x,y,paired = T,median_test = 0.01)
  wilcox.test(x,y,correct = F,paired = T, mu = 0.01)

The result is obvious that x sample median is not 0.01, and x and y sample's difference in median would approximately be 0.01 (p>0.05 in the second result which have the same report value as the wilcox.test() in stat package).

Efficientcy when compared with wilcox.test() in stats package

We may compare the running speed between my function and that one already have:

  x <- rchisq(100,3)
  y <- rchisq(100,2)+0.01
  mark(Mann_Whitney_U(x,y)[[2]], wilcox.test(x,y,correct = F)[[3]])
  ##a <- mark(Mann_Whitney_U(x,y)[[2]], wilcox.test(x,y,correct = F)[[3]])
  ##plot(a) ## Compare both correctness and speed when cacluting test statistics p value. 
  ## The above two line R code 'plot()' could only run in R studio, is having trouble in building packages in travis. So I preserve the code with '##'

As we could see from the brench::mark() result, my function implemented in my package is about twice as fast as the wilcox.test in stats package when total sample size is at hunderds sample size level.

kangpingy/rank.test documentation built on Dec. 7, 2019, 12:19 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

Tweet to @rdrrHQ

GitHub issue tracker

ian@mutexlabs.com