fasano.franceschini.test  R Documentation 
Performs a two-sample multidimensional Kolmogorov-Smirnov test as described by Fasano and Franceschini (1987). This test evaluates the null hypothesis that two i.i.d. random samples were drawn from the same underlying probability distribution. The data can be of any dimension, and can be of any type (continuous, discrete, or mixed).
fasano.franceschini.test( S1, S2, nPermute = 100, threads = 1, seed = NULL, p.conf.level = 0.95, verbose = TRUE, method = c("r", "b") )
S1 

S2 

nPermute 
A nonnegative 
threads 
A positive 
seed 
An optional integer to seed the PRNG used for the permutation test. A seed must be passed to reproducibly compute p-values. 
p.conf.level 
Confidence level for the confidence interval of the permutation test p-value. 
verbose 
A 
method 
An optional 
The test statistic can be computed using two different methods. Both methods return identical results, but have different time complexities:
Range tree method: This method has a time complexity of O(N*log(N)^(d-1)), where N is the size of the larger sample and d is the dimension of the data.
Brute force method: This method has a time complexity of O(N^2).
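To make the O(N^2) cost of the brute force approach concrete, here is a minimal base R sketch of the difference statistics D1 and D2. It is an illustration only, not the package's implementation: the helper names (`orthant_fractions`, `max_diff`, `ff_brute`) are hypothetical, the convention of dropping points tied with the origin in any coordinate is an assumption, and no scaling is applied to combine D1 and D2 into the final statistic D.

```r
# Illustrative sketch only; helper names and tie handling are assumptions.

# Fraction of the rows of X falling in each of the 2^d open orthants
# defined by `origin`. Rows tied with the origin in any coordinate are
# assigned to no orthant (assumed convention).
orthant_fractions <- function(origin, X) {
  d <- ncol(X)
  greater <- t(t(X) > origin)   # n x d logical: coordinate above the origin?
  tied <- t(t(X) == origin)     # n x d logical: coordinate equal to the origin?
  keep <- rowSums(tied) == 0
  # Encode each orthant as an integer in 0 .. 2^d - 1
  codes <- as.vector((greater[keep, , drop = FALSE] * 1) %*% 2^(seq_len(d) - 1))
  tabulate(codes + 1, nbins = 2^d) / nrow(X)
}

# Largest orthant-fraction difference between A and B over all origins.
# Every origin is compared against every point, hence quadratic cost.
max_diff <- function(origins, A, B) {
  max(vapply(seq_len(nrow(origins)), function(i) {
    o <- origins[i, ]
    max(abs(orthant_fractions(o, A) - orthant_fractions(o, B)))
  }, numeric(1)))
}

ff_brute <- function(S1, S2) {
  S1 <- as.matrix(S1)
  S2 <- as.matrix(S2)
  c(D1 = max_diff(S1, S1, S2),   # origins taken from the first sample
    D2 = max_diff(S2, S1, S2))   # origins taken from the second sample
}

set.seed(0)
A <- matrix(rnorm(40), ncol = 2)
B <- matrix(rnorm(80, mean = 1), ncol = 2)
ff_brute(A, B)
```

Because each of the N origins is checked against all N points, the running time grows quadratically in the sample size but only mildly with the dimension, which matches the trade-off described for this method.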
The range tree method tends to be faster for low dimensional data or large sample sizes, while the brute force method tends to be faster for high dimensional data or small sample sizes. When method is not passed, the sample sizes and dimension of the data are used to infer which method will likely be faster. However, as the geometry of the samples can greatly influence computation time, the method inferred to be faster may not actually be faster. To perform more comprehensive benchmarking for a specific dataset, nPermute can be set equal to 0, which bypasses the permutation test and only computes the test statistic.
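A benchmark along these lines could look like the following sketch. It uses only the arguments listed in the usage above (nPermute, method, verbose); the sample sizes and distributions are arbitrary choices for illustration, and the timing code runs only if the package is installed.

```r
# Arbitrary example data; replace with the dataset of interest.
set.seed(0)
S1 <- data.frame(x = rnorm(200), y = rnorm(200))
S2 <- data.frame(x = rnorm(300), y = rnorm(300))

if (requireNamespace("fasano.franceschini.test", quietly = TRUE)) {
  library(fasano.franceschini.test)

  # nPermute = 0 skips the permutation test, so only the cost of
  # computing the test statistic is timed.
  t_range <- system.time(
    fasano.franceschini.test(S1, S2, nPermute = 0, method = "r", verbose = FALSE)
  )
  t_brute <- system.time(
    fasano.franceschini.test(S1, S2, nPermute = 0, method = "b", verbose = FALSE)
  )

  print(rbind(range_tree = t_range, brute_force = t_brute))
}
```

A single call can be too fast to time reliably on small samples, so repeating each call several times and comparing the totals may give a steadier picture.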
The p-value for the test is computed empirically using a permutation test. As it is almost always infeasible to compute the exact permutation test p-value, a Monte Carlo approximation is made instead. This estimate is a binomially distributed random variable, and thus a confidence interval can be computed. The confidence interval is obtained using the procedure given in Clopper and Pearson (1934).
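The general idea of a Monte Carlo permutation p-value with a Clopper-Pearson interval can be sketched in base R as follows. This is a generic illustration under stated assumptions, not the package's code: the statistic `stat` is a simple stand-in (distance between column means) rather than the Fasano-Franceschini statistic, the variable names are hypothetical, and the package's exact p-value estimator may differ from the plain proportion used here. `binom.test` from the stats package returns the Clopper-Pearson interval.

```r
set.seed(1)
S1 <- matrix(rnorm(40), ncol = 2)
S2 <- matrix(rnorm(60, mean = 0.5), ncol = 2)

# Stand-in statistic (assumption): Euclidean distance between column means.
stat <- function(a, b) sqrt(sum((colMeans(a) - colMeans(b))^2))

pooled <- rbind(S1, S2)
n1 <- nrow(S1)
observed <- stat(S1, S2)

n_permute <- 200
exceed <- 0
for (i in seq_len(n_permute)) {
  # Shuffle the pooled rows and split them into two groups of the original sizes
  idx <- sample(nrow(pooled))
  perm_stat <- stat(pooled[idx[1:n1], , drop = FALSE],
                    pooled[idx[-(1:n1)], , drop = FALSE])
  if (perm_stat >= observed) exceed <- exceed + 1
}

# Monte Carlo estimate of the permutation p-value (plain proportion)
p_hat <- exceed / n_permute

# Clopper-Pearson confidence interval for the underlying binomial proportion
ci <- binom.test(exceed, n_permute, conf.level = 0.95)$conf.int

p_hat
ci
```

Increasing the number of permutations narrows the interval around the estimate, at the cost of proportionally more computation.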
A list with class htest containing the following components:
statistic 
The value of the test statistic D. 
estimate 
The values of the difference statistics D1 and D2. 
p.value 
The permutation test p-value. 
conf.int 
A binomial confidence interval for the p-value. 
method 
A character string indicating what type of test was performed. 
data.name 
A character string giving the names of the data. 
Fasano, G. & Franceschini, A. (1987). A multidimensional version of the Kolmogorov-Smirnov test. Monthly Notices of the Royal Astronomical Society, 225:155-170. doi: 10.1093/mnras/225.1.155.
Clopper, C. J. & Pearson, E. S. (1934). The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika, 26, 404–413. doi: 10.2307/2331986.
set.seed(0)

# create 2-D samples
S1 <- data.frame(x = rnorm(n = 20, mean = 0, sd = 1),
                 y = rnorm(n = 20, mean = 1, sd = 2))
S2 <- data.frame(x = rnorm(n = 40, mean = 0, sd = 1),
                 y = rnorm(n = 40, mean = 1, sd = 2))

# perform test
fasano.franceschini.test(S1, S2)

# perform test with more permutations
fasano.franceschini.test(S1, S2, nPermute = 150)

# set seed for reproducible p-value
fasano.franceschini.test(S1, S2, seed = 0)$p.value
fasano.franceschini.test(S1, S2, seed = 0)$p.value

# change confidence level for p-value confidence interval
fasano.franceschini.test(S1, S2, p.conf.level = 0.99)

# perform test using range tree method
fasano.franceschini.test(S1, S2, method = 'r')

# perform test using brute force method
fasano.franceschini.test(S1, S2, method = 'b')

# perform test using multiple threads to speed up p-value computation
## Not run: 
fasano.franceschini.test(S1, S2, threads = 2)
## End(Not run)