An internal function unifying several nonparametric tests for paired samples.
.generic.rank.test(
xs,
ys,
test,
letter,
description,
na.rm = TRUE,
collisions = TRUE,
precision = 1e-05,
limit_law_coef = 1,
min_samples = 1,
max_samples = Inf
)

xs, ys 
Same-length numeric vectors, containing paired samples. 
test 
Function computing the test statistic given a relative order. 
letter 
Notation for the test statistic, e.g., "D" for Hoeffding's D. 
description 
Full name of test. 
na.rm 
Logical: Should missing values, 
collisions 
Logical: Warn of repeating values in 
precision 
of p-value, between 0 and 1. Otherwise p-value= 
limit_law_coef 
Scaling of test statistic for standard null distribution. 
min_samples, max_samples 
Data size limits. 
The function .generic.rank.test first calls relative.ordering with xs and ys. Then it uses the given function to compute the test statistic from the resulting permutation. The statistic is rescaled by multiplication with (n-1)*limit_law_coef, where n is the sample size. Finally, it computes the p-value by calling pHoeffInd from the package TauStar.
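The steps above can be sketched in base R. This is only an illustration of the mechanics, not the package's implementation: sketch_relative_order, toy_tau, and sketch_rank_test are hypothetical names, the relative ordering is assumed to mean "sort the pairs by x and rank the y values", the toy statistic is a Kendall-type concordance measure rather than any of the package's statistics, and the p-value step via TauStar is left out.

```r
# Hypothetical stand-in for the relative ordering step (an assumption):
# sort the pairs by x and record the ranks of the corresponding y values.
sketch_relative_order <- function(xs, ys) {
  rank(ys[order(xs)], ties.method = "first")
}

# Toy statistic on a permutation: 1 for increasing order, -1 for decreasing.
# It only illustrates the `test` argument; it is not Hoeffding's D.
toy_tau <- function(perm) {
  n <- length(perm)
  greater <- outer(perm, perm, ">")
  inversions <- sum(greater[upper.tri(greater)])  # pairs i < j, perm[i] > perm[j]
  1 - 4 * inversions / (n * (n - 1))
}

# Sketch of the pipeline: order, statistic, rescaling by (n-1)*limit_law_coef.
sketch_rank_test <- function(xs, ys, test, letter, description,
                             limit_law_coef = 1) {
  stopifnot(length(xs) == length(ys))
  perm <- sketch_relative_order(xs, ys)
  n <- length(perm)
  statistic <- test(perm)
  result <- list(method = description, n = n)
  result[[paste0(letter, "n")]] <- statistic        # e.g. "Kn"
  result$scaled <- statistic * (n - 1) * limit_law_coef
  result
}

xs <- c(0.3, 1.2, 2.5, 3.1, 4.8)
ys <- c(1, 2, 3, 4, 5)
res <- sketch_rank_test(xs, ys, toy_tau, "K", "toy Kendall-type statistic")
str(res)
```

The rescaling is meaningful only for statistics whose null distribution matches the one assumed by the package; for the toy statistic it merely shows where limit_law_coef enters.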
A list, of class "indtest":
method  the test's name 
n  number of data points used 
Tn / Dn / Rn / ...  the test statistic, measure of dependence 
scaled  the test statistic rescaled for a standard null distribution 
p.value  the asymptotic p-value, by TauStar::pHoeffInd 
The null distribution of the test statistic was described by Hoeffding. The p-value is approximated by calling the function pHoeffInd from the package TauStar by Luca Weihs.
By default, the p-value's precision parameter is set to 1e-5. It seems that better precision would cost a considerable amount of time, especially for large values of the test statistic. It is therefore recommended to modify this parameter only upon need.
In case that TauStar is unavailable, or to save time in repeated use, set precision = 1 to avoid computing p-values altogether. The scaled test statistic may be used instead. Its asymptotic distribution does not depend on any parameter.
Also the raw test statistic may be used, descriptively, as a measure of dependence. Only its accuracy depends on the sample size.
This package currently assumes that the variables under consideration are non-atomic, so that ties are not expected, other than by occasional effects of numerical precision. Addressing ties rigorously is left for future versions.
The flag collisions = TRUE invokes checking for ties in xs and in ys, and produces an appropriate warning if they exist.
The current implementation breaks such ties arbitrarily, not randomly.
By the averaging nature of the test statistic, it seems that a handful of ties should not be of much concern. In case of more than a handful of ties, our current advice to the user is to break them uniformly at random beforehand.
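One way to follow this advice in base R is sketched below. It relies on the assumption that only the relative order of the values matters to the tests, so each vector can be replaced by its ranks, with ties resolved by rank(..., ties.method = "random"). The data are made up for illustration, and set.seed is used only to make the example reproducible.

```r
set.seed(42)  # only for reproducibility of the illustration

# Made-up paired samples containing ties in both coordinates.
xs <- c(1.5, 2.0, 2.0, 3.7, 3.7, 3.7, 5.1)
ys <- c(0.2, 0.9, 0.9, 0.4, 1.3, 0.4, 2.2)

# Replace values by ranks; tied values receive distinct ranks in random order.
xs_untied <- rank(xs, ties.method = "random")
ys_untied <- rank(ys, ties.method = "random")

# No ties remain, and the order of distinct values is unchanged.
anyDuplicated(xs_untied) == 0
anyDuplicated(ys_untied) == 0
```

The untied vectors can then be passed in place of xs and ys. Because the tie-breaking is random, repeated runs without a fixed seed may give slightly different test statistics.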
The test statistic is computed in almost linear time, O(n log n), given a sample of size n. Its computation involves integer arithmetic of order n^4 or n^5, which should fit into an integer data type supported by the compiler.
Most 64-bit compilers emulate 128-bit arithmetic. Otherwise we use the standard 64-bit arithmetic. Find the upper limits of your environment using
max_taustar()
max_hoeffding()
Another limitation is 2^31-1, the maximum size and value of an integer vector in a 32-bit build of R. This is only relevant for the tau star statistic in 128-bit mode, which could otherwise afford about three times that size. If your sample size falls in this range, try recompiling the function .calc.taustar according to the instructions in the cpp source file.
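For a rough feeling of how quantities of order n^4 or n^5 constrain the sample size, the following base R sketch finds the largest n with n^power at most 2^bits. It is a back-of-the-envelope estimate only: max_n is a hypothetical helper, signed 64-bit and 128-bit integers are assumed (63 and 127 value bits), and constant factors in the actual computation are ignored, so the package's real limits, as reported by max_taustar() and max_hoeffding(), will differ.

```r
# Largest n with n^power <= 2^bits. Starts from a floating-point estimate and
# corrects it by comparing exponents, so no huge integers are ever formed.
max_n <- function(bits, power) {
  n <- floor(2^(bits / power))
  while (power * log2(n + 1) <= bits) n <- n + 1
  while (power * log2(n) > bits) n <- n - 1
  n
}

# Rough bounds for the two integer widths and the two orders of growth.
bounds <- outer(c(63, 127), c(4, 5), Vectorize(max_n))
dimnames(bounds) <- list(c("64-bit signed", "128-bit signed"),
                         c("n^4", "n^5"))
print(bounds)

# The n^4 bound in 128-bit mode exceeds R's largest integer, 2^31-1.
bounds["128-bit signed", "n^4"] > .Machine$integer.max
```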
independence, relative.order, tau.star.test, hoeffding.D.test, hoeffding.refined.test