tStar: Computing t*



Description

Computes the t* U-statistic for input data pairs (x_1,y_1), (x_2,y_2), ..., (x_n,y_n) using the algorithm developed by Heller and Heller (2016) <arXiv:1605.08732>, building on the work of Weihs, Drton, and Leung (2015) <DOI:10.1007/s00180-015-0639-x>.

Usage

tStar(x, y, vStatistic = FALSE, resample = FALSE, numResamples = 500,
  sampleSize = min(length(x), 1000), method = "fastest",
  slow = FALSE)

Arguments

x

A numeric vector of x values (length >= 4).

y

A numeric vector of y values; should be the same length as x.

vStatistic

If TRUE, computes the V-statistic version of t*; otherwise computes the U-statistic version. The default is to compute the U-statistic.

resample

If TRUE, computes an approximation of t* using a subsetting approach: samples of size sampleSize are drawn from the data numResamples times, t* is computed on each subsample, and the subsample t* values are averaged. This only works when vStatistic == FALSE; in general you probably don't want to compute the V-statistic via resampling, as the size of the bias depends on sampleSize irrespective of numResamples. The default is resample = FALSE, so that t* is computed on all of the data; this may be slow for very large sample sizes. Resampling can only be used when the method argument is left at its default.

numResamples

The number of subsamples to draw when resample == TRUE; see the description of resample for details. Ignored if resample == FALSE (the default).

sampleSize

The size of each subsample when resample == TRUE; see the description of resample for details. Ignored if resample == FALSE (the default).

method

Which method to use to compute the statistic. The default is "fastest", which uses the fastest available method (currently "heller"). The options are "heller", the algorithm described in Heller and Heller (2016); "weihs", the algorithm from Weihs et al. (2015); and "naive", a naive algorithm.

slow

A deprecated option kept for backwards compatibility. If TRUE, overrides the method parameter and computes the t* statistic using a naive O(n^4) algorithm.
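The subsample-and-average procedure described for the resample argument can be sketched roughly as follows. This is only an illustration and not the package's implementation: resampleAverage and statFn are hypothetical names, and drawing each subsample without replacement is an assumption. The sketch uses tStar when TauStar is installed and otherwise falls back to Kendall's tau purely so that it still runs.

```r
# Hypothetical helper, not part of TauStar: average a statistic over
# random subsamples of the paired data.
resampleAverage <- function(x, y, statFn,
                            sampleSize = min(length(x), 1000),
                            numResamples = 500) {
  stopifnot(length(x) == length(y), sampleSize <= length(x))
  n <- length(x)
  vals <- vapply(seq_len(numResamples), function(b) {
    idx <- sample.int(n, sampleSize)  # assumption: drawn without replacement
    statFn(x[idx], y[idx])            # statistic on the paired subsample
  }, numeric(1))
  mean(vals)
}

set.seed(9492)
x <- rnorm(200)
y <- rnorm(200)

# Use tStar if TauStar is available; the Kendall fallback is a stand-in only.
statFn <- if (requireNamespace("TauStar", quietly = TRUE)) {
  function(a, b) TauStar::tStar(a, b)
} else {
  function(a, b) cor(a, b, method = "kendall")
}

tApprox <- resampleAverage(x, y, statFn, sampleSize = 30, numResamples = 100)
```

Because every subsample has the same size, the average inherits any bias that depends on sampleSize, which is consistent with the warning above about the V-statistic.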

Value

The numeric value of the t* statistic.
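To make the returned value concrete, here is a minimal base-R sketch of a naive O(n^4) computation, assuming the sign kernel a(z1, z2, z3, z4) = sign(|z1 - z2| + |z3 - z4| - |z1 - z3| - |z2 - z4|) from Bergsma and Dassios (2014). The names aKernel and naiveTStar are hypothetical and not part of TauStar. Treating the V-statistic as the average over all n^4 index tuples (repeats included) is the standard definition, but it is an assumption about what the package computes. The sketch is only practical for very small n.

```r
# Hypothetical helpers for illustration; these are not functions from TauStar.

# Sign kernel a(z_i, z_j, z_k, z_l), vectorised over index vectors i, j, k, l.
aKernel <- function(z, i, j, k, l) {
  sign(abs(z[i] - z[j]) + abs(z[k] - z[l]) -
       abs(z[i] - z[k]) - abs(z[j] - z[l]))
}

naiveTStar <- function(x, y, vStatistic = FALSE) {
  stopifnot(length(x) == length(y), length(x) >= 4)
  n <- length(x)
  # Work on ranks: the kernel depends only on the ordering of the values, and
  # integer ranks avoid floating-point noise where the sum should be exactly 0.
  rx <- rank(x, ties.method = "min")
  ry <- rank(y, ties.method = "min")
  # All n^4 index tuples (memory grows quickly; keep n small).
  idx <- expand.grid(i = seq_len(n), j = seq_len(n),
                     k = seq_len(n), l = seq_len(n))
  if (!vStatistic) {
    # U-statistic: keep only tuples of four distinct indices.
    keep <- with(idx, i != j & i != k & i != l & j != k & j != l & k != l)
    idx <- idx[keep, ]
  }
  mean(aKernel(rx, idx$i, idx$j, idx$k, idx$l) *
       aKernel(ry, idx$i, idx$j, idx$k, idx$l))
}

naiveTStar(c(1, 2, 3, 4), c(1, 2, 3, 4))    # 2/3
naiveTStar(c(1, 2, 3, 4), c(1, -1, 1, -1))  # -1/3
```

The two calls reproduce the concordant and discordant quadruple values quoted in the Examples section.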

References

Bergsma, Wicher; Dassios, Angelos. A consistent test of independence based on a sign covariance related to Kendall's tau. Bernoulli 20 (2014), no. 2, 1006–1028.

Heller, Yair and Heller, Ruth. "Computing the Bergsma Dassios sign-covariance." arXiv preprint arXiv:1605.08732 (2016).

Weihs, Luca, Mathias Drton, and Dennis Leung. "Efficient Computation of the Bergsma-Dassios Sign Covariance." arXiv preprint arXiv:1504.00964 (2015).

Examples

## Not run: 
library(TauStar)

# Compute t* for a concordant quadruple
tStar(c(1,2,3,4), c(1,2,3,4)) # == 2/3

# Compute t* for a discordant quadruple
tStar(c(1,2,3,4), c(1,-1,1,-1)) # == -1/3

# Compute t* on iid normal data
set.seed(23421)
tStar(rnorm(4000), rnorm(4000)) # near 0

# Compute t* as a V-statistic
set.seed(923)
tStar(rnorm(100), rnorm(100), vStatistic = TRUE)

# Compute an approximation of t* via resampling
set.seed(9492)
tStar(rnorm(10000), rnorm(10000), resample = TRUE, sampleSize = 30,
      numResamples = 5000)

## End(Not run)

TauStar documentation built on May 1, 2019, 9:59 p.m.