funsZTkinv: Z-Test for Cuzick and Edwards T_k^{inv} statistic
In nnspat: Nearest Neighbor Methods for Spatial Patterns

funsZTkinv

R Documentation

Description

Two functions: ZTkinv and ZTkinv.sim, each of which returns an object of class "htest" performing a z-test for the Cuzick and Edwards T_k^{inv} test statistic. See ceTkinv for a description of the T_k^{inv} test statistic.

The function ZTkinv performs a Z-test for T_k^{inv} using asymptotic normality, with the variance estimated by simulation under random labeling (RL) of cases and controls to the given points. The function ZTkinv.sim performs a test for T_k^{inv} based on Monte Carlo (MC) simulations under the RL hypothesis.
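The two approaches can be illustrated in base R. The following is only a minimal sketch and not the nnspat implementation: the helper `tkinv_sketch` is hypothetical, it assumes 0/1 labels and Euclidean distances without ties, and the Monte Carlo p-value convention (proportion of simulated values at least as large as the observed one) is an assumption.

```r
# Hypothetical helper (not part of nnspat): T_k^{inv} for 0/1 labels.
# For each case, count the other cases that are closer to it than its
# k-th nearest control, then sum these counts over all cases.
tkinv_sketch <- function(dat, k, lab) {
  D <- as.matrix(dist(dat))            # pairwise Euclidean distances
  cases <- which(lab == 1)
  ctrls <- which(lab == 0)
  stopifnot(length(ctrls) >= k)
  total <- 0
  for (i in cases) {
    d_kth <- sort(D[i, ctrls])[k]      # distance to the k-th nearest control
    others <- setdiff(cases, i)
    total <- total + sum(D[i, others] < d_kth)
  }
  total
}

set.seed(1)
n <- 20
Y <- matrix(runif(3 * n), ncol = 3)
lab <- rep(0:1, c(12, 8))              # 12 controls, 8 cases
k <- 2

obs <- tkinv_sketch(Y, k, lab)
n1 <- sum(lab == 1); n0 <- sum(lab == 0)
null_mean <- k * n1 * (n1 - 1) / (n0 + 1)   # expected value under RL

# RL resampling: permute the labels over the fixed points
# (sampling without replacement), recomputing the statistic each time.
sims <- replicate(500, tkinv_sketch(Y, k, sample(lab)))

# Z-type test: normal approximation with the simulated standard deviation
z <- (obs - null_mean) / sd(sims)
p_z <- 2 * pnorm(-abs(z))              # two-sided p-value

# Simulation-based test: position of obs among the simulated values
p_mc_greater <- mean(sims >= obs)      # "greater" alternative (assumed convention)
```

The first p-value relies on the normal approximation, while the second uses only the resampled values, which is why it does not depend on asymptotic normality.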

Asymptotic normality of T_k^{inv} has not been established yet, but it seems likely according to Cuzick and Edwards (1990). Even if asymptotic normality holds, it seems that a larger sample size would be needed before the normal approximation becomes effective. Hence, to be safe, the simulation-based test ZTkinv.sim is recommended. When ZTkinv is used, this is also highlighted with the warning "asymptotic normality of T_k^{inv} is not yet established, so, simulation-based test is recommended".

All arguments are common to both functions, except for ... and Nvar.sim, which are used in ZTkinv only, and Nsim, which is used in ZTkinv.sim only.

The argument cc.lab holds the case-control labels: 1 for case, 0 for control. If the argument case.lab is NULL, then cc.lab should be provided in this fashion; if case.lab is provided, the labels are converted to 0's and 1's accordingly. The argument Nvar.sim is the number of resamplings (without replacement) in the RL scheme used to estimate the variance of the T_k^{inv} statistic in ZTkinv, with default 1000. The argument Nsim is the number of resamplings (without replacement) in the RL scheme used to generate the T_k^{inv} values in ZTkinv.sim, with default 1000.
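The label conversion described above can be pictured with a small base R sketch. The helper `to_cc` is hypothetical and only illustrates the idea; it is not a function of nnspat and the package may perform the conversion differently.

```r
# Hypothetical helper (not an nnspat function): map arbitrary labels to 0/1.
to_cc <- function(cc.lab, case.lab = NULL) {
  if (is.null(case.lab)) {
    # without case.lab, the labels must already be 0 (control) / 1 (case)
    stopifnot(all(cc.lab %in% c(0, 1)))
    return(as.integer(as.character(cc.lab)))
  }
  # with case.lab, entries equal to case.lab become 1, all others 0;
  # this works for numeric, character, and factor labels
  as.integer(cc.lab == case.lab)
}

to_cc(c(1, 2, 2, 1), case.lab = 2)               # 0 1 1 0
to_cc(factor(c("a", "b", "a")), case.lab = "a")  # 1 0 1
to_cc(c(0, 1, 1))                                # 0 1 1
```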

Both functions might take a very long time when the data size is large or when the number of resamplings (Nvar.sim in ZTkinv, Nsim in ZTkinv.sim) is large.

See also Cuzick and Edwards (1990) and the references therein.

Usage

ZTkinv(
  dat,
  k,
  cc.lab,
  alternative = c("two.sided", "less", "greater"),
  conf.level = 0.95,
  case.lab = NULL,
  Nvar.sim = 1000,
  ...
)

ZTkinv.sim(
  dat,
  k,
  cc.lab,
  alternative = c("two.sided", "less", "greater"),
  conf.level = 0.95,
  case.lab = NULL,
  Nsim = 1000
)

Arguments

`dat`	The data set in one or higher dimensions, each row corresponds to a data point, used in both functions.
`k`	Integer specifying the number of the closest controls to subject `i`, used in both functions.
`cc.lab`	Case-control labels, 1 for case, 0 for control, used in both functions.
`alternative`	Type of the alternative hypothesis in the test, one of `"two.sided"`, `"less"` or `"greater"`, used in both functions.
`conf.level`	Level of the upper and lower confidence limits, default is `0.95`, for Cuzick and Edwards `T_k^{inv}` statistic. Used in both functions.
`case.lab`	The label used for cases in `cc.lab` (if `case.lab` is provided, then the labels are converted such that cases are 1 and controls are 0), default is `NULL`, used in both functions.
`Nvar.sim`	The number of simulations, i.e., the number of resamplings under the RL scheme to estimate the variance of `T_k^{inv}`, used in `ZTkinv` only.
`...`	Further arguments, such as `method` and `p`, passed to the `dist` function. Used in `ZTkinv` only.
`Nsim`	The number of simulations, i.e., the number of resamplings under the RL scheme to estimate the `T_k^{inv}` values, used in `ZTkinv.sim` only.

Value

A list with the elements

`statistic`	The `Z` test statistic for the Cuzick and Edwards `T_k^{inv}` test
`p.value`	The `p`-value for the hypothesis test for the corresponding alternative. In `ZTkinv` this is computed using the standard normal distribution, while in `ZTkinv.sim`, it is based on which percentile the observed `T_k^{inv}` value is among the generated `T_k^{inv}` values.
`conf.int`	Confidence interval for the Cuzick and Edwards `T_k^{inv}` value at the given confidence level `conf.level`; it depends on the type of `alternative`. In `ZTkinv` the interval is constructed with z-critical values, while in `ZTkinv.sim` it is based on percentiles of the generated sample of `T_k^{inv}` values.
`estimate`	Estimate of the parameter, i.e., the Cuzick and Edwards `T_k^{inv}` value.
`null.value`	Hypothesized null value for the Cuzick and Edwards `T_k^{inv}` value, which is `k n_1 (n_1-1)/(n_0+1)` under RL, where the number of cases is denoted as `n_1` and the number of controls as `n_0`.
`alternative`	Type of the alternative hypothesis in the test, one of `"two.sided"`, `"less"`, `"greater"`
`method`	Description of the hypothesis test
`data.name`	Name of the data set, `dat`
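
The null value listed above can be motivated by a short counting argument. The following is a sketch under the assumptions that there are no ties among the distances and that `k` is at most `n_0`; it is not taken from the package source. The symbol `R_i` is introduced here only for illustration and denotes the number of cases closer to case `i` than its `k`-th nearest control.

```latex
\begin{align*}
T_k^{inv} &= \sum_{i:\ \text{case}} R_i, \\
P(\text{a given other case precedes the $k$-th nearest control of case } i) &= \frac{k}{n_0+1}, \\
E[R_i] &= (n_1-1)\,\frac{k}{n_0+1}, \\
E\left[T_k^{inv}\right] &= n_1\,(n_1-1)\,\frac{k}{n_0+1} = \frac{k\, n_1 (n_1-1)}{n_0+1}.
\end{align*}
```

The second line holds because, under RL, the `n_0` controls split the distance ordering of the other points around case `i` into `n_0+1` gaps, and each of the other `n_1-1` cases is equally likely to fall in any gap; it precedes the `k`-th control exactly when it falls in one of the first `k` gaps. For instance, with `n_1 = 8`, `n_0 = 12` and `k = 2`, the null value is `2*8*7/13`, about 8.62.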

Author(s)

Elvan Ceyhan

References

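Cuzick J, Edwards R (1990). "Spatial Clustering for Inhomogeneous Populations (with Discussion)." Journal of the Royal Statistical Society, Series B, 52, 73-104.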

Examples

library(nnspat)

n<-10 #try also 20, 50, 100
set.seed(123)
Y<-matrix(runif(3*n),ncol=3)
cls<-sample(0:1,n,replace = TRUE)  #or try cls<-rep(0:1,c(n/2,n/2))
k<-2

ZTkinv(Y,k,cls)
#labels 1 and 2, with 2 marking the cases; left-sided alternative
ZTkinv(Y,k,cls+1,case.lab = 2,alternative="less")
#cls as a character vector (non-numeric labels)
na<-floor(n/2); nb<-n-na
fcls<-rep(c("a","b"),c(na,nb))
ZTkinv(Y,k,fcls,case.lab="a")

n<-10 #try also 20, 50, 100
set.seed(123)
Y<-matrix(runif(3*n),ncol=3)
cls<-sample(0:1,n,replace = TRUE)  #or try cls<-rep(0:1,c(n/2,n/2))
k<-2 # try also 3,5

ZTkinv.sim(Y,k,cls)
#90% confidence level; right-sided alternative
ZTkinv.sim(Y,k,cls,conf.level=0.9,alternative="greater")

#cls as a character vector (non-numeric labels)
na<-floor(n/2); nb<-n-na
fcls<-rep(c("a","b"),c(na,nb))
ZTkinv.sim(Y,k,fcls,case.lab="a")

#with k=1, for comparison with the run-length test ZTrun
ZTkinv.sim(Y,k=1,cls)
ZTrun(Y,cls)

nnspat documentation built on May 29, 2024, 10:03 a.m.