testforDEP: Test dependence between two variables
In testforDEP: Dependence Tests for Two Variables


This function computes the test statistic, p value, and confidence interval for a test of dependence, using either classic methods (Pearson, Kendall, Spearman) or modern methods (Vexler, Kallenberg, MIC, Hoeffding, and Empirical Likelihood tests).

testforDEP(x = NA, y = NA, data = NA, test, p.opt = "MC", num.MC = 10000, BS.CI = 0, rm.na = FALSE, set.seed = FALSE)

`x`	a numeric vector storing the first variable.
`y`	a numeric vector storing the second variable.
`data`	(Optional) a data frame storing the data to be tested.
`test`	a character string indicating which test to run. Must be one of {"PEARSON", "KENDALL", "SPEARMAN", "VEXLER", "TS2", "V", "MIC", "HOEFFD", "EL"}.
`p.opt`	a character string specifying whether the p value is obtained from the distribution, by Monte Carlo simulation, or from pre-stored tables. Must be "dist", "MC" or "table".
`num.MC`	a number giving the number of Monte Carlo simulations.
`BS.CI`	a number specifying alpha for the bootstrap confidence interval. When it equals 0, no confidence interval is computed.
`rm.na`	a TRUE/FALSE flag indicating whether to remove missing data (NA) from the input.
`set.seed`	a TRUE/FALSE flag indicating whether to set the seed for Monte Carlo simulation and bootstrap sampling.

Arguments "x, y" and "data" are two alternative ways to supply the input. When x or y is missing, data is used as input; supplying x, y and data together results in an error. Argument data must be a two-column numeric data frame; the order of the columns does not affect the results.

Because the modern test methods "VEXLER", "TS2", "V", "MIC", "HOEFFD" and "EL" have no continuous probability density function, p.opt = "dist" does not apply to them. For the classic methods, num.MC is ignored when p.opt is "dist". p.opt = "table" obtains the p value by interpolation from pre-stored simulated tables; the current version supports this option only for the "VEXLER", "MIC", "HOEFFD" and "EL" tests.

Because the Vexler, MIC and EL tests are more time-consuming to compute, a warning with the estimated execution time is issued when the input size is > 100. An input size <= 100 is recommended for Monte Carlo p values; for input sizes > 100, use p.opt = "table". num.MC should be an integer between 100 and 10,000 for acceptable computation times.

NA values in the input are not accepted; set rm.na = TRUE to remove them. For more details, see Pearson, Kendall, Spearman, Vexler, Kallenberg, MIC, Hoeffding, EL.
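The input rules above can be sketched as follows. This is a minimal illustration only: the data frame `df`, its column names, and the helper variable `n_complete` are made up for the example, and the call to testforDEP is guarded so that the snippet still runs when the package is not installed.

```r
# Simulated two-column numeric data with one missing value (illustrative only)
set.seed(1)
df <- data.frame(a = rnorm(50), b = rnorm(50))
df$b[10] <- NA

# Base-R check of how many complete pairs remain once NA rows are dropped
n_complete <- sum(complete.cases(df))

if (requireNamespace("testforDEP", quietly = TRUE)) {
  # Supply the data frame instead of x and y (passing all three is an error).
  # rm.na = TRUE drops the incomplete pair. p.opt = "dist" applies to the
  # classic tests only, and num.MC is ignored in that case.
  res <- testforDEP::testforDEP(data = df, test = "PEARSON",
                                p.opt = "dist", rm.na = TRUE)
  print(res)
}
```

Leaving rm.na at its default of FALSE with this input would not be accepted, since the data contain an NA.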

An S4 object of class "testforDEP_result" with slots for the test statistic (TS), the p value (p_value) and, if applicable, the confidence interval (CI).
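Because the result is an S4 object, its components are read with the `@` operator rather than `$`. The sketch below assumes the slot names TS, p_value and CI as they appear in the printed output of this page's example; the simulated vectors are illustrative, and the call is guarded in case the package is not installed.

```r
# Simulated dependent data (illustrative only)
set.seed(123)
x <- runif(50)
y <- x + rnorm(50, sd = 0.5)

if (requireNamespace("testforDEP", quietly = TRUE)) {
  res <- testforDEP::testforDEP(x, y, test = "KENDALL", p.opt = "dist")

  print(methods::slotNames(res))  # names of the slots on the result object
  print(res@TS)                   # test statistic
  print(res@p_value)              # p value
  print(res@CI)                   # empty list when BS.CI = 0 (no interval requested)
}
```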

Jeffrey C. Miecznikowski, En-shuo Hsu, Yanhua Chen, Albert Vexler

Technical report: http://sphhp.buffalo.edu/content/dam/sphhp/biostatistics/Documents/techreports/UB-Biostatistics-TR1701.pdf

set.seed(123)
x = runif(100, 0, 1)
y = runif(100, 0, 1)

testforDEP(x, y, test = "SPEARMAN", p.opt = "MC",
           num.MC = 10000, BS.CI = 0, set.seed = TRUE)


#An object of class "testforDEP_result"
#Slot "TS":
#[1] 59.54311

#Slot "p_value":
#[1] 0.6735326

#Slot "CI":
#list()