perm.test: Permutation Test for Conditional Independence

perm.test    R Documentation

Description

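Tests conditional independence with a permutation approach: a machine learning model is fitted to the data, and the observed test statistic is compared against a null distribution built from nperm permutations.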

Usage

perm.test(
  formula,
  data,
  p = 0.7,
  nperm = 600,
  subsample = 1,
  metric = "RMSE",
  method = "rf",
  nrounds = 120,
  parametric = FALSE,
  poly = TRUE,
  interaction = TRUE,
  degree = 3,
  tail = NA,
  metricfunc = NULL,
  mlfunc = NULL,
  nthread = 1,
  dag = NA,
  dag_n = NA,
  num_class = NULL,
  progress = TRUE,
  ...
)

Arguments

formula

Model formula or DAGitty object specifying the relationship between dependent and independent variables.

data

A data frame containing the variables specified in the formula.

p

Proportion of data to use for training the model. Default is 0.7.

nperm

Number of permutations to perform. Default is 600.

subsample

The proportion of the data to be used. Default is 1 (no subsampling).

metric

Type of metric: "RMSE", "Kappa" or "Custom". Default is "RMSE".

method

The machine learning method to use. Supported methods include "rf", "xgboost", etc. Default is "rf".

nrounds

Number of rounds (trees) for methods such as xgboost and random forest. Default is 120.

parametric

Logical. If TRUE, a parametric p-value is calculated in addition to the empirical p-value. Default is FALSE.

poly

Logical. If TRUE, polynomial terms of the conditional variables are included in the model. Default is TRUE.

interaction

Logical. If TRUE, interaction terms of the conditional variables are included in the model. Default is TRUE.

degree

The degree of polynomial terms to include if poly is TRUE. Default is 3.

tail

Specifies whether the test is one-tailed ("left" or "right") or two-tailed. Default is NA.

metricfunc

An optional custom function to calculate the performance metric based on the model's predictions. Default is NULL.

mlfunc

An optional custom machine learning function to use instead of the predefined methods. Default is NULL.

nthread

Integer. The number of threads to use for parallel processing. Default is 1.

dag

A DAGitty object specifying the directed acyclic graph for the variables. Default is NA.

dag_n

A character string specifying the name of the node in the DAGitty object to be used for conditional independence testing. Default is NA.

num_class

Integer. The number of classes for categorical data (used in xgboost). Default is NULL.

progress

Logical. If TRUE, a progress bar is displayed during the permutation process. Default is TRUE.

...

Additional arguments to pass to the machine learning model fitting function.

Value

An object of class 'CCI' containing the null distribution, observed test statistic, p-values, the machine learning model used, and the data.
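The relationship between the null distribution, the observed test statistic, and the two kinds of p-value can be illustrated with a minimal base R sketch. This is not the package's internal code: the simulated null distribution, the left-tailed direction, the +1 correction, and the normal approximation used for the parametric p-value are all assumptions made for illustration.

```r
# Illustrative only: hand-rolled p-values, not CCI internals.
set.seed(1)

# Stand-in for the metric values obtained under permutation (assumed shape)
null_dist <- rnorm(500, mean = 1, sd = 0.1)

# Stand-in for the metric computed on the unpermuted data
observed <- 0.75

# Empirical left-tailed p-value: share of null values at or below the
# observed value, with the common +1 correction (an assumption here)
p_empirical <- (sum(null_dist <= observed) + 1) / (length(null_dist) + 1)

# Parametric alternative: normal approximation to the null distribution
# (an assumption; the package may use a different approximation)
p_parametric <- pnorm(observed, mean = mean(null_dist), sd = sd(null_dist))

print(c(empirical = p_empirical, parametric = p_parametric))
```

With an error metric such as RMSE, a left-tailed comparison asks whether the observed error is unusually low relative to the permuted fits; a right-tailed test would use `>=` and `lower.tail = FALSE` instead.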

See Also

print.CCI, summary.CCI, plot.CCI, QQplot

Examples

set.seed(123)
dat <- data.frame(
  x1 = rnorm(100),
  x2 = rnorm(100),
  x3 = rnorm(100),
  x4 = rnorm(100),
  y = rnorm(100)
)
# Test y against x1, conditioning on x2, x3 and x4
perm.test(y ~ x1 | x2 + x3 + x4, data = dat, nperm = 25)
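A further call using a few of the optional arguments from the Usage section might look like the sketch below. It only uses argument names documented on this page (parametric, progress) and the print method listed under See Also; the guard around the call and the variable name res are illustrative choices, and the call is skipped when the CCI package is not installed.

```r
set.seed(123)
dat <- data.frame(
  x1 = rnorm(100),
  x2 = rnorm(100),
  x3 = rnorm(100),
  y = rnorm(100)
)

# Run only when the CCI package is available (guard added for illustration)
if (requireNamespace("CCI", quietly = TRUE)) {
  res <- CCI::perm.test(
    y ~ x1 | x2 + x3,
    data = dat,
    nperm = 25,         # small value to keep the example fast
    parametric = TRUE,  # also request the parametric p-value
    progress = FALSE    # suppress the progress bar
  )
  print(res)            # dispatches to the print method for class 'CCI'
}
```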

CCI documentation built on Aug. 29, 2025, 5:17 p.m.