test.gen | R Documentation |
This function generates the test statistic or, through permutation, a null distribution for conditional independence testing. It supports several machine learning methods, including random forests and extreme gradient boosting, and accepts custom metric functions and custom model-fitting functions.
test.gen(
formula,
data,
method = "rf",
metric,
nperm = 60,
subsample = 1,
p = 0.8,
poly = TRUE,
interaction = TRUE,
degree = 3,
nrounds = 600,
nthread = 1,
permutation = FALSE,
metricfunc = NULL,
mlfunc = NULL,
num_class = NULL,
progress = TRUE,
...
)
formula |
Formula specifying the relationship between dependent and independent variables. |
data |
Data frame. The data containing the variables used. |
method |
Character. The modeling method to be used. Options are "xgboost" for gradient boosting, "rf" for random forests, or "svm" for support vector machines. Default is "rf". |
metric |
Character. The type of metric: "RMSE", "Kappa", or "Custom". Default is "RMSE". |
nperm |
Integer. The number of generated Monte Carlo samples. Default is 60. |
subsample |
Numeric. The proportion of the data to be used for subsampling. Default is 1 (no subsampling). |
p |
Numeric. The proportion of the data to be used for training. The remaining data will be used for testing. Default is 0.8. |
poly |
Logical. Whether to include polynomial terms of the conditioning variables. Default is TRUE. |
interaction |
Logical. Whether to include interaction terms of the conditioning variables. Default is TRUE. |
degree |
Integer. The degree of polynomial terms to be included if poly is TRUE. Default is 3. |
nrounds |
Integer. The number of rounds (trees) for methods like xgboost, ranger, and lightgbm. Default is 600. |
nthread |
Integer. The number of threads to use for parallel processing. Default is 1. |
permutation |
Logical. Whether to perform permutation to generate a null distribution. Default is FALSE. |
metricfunc |
Function. A custom metric function provided by the user. The function must take arguments: |
mlfunc |
Function. A custom machine learning function provided by the user. The function must have the arguments: |
num_class |
Integer. The number of classes for categorical data (used in xgboost and lightgbm). Default is NULL. |
progress |
Logical. Whether to show a progress bar during the permutation process. Default is TRUE. |
... |
Additional arguments to pass to the machine learning wrapper functions.
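To make the poly, interaction and degree arguments more concrete, the following base R sketch shows one way polynomial and interaction terms of conditioning variables can be added to a model formula. It is an illustration only: the data frame `d` and the formula `f` are hypothetical, and this page does not say how test.gen builds these terms internally (for example, whether it uses raw or orthogonal polynomials).

```r
# Illustration only; not the package's internal code.
set.seed(1)
d <- data.frame(x2 = rnorm(20), x3 = rnorm(20), y = rnorm(20))

# Degree-3 polynomial terms for each conditioning variable,
# plus the pairwise interaction between them.
f <- y ~ poly(x2, 3) + poly(x3, 3) + x2:x3

# Design matrix: intercept, 3 + 3 polynomial columns, 1 interaction column.
X <- model.matrix(f, data = d)
colnames(X)
```

Higher degrees and more conditioning variables increase the number of columns quickly, which is worth keeping in mind with small samples.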
A list containing the test distribution.
set.seed(123)
data <- data.frame(x1 = rnorm(100),
x2 = rnorm(100),
x3 = rnorm(100),
x4 = rnorm(100),
y = rnorm(100))
result <- test.gen(formula = y ~ x1 | x2 + x3 + x4,
metric = "RMSE",
data = data)
hist(result$distribution)
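The general idea behind a permutation-based null distribution can be sketched in base R without the package. This is a simplified illustration under stated assumptions, not the implementation used by test.gen: `rmse_once` is a hypothetical helper, `lm()` stands in for the random forest or xgboost learners, and the p-value formula at the end is a generic Monte Carlo convention rather than something this page documents.

```r
# Illustrative sketch; rmse_once() is a hypothetical helper, not a package function.
set.seed(123)
n <- 100
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n),
                x4 = rnorm(n), y = rnorm(n))

rmse_once <- function(d, p = 0.8, permute = FALSE) {
  # Shuffling x1 breaks any association between x1 and y.
  if (permute) d$x1 <- sample(d$x1)
  # Random split: a proportion p for training, the rest for testing.
  idx <- sample(nrow(d), size = floor(p * nrow(d)))
  fit <- lm(y ~ x1 + x2 + x3 + x4, data = d[idx, ])
  pred <- predict(fit, newdata = d[-idx, ])
  sqrt(mean((d$y[-idx] - pred)^2))
}

# Null distribution from 60 permuted fits, and one unpermuted test statistic.
null_dist <- replicate(60, rmse_once(d, permute = TRUE))
test_stat <- rmse_once(d)

# Share of permuted RMSE values at or below the observed one
# (assumed convention; lower RMSE means better prediction).
p_value <- (sum(null_dist <= test_stat) + 1) / (length(null_dist) + 1)
```

Because y is generated independently of x1 here, the observed RMSE should not stand out from the permuted values.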