InvariantTargetPrediction: Invariant target prediction.
In CondIndTests: Nonlinear Conditional Independence Tests


View source: R/InvariantTargetPrediction.R

Tests the null hypothesis that Y and E are independent given X.

InvariantTargetPrediction(Y, E, X, alpha = 0.05, verbose = FALSE,
  fitWithGam = TRUE, trainTestSplitFunc = caTools::sample.split,
  argsTrainTestSplitFunc = NULL, test = fTestTargetY,
  colNameNoSmooth = NULL, mtry = sqrt(NCOL(X)), ntree = 100,
  nodesize = 5, maxnodes = NULL, permute = TRUE,
  returnModel = FALSE)

`Y`	An n-dimensional vector.
`E`	An n-dimensional vector, or an n x q matrix or data frame.
`X`	A matrix or data frame with n rows and p columns.
`alpha`	Significance level. Defaults to 0.05.
`verbose`	If `TRUE`, intermediate output is provided. Defaults to `FALSE`.
`fitWithGam`	If `TRUE`, a GAM is used for the nonlinear regression, else a random forest is used. Defaults to `TRUE`.
`trainTestSplitFunc`	Function to split sample. Defaults to stratified sampling using `caTools::sample.split`, assuming E is a factor.
`argsTrainTestSplitFunc`	Arguments passed to the sample-splitting function.
`test`	Unconditional independence test that tests whether the out-of-sample prediction accuracy is the same when using X only vs. X and E as predictors for Y. Defaults to `fTestTargetY`.
`colNameNoSmooth`	GAM parameter: Names of the variables that should enter the model linearly. Defaults to `NULL`.
`mtry`	Random forest parameter: Number of variables randomly sampled as candidates at each split. Defaults to `sqrt(NCOL(X))`.
`ntree`	Random forest parameter: Number of trees to grow. Defaults to 100.
`nodesize`	Random forest parameter: Minimum size of terminal nodes. Defaults to 5.
`maxnodes`	Random forest parameter: Maximum number of terminal nodes that trees in the forest can have. Defaults to `NULL`.
`permute`	Random forest parameter: If `TRUE`, the model that uses X only for predicting Y also includes a random permutation of E. Defaults to `TRUE`.
`returnModel`	If `TRUE`, the fitted models will be returned. Defaults to `FALSE`.
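
The idea behind the `test` argument (comparing out-of-sample prediction accuracy with X only against X and E) can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the 50/50 split, the quadratic `lm` fits and the use of `var.test` on the held-out residuals are all choices made for this sketch, and the residuals of the two models are treated as independent samples for simplicity.

```r
# Illustrative sketch only; not the CondIndTests implementation.
set.seed(1)
n <- 1000
E <- rbinom(n, size = 1, prob = 0.2)
X <- 4 + 2 * E + rnorm(n)
Y <- 3 * E + rnorm(n)            # Y depends on E directly, so H0 is false
dat <- data.frame(Y = Y, X = X, E = factor(E))

# Random train/test split (assumed 50/50 split for this sketch)
train <- sample(c(TRUE, FALSE), n, replace = TRUE)

# Model using X only vs. model using X and E (assumed quadratic fits)
fitX  <- lm(Y ~ poly(X, 2), data = dat[train, ])
fitXE <- lm(Y ~ poly(X, 2) + E, data = dat[train, ])

# Out-of-sample residuals on the held-out part
resX  <- dat$Y[!train] - predict(fitX,  newdata = dat[!train, ])
resXE <- dat$Y[!train] - predict(fitXE, newdata = dat[!train, ])

# One-sided F-test: is the residual variance smaller when E is included?
pval <- var.test(resXE, resX, alternative = "less")$p.value
pval
```

A small p-value here suggests that E improves the prediction of Y beyond what X provides, which speaks against the null hypothesis.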

A list with the following entries:

`pvalue`	The p-value for the null hypothesis that Y and E are independent given X.
`model`	The fitted models if `returnModel = TRUE`.
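
A minimal sketch of how the returned list might be used, assuming the CondIndTests package is installed. The simulated data and the variable name `res` are illustrative only; the decision rule simply compares the `pvalue` entry with the chosen significance level.

```r
library(CondIndTests)

set.seed(1)
n <- 500
E <- rbinom(n, size = 1, prob = 0.2)
X <- 4 + 2 * E + rnorm(n)
Y <- 3 * E + rnorm(n)            # Y depends on E directly

alpha <- 0.05
res <- InvariantTargetPrediction(Y, as.factor(E), X, alpha = alpha)

# Extract the p-value and compare it with the significance level
res$pvalue
reject <- res$pvalue < alpha     # TRUE means H0 is rejected at level alpha
reject
```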

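# Example 1: Y depends on E only through X, so the null hypothesis holds
library(CondIndTests)
set.seed(1)
n <- 1000
E <- rbinom(n, size = 1, prob = 0.2)
X <- 4 + 2 * E + rnorm(n)
Y <- 3 * (X)^2 + rnorm(n)
# Default test (fTestTargetY), then the Wilcoxon-based alternative
InvariantTargetPrediction(Y, as.factor(E), X)
InvariantTargetPrediction(Y, as.factor(E), X, test = wilcoxTestTargetY)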

# Example 2: Y depends on E directly, so the null hypothesis is violated
# (reuses n from Example 1)
E <- rbinom(n, size = 1, prob = 0.2)
X <- 4 + 2 * E + rnorm(n)
Y <- 3 * E + rnorm(n)
InvariantTargetPrediction(Y, as.factor(E), X)
InvariantTargetPrediction(Y, as.factor(E), X, test = wilcoxTestTargetY)

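# Example 3: continuous E; Y depends on E only through X
E <- rnorm(n)
X <- 4 + 2 * E + rnorm(n)
Y <- 3 * (X)^2 + rnorm(n)
# Y and E given X: the null hypothesis holds
InvariantTargetPrediction(Y, E, X)
# Roles of E and X swapped (Y and X given E): the null hypothesis is violated
InvariantTargetPrediction(Y, X, E)
InvariantTargetPrediction(Y, E, X, test = wilcoxTestTargetY)
InvariantTargetPrediction(Y, X, E, test = wilcoxTestTargetY)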
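
The examples above all use the default GAM fit. As a hedged sketch, the random forest variant might be called as follows, using only the arguments documented above. The data and the value `ntree = 200` are illustrative, and the internal structure of the `model` entry is not specified here, so the sketch only lists the names of the returned entries.

```r
library(CondIndTests)

set.seed(1)
n <- 500
E <- rbinom(n, size = 1, prob = 0.2)
X <- 4 + 2 * E + rnorm(n)
Y <- 3 * E + rnorm(n)

# Use a random forest instead of a GAM and keep the fitted models
res <- InvariantTargetPrediction(Y, as.factor(E), X,
                                 fitWithGam = FALSE, ntree = 200,
                                 returnModel = TRUE)
res$pvalue
names(res)   # entries of the returned list
```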