seqICP: Sequential Invariant Causal Prediction


Estimates the causal parents S of the target variable Y using invariant causal prediction and fits a linear model of the form
Y = a X^S + N.

seqICP(X, Y, test = "decoupled", par.test = list(grid = c(0,
  round(nrow(X)/2), nrow(X)), complements = FALSE, link = sum, alpha = 0.05, B =
  100, permutation = FALSE), model = "iid", par.model = list(pknown = FALSE,
  p = 0, max.p = 10), max.parents = ncol(X), stopIfEmpty = TRUE,
  silent = TRUE)

`X`	matrix of predictor variables. Each column corresponds to one predictor variable.
`Y`	vector of target variable, with length(Y)=nrow(X).
`test`	string specifying the hypothesis test used to test for invariance of a parent set S (i.e. the null hypothesis H0_S). The following tests are available: "decoupled", "combined", "trend", "variance", "block.mean", "block.variance", "block.decoupled", "smooth.mean", "smooth.variance", "smooth.decoupled" and "hsic".
`par.test`	parameters specifying the hypothesis test. The following parameters are available: `grid`, `complements`, `link`, `alpha`, `B` and `permutation`. The parameter `grid` is an increasing vector of gridpoints used to construct environments for change-point-based tests. If the parameter `complements` is 'TRUE', each environment is compared against its complement; if it is 'FALSE', all environments are compared pairwise. The parameter `link` specifies how the pairwise test statistics are combined; generally this is either max or sum. The parameter `alpha` is a numeric value in (0,1) indicating the significance level of the hypothesis test. The parameter `B` is an integer specifying the number of Monte Carlo samples (or permutations) used to approximate the null distribution. If the parameter `permutation` is 'TRUE', a permutation-based approach is used to approximate the null distribution; if it is 'FALSE', the scaled residuals approach is used.
`model`	string specifying the underlying model class. Either "iid" if Y consists of independent observations or "ar" if Y has a linear time dependence structure.
`par.model`	parameters specifying model. The following parameters are available: `pknown`, `p` and `max.p`. If `pknown` is 'FALSE' the number of lags will be determined by comparing all fits up to `max.p` lags using the AIC criterion. If `pknown` is 'TRUE' the procedure will fit `p` lags.
`max.parents`	integer specifying the maximum size for admissible parents. Reducing this below the number of predictor variables saves computational time but means that the confidence intervals lose their coverage property.
`stopIfEmpty`	if 'TRUE', the procedure will stop computing confidence intervals if the empty set has been accepted (and hence no variable can have a significant causal effect). Setting to 'TRUE' will save computational time in these cases, but means that the confidence intervals lose their coverage properties for values different from 0.
`silent`	if 'FALSE', the procedure will output progress notifications consisting of the currently computed set S together with the p-value resulting from the null hypothesis H0_S.
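To make the role of `grid` and `complements` more concrete, the following base-R sketch shows one plausible way gridpoints could map observations to environments. It assumes that each pair of consecutive gridpoints delimits one block of observations; the package's internal construction may differ, and the names `env`, `env_sizes`, `env_pairs` and `grid_fine` are hypothetical.

```r
# Hypothetical sample size (matches the 140 + 80 observations used in the example)
n <- 220

# Default-style grid: two blocks, split at the midpoint
grid <- c(0, round(n / 2), n)

# Assumption: block k contains the observations in (grid[k], grid[k + 1]]
env <- cut(seq_len(n), breaks = grid, labels = FALSE)
env_sizes <- table(env)

# With complements = FALSE, all pairs of blocks would be compared
env_pairs <- t(combn(length(grid) - 1, 2))

# A finer grid with 10 blocks gives many more pairwise comparisons
grid_fine <- seq(0, n, n / 10)
n_pairs_fine <- choose(length(grid_fine) - 1, 2)
```

Under this reading, a finer grid gives the test more candidate change points to detect, at the cost of more pairwise comparisons, each based on fewer observations.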

The function can be applied to two types of models:
(1) a linear model (model="iid")
Y_i = a X_i^S + N_i
with iid noise N_i and
(2) a linear autoregressive model (model="ar")
Y_t = a_0 X_t^S + ... + a_p (Y_(t-p),X_(t-p)) + N_t
with iid noise N_t.

For both models the invariant prediction procedure is applied using the hypothesis test specified by the test parameter to determine whether a candidate model is invariant. For further details see the references.
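As a rough illustration of the autoregressive case, the sketch below simulates a target with one lag of time dependence and calls `seqICP` with `model = "ar"`. The simulated data, the variable names and the choice `p = 1` are assumptions made for this illustration only; the call uses the arguments listed in the usage above and runs only if the package is installed.

```r
set.seed(1)
n <- 200

# Hypothetical predictors; the scale of X1 changes halfway so that the
# two halves of the sample behave like different environments
X1 <- c(rnorm(n / 2), 2 * rnorm(n / 2))
X2 <- rnorm(n)
X <- cbind(X1, X2)

# Target with a lag-1 dependence on its own past and a direct effect of X1
Y <- numeric(n)
Y[1] <- 0.8 * X1[1] + 0.1 * rnorm(1)
for (t in 2:n) {
  Y[t] <- 0.8 * X1[t] + 0.5 * Y[t - 1] + 0.1 * rnorm(1)
}

# Fit with a known lag order of 1 (only if seqICP is available)
if (requireNamespace("seqICP", quietly = TRUE)) {
  fit_ar <- seqICP::seqICP(X, Y, model = "ar",
                           par.model = list(pknown = TRUE, p = 1, max.p = 10))
  print(summary(fit_ar))
}
```

Setting `pknown = FALSE` instead would leave the choice of lag order to the AIC comparison described for `par.model`.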

object of class 'seqICP' consisting of the following elements

`parent.set`	vector of the estimated causal parents.
`test.results`	matrix containing the result from each individual test as rows.
`S`	list of all the sets that were tested. The position within the list corresponds to the index in the first column of the test.results matrix.
`p.values`	vector of p-values, one per predictor variable, for the variable not being included in the set of true causal parents. (If a p-value is smaller than alpha, the corresponding variable is a member of parent.set.)
`coefficients`	vector of coefficients resulting from a regression based on the estimated parent set.
`stopIfEmpty`	a boolean value indicating whether computations stop as soon as the intersection of accepted sets is empty.
`modelReject`	a boolean value indicating if the whole model was rejected (the p-value of the best fitting model is too low).
`pknown`	a boolean value indicating whether the number of lags in the model was known. Only relevant if model was set to "ar".
`alpha`	significance level at which the hypothesis tests were performed.
`n.var`	number of predictor variables.
`model`	either "iid" or "ar" depending on which model was selected.

Niklas Pfister and Jonas Peters

Pfister, N., P. Bühlmann and J. Peters (2017). Invariant Causal Prediction for Sequential Data. ArXiv e-prints (1706.08058).

Peters, J., P. Bühlmann, and N. Meinshausen (2016). Causal inference using invariant prediction: identification and confidence intervals. Journal of the Royal Statistical Society, Series B (with discussion) 78 (5), 947–1012.

The function seqICP.s can be used to perform the hypothesis test for an individual set S. For non-linear models, the functions seqICPnl and seqICPnl.s can be used.

set.seed(1)

# environment 1
na <- 140
X1a <- 0.3*rnorm(na)
X3a <- X1a + 0.2*rnorm(na)
Ya <- -.7*X1a + .6*X3a + 0.1*rnorm(na)
X2a <- -0.5*Ya + 0.5*X3a + 0.1*rnorm(na)

# environment 2
nb <- 80
X1b <- 0.3*rnorm(nb)
X3b <- 0.5*rnorm(nb)
Yb <- -.7*X1b + .6*X3b + 0.1*rnorm(nb)
X2b <- -0.5*Yb + 0.5*X3b + 0.1*rnorm(nb)

# combine environments
X1 <- c(X1a,X1b)
X2 <- c(X2a,X2b)
X3 <- c(X3a,X3b)
Y <- c(Ya,Yb)
Xmatrix <- cbind(X1, X2, X3)

# Y follows the same structural assignment in both environments
# a and b (cf. the lines Ya <- ... and Yb <- ...).
# The direct causes of Y are X1 and X3.
# A linear model considers X1, X2 and X3 as significant.
# All these variables are helpful for the prediction of Y.
summary(lm(Y~Xmatrix))

# apply seqICP to the same setting
seqICP.result <- seqICP(X = Xmatrix, Y,
  par.test = list(grid = seq(0, na + nb, (na + nb)/10), complements = FALSE,
    link = sum, alpha = 0.05, B = 100),
  max.parents = 4, stopIfEmpty = FALSE, silent = FALSE)
summary(seqICP.result)
# seqICP is able to infer that X1 and X3 are causes of Y

Call:
lm(formula = Y ~ Xmatrix)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.205831 -0.061317 -0.001113  0.057515  0.266640 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.001799   0.005980   0.301    0.764    
XmatrixX1   -0.583158   0.027397 -21.285  < 2e-16 ***
XmatrixX2   -0.379482   0.047765  -7.945 1.06e-13 ***
XmatrixX3    0.687121   0.018082  38.000  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.08813 on 216 degrees of freedom
Multiple R-squared:  0.904,	Adjusted R-squared:  0.9027 
F-statistic: 678.1 on 3 and 216 DF,  p-value: < 2.2e-16

Currently fitting set S = {}
p-value: 0.02
Currently fitting set S = {1}
p-value: 0.02
Currently fitting set S = {2}
p-value: 0.02
Currently fitting set S = {3}
p-value: 0.02
Currently fitting set S = {1, 2}
p-value: 0.02
Currently fitting set S = {1, 3}
p-value: 0.32
Currently fitting set S = {2, 3}
p-value: 0.02
Currently fitting set S = {1, 2, 3}
p-value: 0.2

 Invariant Linear Causal Regression at level 0.05
 Variables X1, X3 show a significant causal effect
 
           coefficient lower bound upper bound  p-value  
intercept         0.0    -0.05900      0.0179       NA  
X1               -0.7    -0.75200     -0.5292     0.02 *
X2                0.0     0.00000      0.0000     0.32  
X3                0.6     0.57000      0.7228     0.02 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
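
The summary table can be related to the set-level progress output with a few lines of base R. The sketch below copies the printed set p-values by hand, intersects the sets that are not rejected at level 0.05, and computes for each variable the largest p-value among the sets that exclude it. That rule reproduces the p-value column printed above for this example; it is meant as an illustration, not as a description of the package internals, and the names `sets`, `p_set`, `accepted`, `parents` and `p_var` are hypothetical.

```r
# Candidate sets and their p-values, copied from the progress output above
sets <- list(integer(0), 1, 2, 3, c(1, 2), c(1, 3), c(2, 3), c(1, 2, 3))
p_set <- c(0.02, 0.02, 0.02, 0.02, 0.02, 0.32, 0.02, 0.2)
alpha <- 0.05

# Sets whose invariance hypothesis H0_S is not rejected at level alpha
accepted <- sets[p_set >= alpha]

# Estimated parents: variables contained in every accepted set
parents <- Reduce(intersect, accepted)

# Variable-level p-value: largest p-value among the sets excluding the variable
p_var <- sapply(1:3, function(j) {
  excludes_j <- !sapply(sets, function(S) j %in% S)
  max(p_set[excludes_j])
})
names(p_var) <- c("X1", "X2", "X3")
```

Here only {1, 3} and {1, 2, 3} are accepted, so their intersection contains X1 and X3, in line with the summary.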

seqICP documentation built on May 2, 2019, 5:51 a.m.
