CAM: Causal Additive Model


Description

Fits a causal additive model (CAM) using the CAM algorithm; see the References section below.

Usage

CAM(X, scoreName = "SEMGAM", parsScore = list(numBasisFcts = 10), numCores = 1, 
    maxNumParents = min(dim(X)[2] - 1, round(dim(X)[1]/20)), output = FALSE, 
    variableSel = FALSE, variableSelMethod = selGamBoost, 
    variableSelMethodPars = list(atLeastThatMuchSelected = 0.02, 
    atMostThatManyNeighbors = 10), pruning = FALSE, pruneMethod = selGam, 
    pruneMethodPars = list(cutOffPVal = 0.001, numBasisFcts = 10), intervData = FALSE, 
    intervMat = NA)

Arguments

X

n x p matrix of training inputs (n data points, p dimensions).

scoreName

specifies the model type used to compute the score. Default is "SEMGAM", which assumes a generalized additive model class. Other options include "SEMLIN", which fits a linear model.

parsScore

additional parameters that are passed to the score function.

numCores

specifies the number of cores that can be used for computation.

maxNumParents

specifies the maximal number of parents that are allowed in the model.
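
The default for maxNumParents in the Usage section depends on both dimensions of X. The following base-R sketch evaluates that same expression for two hypothetical data matrices (the sizes are chosen only for illustration) to show which of the two terms becomes the limit.

```r
# Default from the Usage section: min(dim(X)[2] - 1, round(dim(X)[1] / 20))
default_max_parents <- function(X) {
  min(dim(X)[2] - 1, round(dim(X)[1] / 20))
}

# Hypothetical case 1: many observations, few variables (n = 500, p = 4).
# Here the number of other variables (p - 1) is the limit.
X_small_p <- matrix(0, nrow = 500, ncol = 4)
print(default_max_parents(X_small_p))

# Hypothetical case 2: few observations, more variables (n = 40, p = 10).
# Here the sample size (n / 20) is the limit.
X_small_n <- matrix(0, nrow = 40, ncol = 10)
print(default_max_parents(X_small_n))
```

With the default, roughly 20 observations are required per allowed parent, so small samples restrict the number of parents more than the number of variables does.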

output

specifies whether progress output shall be printed to the console (TRUE) or not (FALSE).

variableSel

specifies whether initial variable selection (Step 1 of CAM algorithm) shall be performed (TRUE) or not (FALSE). Initial variable selection reduces the number of possible parents for a given node and therefore enables computing the causal structure for large p.

variableSelMethod

specifies the method used for variable selection. Default is selGamBoost, which uses the gamboost function from the mboost package. Other options include selGam (based on gam() from the mgcv package), selLm (based on linear regression), and selLasso (based on Lasso regression from the glmnet package).

variableSelMethodPars

optional parameters to modify settings of the selection method.

pruning

specifies whether pruning (Step 3 of CAM algorithm) shall be performed (TRUE) or not (FALSE). Pruning reduces the number of edges in the estimated causal structure.

pruneMethod

specifies the method used for the pruning step. Default is selGam, which is based on the gam() function from the mgcv package.

pruneMethodPars

optional parameters to tune the pruning step.
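
To illustrate the role of cutOffPVal, the sketch below applies a simple threshold to the p-values printed for variable 4 in the Example output section. It assumes that a candidate parent is kept when its p-value is below the cut-off; this is a simplified reading of the pruning step, not the package's internal code.

```r
# Values copied from the "Example output" section (pruning variable 4).
considered_parents <- c(1, 2, 3)
p_values <- c(4.4215e-50, 0.5727815, 4.526757e-70)

# Cut-off used in the example call: pruneMethodPars = list(cutOffPVal = 0.001)
cutOffPVal <- 0.001

# Assumption: a parent survives pruning when its p-value is below the cut-off.
kept_parents <- considered_parents[p_values < cutOffPVal]
print(kept_parents)
```

Under this assumption, variable 2 is dropped as a parent of variable 4, which agrees with column 4 of the estimated adjacency matrix in the example.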

intervData

logical value that indicates whether interventional data are used.

intervMat

logical matrix with the same dimensions as X. Entry (i,j) == TRUE indicates that in experiment i, variable j has been intervened on.
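
As a rough illustration of the expected shape of intervMat, the following base-R sketch builds a logical matrix for a hypothetical data set in which the first 100 rows are observational and the last 50 rows come from an intervention on variable 2. The sample sizes and the intervened variable are assumptions made for this example only.

```r
set.seed(1)

# Hypothetical design: 100 observational rows, 50 interventional rows, 3 variables.
n_obs <- 100
n_int <- 50
p <- 3
X <- matrix(rnorm((n_obs + n_int) * p), ncol = p)

# Start with "no intervention anywhere"; same dimensions as X.
intervMat <- matrix(FALSE, nrow = nrow(X), ncol = ncol(X))

# Mark variable 2 as intervened on in the last n_int rows.
intervMat[(n_obs + 1):(n_obs + n_int), 2] <- TRUE

# The matrix would then be supplied together with intervData = TRUE, e.g.
# CAM(X, intervData = TRUE, intervMat = intervMat)   # not run here
print(table(intervMat[, 2]))
```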

Details

The code fits a CAM model. See the references below for more details. Identifiability results for the model class can be found in

J. Peters, J. Mooij, D. Janzing, B. Schölkopf: Causal Discovery with Continuous Additive Noise Models. JMLR 15:2009-2053, 2014.

Value

A list of attributes of the final estimated causal structure, with the following components:

Adj

adjacency matrix of estimated causal graph

Score

Total edge score of estimated graph

timesVec

Vector containing various time measurements for execution times of the individual steps of the CAM algorithm
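
The Adj component can be turned into an edge list with base R. The sketch below uses the adjacency matrix from the Examples section as a stand-in for a fitted result and assumes, as the comments in that example suggest, that entry (i,j) == 1 encodes an edge from variable i to variable j.

```r
# Stand-in for estDAG$Adj, taken from the trueDAG matrix in the Examples section.
Adj <- cbind(c(0, 1, 0, 0), c(0, 0, 0, 0), c(0, 1, 0, 0), c(1, 0, 1, 0))

# Row index = source node, column index = target node (assumed convention).
edges <- which(Adj == 1, arr.ind = TRUE)
colnames(edges) <- c("from", "to")
edges <- edges[order(edges[, "from"], edges[, "to"]), , drop = FALSE]
print(edges)

# Parents of a single node are the non-zero rows of its column.
parents_of_4 <- which(Adj[, 4] == 1)
print(parents_of_4)
```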

Author(s)

Jonas Peters <jonas.peters@tuebingen.mpg.de> and Jan Ernest <ernest@stat.math.ethz.ch>

References

P. Bühlmann, J. Peters, J. Ernest: CAM: Causal Additive Models, high-dimensional Order Search and Penalized Regression. Annals of Statistics 42:2526-2556, 2014.

Examples

n <- 500
eps1<-rnorm(n)
eps2<-rnorm(n)
eps3<-rnorm(n)
eps4<-rnorm(n)

x2 <- 0.5*eps2
x1 <- 0.9*sign(x2)*(abs(x2)^(0.5))+0.5*eps1
x3 <- 0.8*x2^2+0.5*eps3
x4 <- -0.9*sin(x3) - abs(x1) + 0.5*eps4

X <- cbind(x1,x2,x3,x4)

trueDAG <- cbind(c(0,1,0,0),c(0,0,0,0),c(0,1,0,0),c(1,0,1,0))
## x4 <- x3 <- x2 -> x1 -> x4
## adjacency matrix:
## 0 0 0 1
## 1 0 1 0
## 0 0 0 1
## 0 0 0 0

estDAG <- CAM(X, scoreName = "SEMGAM", numCores = 1, output = TRUE, variableSel = FALSE, 
              pruning = TRUE, pruneMethod = selGam, pruneMethodPars = list(cutOffPVal = 0.001))

cat("true DAG:\n")
show(trueDAG)

cat("estimated DAG:\n")
show(estDAG$Adj)

Example output

Loading required package: glmnet
Loading required package: Matrix
Loading required package: foreach
Loaded glmnet 2.0-16

Loading required package: mboost
Loading required package: parallel
Loading required package: stabs
This is mboost 2.9-1. See 'package?mboost' and 'news(package  = "mboost")'
for a complete list of changes.

Loading required package: mgcv
Loading required package: nlme
This is mgcv 1.8-28. For overview type 'help("mgcv-package")'.
number of cores: 1 

 compute score entry for regressing 1 on 1                   

 compute score entry for regressing 1 on 2                   

 compute score entry for regressing 1 on 3                   

 compute score entry for regressing 1 on 4                   

 compute score entry for regressing 2 on 1                   

 compute score entry for regressing 2 on 2                   

 compute score entry for regressing 2 on 3                   

 compute score entry for regressing 2 on 4                   

 compute score entry for regressing 3 on 1                   

 compute score entry for regressing 3 on 2                   

 compute score entry for regressing 3 on 3                   

 compute score entry for regressing 3 on 4                   

 compute score entry for regressing 4 on 1                   

 compute score entry for regressing 4 on 2                   

 compute score entry for regressing 4 on 3                   

 compute score entry for regressing 4 on 4                   
Object size of computeScoreMatTmp:  1040 

 Included edge (from, to)  2 1 

 compute score entry for regressing 1 on 2 3                   

 compute score entry for regressing 1 on 2 4                   
 Included edge (from, to)  3 4 

 compute score entry for regressing 4 on 3 1                   

 compute score entry for regressing 4 on 3 2                   
 Included edge (from, to)  1 4 

 compute score entry for regressing 4 on 1 3 2                   
 Included edge (from, to)  2 3 

 compute score entry for regressing 3 on 2 1                   
 Included edge (from, to)  1 3 

 Included edge (from, to)  2 4 

 Performing pruning ... 
 pruning variable: 1 
considered parents: 2 
vector of p-values: 3.842128e-93 
pruning variable: 2 
considered parents:  
pruning variable: 3 
considered parents: 1 2 
vector of p-values: 0.5745814 5.963104e-24 
pruning variable: 4 
considered parents: 1 2 3 
vector of p-values: 4.4215e-50 0.5727815 4.526757e-70 
amount of time for variable selection: 0 
amount of time computing the initial scoreMat: 0.311 
amount of time checking for cycles: 0 
amount of time computing updates for the scoreMat: 0.194 , doing 6 updates.
amount of time for pruning: 0.091 
amount of time for finding maximum: 0 
amount of time in total: 0.596 
true DAG:
     [,1] [,2] [,3] [,4]
[1,]    0    0    0    1
[2,]    1    0    1    0
[3,]    0    0    0    1
[4,]    0    0    0    0
estimated DAG:
     [,1] [,2] [,3] [,4]
[1,]    0    0    0    1
[2,]    1    0    1    0
[3,]    0    0    0    1
[4,]    0    0    0    0
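
Beyond printing both matrices, a simple numeric comparison can be useful. The following sketch counts the adjacency entries in which two graphs differ; count_edge_mismatches is a hypothetical helper written for this illustration and is not part of the CAM package. Note that a reversed edge counts twice under this measure.

```r
# Hypothetical helper: number of adjacency entries that differ between two graphs.
count_edge_mismatches <- function(trueAdj, estAdj) {
  stopifnot(identical(dim(trueAdj), dim(estAdj)))
  sum(trueAdj != estAdj)
}

# True graph from the Examples section.
trueDAG <- cbind(c(0, 1, 0, 0), c(0, 0, 0, 0), c(0, 1, 0, 0), c(1, 0, 1, 0))

# An estimate identical to the truth, as in the example output above.
print(count_edge_mismatches(trueDAG, trueDAG))

# A made-up estimate in which the edge 2 -> 1 is reversed to 1 -> 2.
reversedDAG <- trueDAG
reversedDAG[2, 1] <- 0
reversedDAG[1, 2] <- 1
print(count_edge_mismatches(trueDAG, reversedDAG))
```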

CAM documentation built on May 2, 2019, 8:24 a.m.