CausalANOVA | R Documentation |
CausalANOVA estimates the coefficients of the specified ANOVA model with
regularization. By taking differences in coefficients, the function recovers
the average marginal effects (AMEs) and the average marginal interaction
effects (AMIEs).
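The idea of recovering effects from differences in coefficients can be illustrated with a minimal base R sketch. It uses a plain, unregularized one-way ANOVA fitted with lm() on simulated data, so it is not the estimator that CausalANOVA implements; the data frame toy, the outcome y, and the simulated means are assumptions made only for illustration.

```r
# Toy data: one three-level factor (hypothetical values, not from the package).
set.seed(1)
lv <- c("jobs", "clinic", "education")
toy <- data.frame(promise = factor(rep(lv, each = 50), levels = lv))
true_means <- c(jobs = 0.2, clinic = 0.5, education = 0.4)
toy$y <- rnorm(nrow(toy), mean = true_means[as.character(toy$promise)])

# Plain (unregularized) one-way ANOVA with sum-to-zero coding.
fit <- lm(y ~ promise, data = toy, contrasts = list(promise = "contr.sum"))
b <- coef(fit)

# Coefficient of every level; the last one is minus the sum of the others.
beta <- c(b[["promise1"]], b[["promise2"]],
          -(b[["promise1"]] + b[["promise2"]]))
names(beta) <- lv

# Effect of each level relative to the baseline "jobs":
# a difference in coefficients.
ame <- beta - beta[["jobs"]]

# With a single factor this equals the difference in group means.
group_means <- tapply(toy$y, toy$promise, mean)
print(round(ame, 3))
```

Because the coefficients sum to zero within the factor, each one is a deviation from the grand mean, and subtracting the baseline coefficient turns it into a contrast against the baseline level.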
CausalANOVA(
formula,
int2.formula = NULL,
int3.formula = NULL,
data,
nway = 1,
pair.id = NULL,
diff = FALSE,
screen = FALSE,
screen.type = "fixed",
screen.num.int = 3,
collapse = FALSE,
collapse.type = "fixed",
collapse.cost = 0.3,
family = "binomial",
cluster = NULL,
maxIter = 50,
eps = 1e-05,
fac.level = NULL,
ord.fac = NULL,
select.prob = FALSE,
boot = 100,
seed = 1234,
verbose = TRUE
)
formula |
A formula that specifies outcome and treatment variables. |
int2.formula |
(optional). A formula that specifies two-way interactions. |
int3.formula |
(optional). A formula that specifies three-way interactions. |
data |
An optional data frame, list or environment (or object coercible by 'as.data.frame' to a data frame) containing the variables in the model. If not found in 'data', the variables are taken from 'environment(formula)', typically the environment from which 'CausalANOVA' is called. |
nway |
With |
pair.id |
(optional). Unique identifiers for each comparison pair.
This option is used when |
diff |
A logical indicating whether the outcome is the choice between a
pair. If |
screen |
A logical indicating whether to select significant factor
interactions with |
screen.type |
Type for screening factor interactions. (1)
|
screen.num.int |
(optional). The number of factor interactions to
select. This option is used when and |
collapse |
A logical indicating whether to collapse insignificant
levels within factors. With |
collapse.type |
Type for collapsing levels within factors. (1)
|
collapse.cost |
(optional). A cost parameter ranging from 0 to 1. A value of 1 corresponds to no collapsing; the closer the value is to 0, the stronger the regularization. Default is 0.3. |
family |
A family of outcome variables. |
cluster |
Unique identifiers used to compute cluster standard errors. |
maxIter |
The maximum number of iterations for |
eps |
A tolerance parameter in the internal optimization algorithm. |
fac.level |
(optional). A vector containing the number of levels in
each factor. The order of |
ord.fac |
(optional). Logical vectors indicating whether each factor
has ordered ( |
select.prob |
(optional). A logical indicating whether selection probabilities are computed. This option might take time. |
boot |
The number of bootstrap replicates for |
seed |
Seed for bootstrap. |
verbose |
A logical indicating whether to print the value of the cost parameter used. |
Regularization: screen and collapse.
Users can apply regularization to reduce the false discovery rate and facilitate interpretation. This is particularly useful when analyzing factorial experiments with a large number of factors, each having many levels.
When screen=TRUE, the function selects significant factor interactions with
glinternet (Lim and Hastie 2015) before estimating the AMEs and AMIEs. This
option is recommended when there are many factors, e.g., more than 6 factors.
Alternatively, users can pre-specify interactions of interest using
int2.formula and int3.formula.
When collapse=TRUE, the function collapses insignificant levels within each
factor by GashANOVA (Post and Bondell 2013) before estimating the AMEs and
AMIEs. This option is recommended when there are many levels within some
factors, e.g., more than 6 levels.
Inference after Regularization:
When screen=TRUE or collapse=TRUE, we recommend using the test.CausalANOVA
function to make valid inference after regularization. It takes the output
of the CausalANOVA function, estimates the AMEs and AMIEs with newdata, and
provides confidence intervals. Ideally, users should split the sample into
two halves: use one half for regularization with the CausalANOVA function
and the other half for inference with test.CausalANOVA.
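The sample split described above can be sketched in base R as follows. The data frame toy and its columns resp_id and y are hypothetical stand-ins for a real data set; the split is made at the respondent level so that all rows from one respondent fall in the same half, which is an assumption about how the data are clustered.

```r
# Hypothetical data: 20 respondents, 6 rows each.
set.seed(1234)
toy <- data.frame(resp_id = rep(1:20, each = 6),
                  y = rbinom(120, size = 1, prob = 0.5))

# Draw half of the respondent IDs for the regularization (training) half.
ids <- unique(toy$resp_id)
train_ids <- sample(ids, size = floor(length(ids) / 2), replace = FALSE)
test_ids <- setdiff(ids, train_ids)

# Keep every row of a respondent in the same half.
toy_train <- toy[toy$resp_id %in% train_ids, ]
toy_test <- toy[toy$resp_id %in% test_ids, ]

# The two halves share no respondents.
length(intersect(toy_train$resp_id, toy_test$resp_id))
```

The training half would then be passed to CausalANOVA as data and the other half to test.CausalANOVA as newdata, as in the Examples.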
If users do not need regularization, specify screen=FALSE and collapse=FALSE.
The function then estimates the AMEs and AMIEs and computes confidence
intervals with the full sample.
Suggested Workflow: (See Examples below as well)
1. Specify the order of levels within each factor using levels(). When
   collapse=TRUE, the function places penalties on the differences between
   adjacent levels when levels are ordered, so it is crucial to specify the
   order of levels within each factor carefully.
2. Run CausalANOVA.
   - Specify formula to indicate the outcome and treatment variables and
     nway to indicate the order of interactions.
   - Specify diff=TRUE and pair.id if the outcome is the choice between a
     pair.
   - Specify screen. Use screen=TRUE to implement data-driven selection of
     factor interactions. Use screen=FALSE to specify interactions by hand
     through int2.formula and int3.formula.
   - Specify collapse. Use collapse=TRUE to implement data-driven collapsing
     of insignificant levels. Use collapse=FALSE to keep the original number
     of levels.
3. Run test.CausalANOVA when screen=TRUE or collapse=TRUE.
4. Run summary and plot to explore the AMEs and AMIEs.
5. Estimate conditional effects using the ConditionalEffect function and
   visualize them using the plot function.
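To see why the order of levels matters when collapse=TRUE, the sketch below lists which pairs of levels would have their coefficient differences penalized. The helper penalized_pairs is hypothetical and not part of the package. The adjacent-level rule for ordered factors follows the workflow above; the all-pairs rule for unordered factors is an assumption based on the pairwise penalty of Post and Bondell (2013).

```r
# Hypothetical helper (not part of the package): list the pairs of levels
# whose coefficient differences would be penalized.
penalized_pairs <- function(f) {
  lv <- levels(f)
  if (is.ordered(f)) {
    # Ordered factor: only adjacent levels are compared.
    cbind(head(lv, -1), tail(lv, -1))
  } else {
    # Unordered factor (assumption): every pair of levels is compared.
    t(combn(lv, 2))
  }
}

lv <- c("jobs", "clinic", "education")
promise_ord <- factor(lv, levels = lv, ordered = TRUE)
promise_unord <- factor(lv, levels = lv, ordered = FALSE)

penalized_pairs(promise_ord)    # jobs-clinic, clinic-education
penalized_pairs(promise_unord)  # all three pairs
```

Under this reading, reordering the levels of an ordered factor changes which levels can be merged with each other, which is why the order should be set deliberately before fitting.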
intercept |
The intercept of the estimated ANOVA model. If
|
formula |
The
|
coefs |
A named vector of coefficients of the estimated ANOVA model. |
vcov |
The
variance-covariance matrix for |
CI.table |
The summary of AMEs and AMIEs
with confidence intervals. Only when |
AME |
The estimated AMEs with the grand-mean as baselines. |
AMIE2 |
The estimated two-way AMIEs with the grand-mean as baselines. |
AMIE3 |
The estimated three-way AMIEs with the grand-mean as baselines. |
... |
Arguments passed to the function, or arguments for internal use only. |
Naoki Egami and Kosuke Imai.
Egami, Naoki and Kosuke Imai. 2019. Causal Interaction in Factorial Experiments: Application to Conjoint Analysis. Journal of the American Statistical Association. http://imai.fas.harvard.edu/research/files/int.pdf
Lim, M. and Hastie, T. 2015. Learning interactions via hierarchical group-lasso regularization. Journal of Computational and Graphical Statistics 24, 3, 627–654.
Post, J. B. and Bondell, H. D. 2013. Factor selection and structural identification in the interaction ANOVA model. Biometrics 69, 1, 70–79.
cv.CausalANOVA
## CausalANOVA and the Carlson data are provided by the FindIt package
library(FindIt)
data(Carlson)
## Specify the order of each factor
Carlson$newRecordF <- factor(Carlson$newRecordF, ordered=TRUE,
                             levels=c("YesLC", "YesDis", "YesMP",
                                      "noLC", "noDis", "noMP", "noBusi"))
Carlson$promise <- factor(Carlson$promise, ordered=TRUE,
                          levels=c("jobs", "clinic", "education"))
Carlson$coeth_voting <- factor(Carlson$coeth_voting, ordered=FALSE, levels=c("0", "1"))
Carlson$relevantdegree <- factor(Carlson$relevantdegree, ordered=FALSE, levels=c("0", "1"))
## #######################################
## Without Screening and Collapsing
## #######################################
#################### only AMEs ####################
fit1 <- CausalANOVA(formula=won ~ newRecordF + promise + coeth_voting + relevantdegree,
data=Carlson, pair.id=Carlson$contestresp, diff=TRUE,
cluster=Carlson$respcodeS, nway=1)
summary(fit1)
plot(fit1)
#################### AMEs and two-way AMIEs ####################
fit2 <- CausalANOVA(formula=won ~ newRecordF + promise + coeth_voting + relevantdegree,
int2.formula = ~ newRecordF:coeth_voting,
data=Carlson, pair.id=Carlson$contestresp,diff=TRUE,
cluster=Carlson$respcodeS, nway=2)
summary(fit2)
plot(fit2, type="ConditionalEffect", fac.name=c("newRecordF","coeth_voting"))
ConditionalEffect(fit2, treat.fac="newRecordF", cond.fac="coeth_voting")
## Not run:
#################### AMEs and two-way and three-way AMIEs ####################
## Note: All pairs within three-way interactions should show up in int2.formula (Strong Hierarchy).
fit3 <- CausalANOVA(formula=won ~ newRecordF + promise + coeth_voting + relevantdegree,
int2.formula = ~ newRecordF:promise + newRecordF:coeth_voting
+ promise:coeth_voting,
int3.formula = ~ newRecordF:promise:coeth_voting,
data=Carlson, pair.id=Carlson$contestresp,diff=TRUE,
cluster=Carlson$respcodeS, nway=3)
summary(fit3)
plot(fit3, type="AMIE", fac.name=c("newRecordF","promise", "coeth_voting"),space=25,adj.p=2.2)
## End(Not run)
## #######################################
## With Screening and Collapsing
## #######################################
## Sample Splitting
## Fix the seed so that the split is reproducible (the value is arbitrary)
set.seed(1234)
train.ind <- sample(unique(Carlson$respcodeS), 272, replace=FALSE)
test.ind <- setdiff(unique(Carlson$respcodeS), train.ind)
Carlson.train <- Carlson[is.element(Carlson$respcodeS,train.ind), ]
Carlson.test <- Carlson[is.element(Carlson$respcodeS,test.ind), ]
#################### AMEs and two-way AMIEs ####################
fit.r2 <- CausalANOVA(formula=won ~ newRecordF + promise + coeth_voting + relevantdegree,
data=Carlson.train, pair.id=Carlson.train$contestresp,diff=TRUE,
screen=TRUE, collapse=TRUE,
cluster=Carlson.train$respcodeS, nway=2)
summary(fit.r2)
## refit with test.CausalANOVA
fit.r2.new <- test.CausalANOVA(fit.r2, newdata=Carlson.test, diff=TRUE,
pair.id=Carlson.test$contestresp, cluster=Carlson.test$respcodeS)
summary(fit.r2.new)
plot(fit.r2.new)
plot(fit.r2.new, type="ConditionalEffect", fac.name=c("newRecordF","coeth_voting"))
ConditionalEffect(fit.r2.new, treat.fac="newRecordF", cond.fac="coeth_voting")