DoubleML | R Documentation |
Abstract base class that can't be initialized.
R6::R6Class object.
all_coef
(matrix()
)
Estimates of the causal parameter(s) for the n_rep
different sample
splits after calling fit()
.
all_dml1_coef
(array()
)
Estimates of the causal parameter(s) for the n_rep
different sample
splits after calling fit()
with dml_procedure = "dml1"
.
all_se
(matrix()
)
Standard errors of the causal parameter(s) for the n_rep
different
sample splits after calling fit()
.
apply_cross_fitting
(logical(1)
)
Indicates whether cross-fitting should be applied. Default is TRUE
.
boot_coef
(matrix()
)
Bootstrapped coefficients for the causal parameter(s) after calling
fit()
and bootstrap()
.
boot_t_stat
(matrix()
)
Bootstrapped t-statistics for the causal parameter(s) after calling
fit()
and bootstrap()
.
coef
(numeric()
)
Estimates for the causal parameter(s) after calling fit()
.
data
(data.table
)
Data object.
dml_procedure
(character(1)
)
A character()
("dml1"
or "dml2"
) specifying the double machine
learning algorithm. Default is "dml2"
.
draw_sample_splitting
(logical(1)
)
Indicates whether the sample splitting should be drawn during
initialization of the object. Default is TRUE
.
learner
(named list()
)
The machine learners for the nuisance functions.
n_folds
(integer(1)
)
Number of folds. Default is 5
.
n_rep
(integer(1)
)
Number of repetitions for the sample splitting. Default is 1
.
params
(named list()
)
The hyperparameters of the learners.
psi
(array()
)
Value of the score function
\psi(W;\theta, \eta)=\psi_a(W;\eta) \theta + \psi_b (W; \eta)
after calling fit()
.
psi_a
(array()
)
Value of the score function component \psi_a(W;\eta)
after
calling fit()
.
psi_b
(array()
)
Value of the score function component \psi_b(W;\eta)
after
calling fit()
.
predictions
(array()
)
Predictions of the nuisance models after calling
fit(store_predictions=TRUE)
.
models
(array()
)
The fitted nuisance models after calling
fit(store_models=TRUE)
.
pval
(numeric()
)
p-values for the causal parameter(s) after calling fit()
.
score
(character(1)
, function()
)
A character(1)
or function()
specifying the score function.
se
(numeric()
)
Standard errors for the causal parameter(s) after calling fit()
.
smpls
(list()
)
The partition used for cross-fitting.
smpls_cluster
(list()
)
The partition of clusters used for cross-fitting.
t_stat
(numeric()
)
t-statistics for the causal parameter(s) after calling fit()
.
tuning_res
(named list()
)
Results from hyperparameter tuning.
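The relation between the fields psi_a, psi_b, psi, coef, se, t_stat and pval can be illustrated with a small base R sketch. It is not the package implementation: the data are simulated, the nuisance residuals come from plain lm() fits without cross-fitting, and the variable names (v_hat, u_hat, theta) are hypothetical. It assumes the partialling-out score of the partially linear model, the usual variance estimator for linear scores, and a normal approximation for the p-value.

```r
set.seed(42)
n <- 500
x <- rnorm(n)
d <- 0.5 * x + rnorm(n)        # treatment
y <- 1.0 * d + x + rnorm(n)    # outcome, true causal parameter is 1

# Illustrative nuisance residuals (assumption: lm() instead of
# cross-fitted machine learners, for brevity only)
v_hat <- resid(lm(d ~ x))      # D - m(X)
u_hat <- resid(lm(y ~ x))      # Y - l(X)

# Score components of the linear score psi = psi_a * theta + psi_b
psi_a <- -v_hat * v_hat
psi_b <- u_hat * v_hat

# Solve mean(psi) = 0 for theta
theta <- -mean(psi_b) / mean(psi_a)
psi <- psi_a * theta + psi_b

# Standard error, t-statistic and two-sided p-value
se <- sqrt(mean(psi^2) / mean(psi_a)^2 / n)
t_stat <- theta / se
pval <- 2 * pnorm(-abs(t_stat))
```

By construction mean(psi) is zero at the estimated theta, which is the defining property of the estimator for a score that is linear in the causal parameter.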
new()
DoubleML is an abstract class that can't be initialized.
DoubleML$new()
print()
Print DoubleML objects.
DoubleML$print()
fit()
Estimate DoubleML models.
DoubleML$fit(store_predictions = FALSE, store_models = FALSE)
store_predictions
(logical(1)
)
Indicates whether the predictions for the nuisance functions should be
stored in field predictions
. Default is FALSE
.
store_models
(logical(1)
)
Indicates whether the fitted models for the nuisance functions should be
stored in field models
if you want to analyze the models or extract
information like variable importance. Default is FALSE
.
self
bootstrap()
Multiplier bootstrap for DoubleML models.
DoubleML$bootstrap(method = "normal", n_rep_boot = 500)
method
(character(1)
)
A character(1)
("Bayes"
, "normal"
or "wild"
) specifying the
multiplier bootstrap method.
n_rep_boot
(integer(1)
)
The number of bootstrap replications.
self
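As a rough illustration of the three method options, the sketch below draws multiplier weights with mean zero and unit variance in base R. The specific distributions (centered exponential for "Bayes", standard normal for "normal", a two-normal construction for "wild") are an assumption about what these labels typically denote, not a statement about the package internals, and draw_weights is a hypothetical helper.

```r
set.seed(3141)
n_obs <- 1000

# Hypothetical helper: one weight per observation for a given method
draw_weights <- function(method, n_obs) {
  switch(method,
    "Bayes"  = rexp(n_obs, rate = 1) - 1,
    "normal" = rnorm(n_obs),
    "wild"   = rnorm(n_obs) / sqrt(2) + (rnorm(n_obs)^2 - 1) / 2,
    stop("unknown method")
  )
}

# One column of weights per method
weights <- sapply(c("Bayes", "normal", "wild"), draw_weights, n_obs = n_obs)

# Empirical means should be close to 0 and variances close to 1
colMeans(weights)
apply(weights, 2, var)
```

In a multiplier bootstrap, each replication multiplies the estimated score values by one such weight vector, so n_rep_boot controls how many weight vectors are drawn.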
split_samples()
Draw sample splitting for DoubleML models.
The samples are drawn according to the attributes n_folds
, n_rep
and apply_cross_fitting
.
DoubleML$split_samples()
self
set_sample_splitting()
Set the sample splitting for DoubleML models.
The attributes n_folds
and n_rep
are derived from the provided
partition.
DoubleML$set_sample_splitting(smpls)
smpls
(list()
)
A nested list()
. The outer list needs to provide an entry per
repeated sample splitting (length of the list is set as n_rep
).
The inner list is a named list()
with names train_ids
and test_ids
.
The entries in train_ids
and test_ids
must be partitions per fold
(length of train_ids
and test_ids
is set as n_folds
).
self
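For larger data sets the nested list is tedious to write by hand. The following base R sketch builds a partition with the structure described above; make_split is a hypothetical helper and the fold assignment is drawn at random.

```r
set.seed(1)
n_obs <- 10
n_folds <- 2
n_rep <- 2

# Hypothetical helper: one K-fold partition as a named list
make_split <- function(n_obs, n_folds) {
  # randomly assign each observation to one fold
  fold_id <- sample(rep(seq_len(n_folds), length.out = n_obs))
  test_ids <- lapply(seq_len(n_folds), function(k) which(fold_id == k))
  # training indices are all observations outside the test fold
  train_ids <- lapply(test_ids, function(ids) setdiff(seq_len(n_obs), ids))
  list(train_ids = train_ids, test_ids = test_ids)
}

# Outer list: one entry per repeated sample splitting
smpls <- lapply(seq_len(n_rep), function(r) make_split(n_obs, n_folds))
```

The resulting smpls has length n_rep, and each entry holds n_folds index vectors in train_ids and test_ids, so it has the shape expected by set_sample_splitting(smpls).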
tune()
Hyperparameter-tuning for DoubleML models.
The hyperparameter-tuning is performed using the tuning methods provided in the mlr3tuning package. For more information on tuning in mlr3, we refer to the section on parameter tuning in the mlr3 book.
DoubleML$tune( param_set, tune_settings = list(n_folds_tune = 5, rsmp_tune = mlr3::rsmp("cv", folds = 5), measure = NULL, terminator = mlr3tuning::trm("evals", n_evals = 20), algorithm = mlr3tuning::tnr("grid_search"), resolution = 5), tune_on_folds = FALSE )
param_set
(named list()
)
A named list
with a parameter grid for each nuisance model/learner
(see method learner_names()
). The parameter grid must be an object of
class ParamSet.
tune_settings
(named list()
)
A named list()
with arguments passed to the hyperparameter-tuning with
mlr3tuning to set up
TuningInstance objects.
tune_settings
has entries
terminator
(Terminator)
A Terminator object. Specification of terminator
is required to perform tuning.
algorithm
(Tuner or character(1)
)
A Tuner object (recommended) or key passed to the
respective dictionary to specify the tuning algorithm used in
tnr(). algorithm
is passed as an argument to
tnr(). If algorithm
is not specified by the user,
default is set to "grid_search"
. If set to "grid_search"
, then
additional argument "resolution"
is required.
rsmp_tune
(Resampling or character(1)
)
A Resampling object (recommended) or option passed
to rsmp() to initialize a
Resampling for parameter tuning in mlr3
.
If not specified by the user, default is set to "cv"
(cross-validation).
n_folds_tune
(integer(1)
, optional)
If rsmp_tune = "cv"
, number of folds used for cross-validation.
If not specified by the user, default is set to 5
.
measure
(NULL
, named list()
, optional)
Named list containing the measures used for parameter tuning. Entries in
list must either be Measure objects or keys to be
passed to msr(). The names of the entries must
match the learner names (see method learner_names()
). If set to NULL
,
default measures are used, i.e., "regr.mse"
for continuous outcome
variables and "classif.ce"
for binary outcomes.
resolution
(numeric(1)
)
The resolution of the grid, required if algorithm
is set to "grid_search"
(the usage above sets resolution = 5
). resolution
is passed as an argument to
tnr().
tune_on_folds
(logical(1)
)
Indicates whether the tuning should be done fold-specific or globally.
Default is FALSE
.
self
summary()
Summary for DoubleML models after calling fit()
.
DoubleML$summary(digits = max(3L, getOption("digits") - 3L))
digits
(integer(1)
)
The number of significant digits to use when printing.
confint()
Confidence intervals for DoubleML models.
DoubleML$confint(parm, joint = FALSE, level = 0.95)
parm
(numeric()
or character()
)
A specification of which parameters are to be given confidence intervals
among the variables for which inference was done, either a vector of
numbers or a vector of names. If missing, all parameters are considered
(default).
joint
(logical(1)
)
Indicates whether joint confidence intervals are computed.
Default is FALSE
.
level
(numeric(1)
)
The confidence level. Default is 0.95
.
A matrix()
with the confidence interval(s).
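A possible workflow from estimation to confidence intervals, reusing the objects from the set_sample_splitting() example, might look as follows. This is a sketch under assumptions: the sample size n_obs = 100 is chosen arbitrarily, and bootstrap() is called before requesting joint intervals because those rely on the bootstrapped statistics.

```r
library(DoubleML)
library(mlr3)

set.seed(2)
obj_dml_data <- make_plr_CCDDHNR2018(n_obs = 100)
dml_plr_obj <- DoubleMLPLR$new(obj_dml_data,
  lrn("regr.rpart"), lrn("regr.rpart"))

# estimate the model, then compute pointwise intervals
dml_plr_obj$fit()
ci_pointwise <- dml_plr_obj$confint(level = 0.95)

# joint intervals use the multiplier bootstrap
dml_plr_obj$bootstrap(method = "normal", n_rep_boot = 500)
ci_joint <- dml_plr_obj$confint(joint = TRUE, level = 0.95)
```

Both calls return a matrix() with one row per parameter and one column each for the lower and upper bound.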
learner_names()
Returns the names of the learners.
DoubleML$learner_names()
character()
with names of learners.
params_names()
Returns the names of the nuisance models with hyperparameters.
DoubleML$params_names()
character()
with names of nuisance models with hyperparameters.
set_ml_nuisance_params()
Set hyperparameters for the nuisance models of DoubleML models.
Note that in the current implementation, either all parameters have to be set globally or all parameters have to be provided fold-specific.
DoubleML$set_ml_nuisance_params( learner = NULL, treat_var = NULL, params, set_fold_specific = FALSE )
learner
(character(1)
)
The nuisance model/learner (see method params_names()
).
treat_var
(character(1)
)
The treatment variable (hyperparameters can be set treatment-variable
specific).
params
(named list()
)
A named list()
with estimator parameters. Parameters are used for all
folds by default. Alternatively, parameters can be passed in a
fold-specific way if option set_fold_specific
is TRUE
. In this case, the
outer list needs to be of length n_rep
and the inner list of length
n_folds
.
set_fold_specific
(logical(1)
)
Indicates if the parameters passed in params
should be passed in a
fold-specific way. Default is FALSE
. If TRUE
, the outer list needs
to be of length n_rep
and the inner list of length n_folds
.
Note that in the current implementation, either all parameters have to
be set globally or all parameters have to be provided fold-specific.
self
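The nested structure required for fold-specific parameters can be sketched in base R. The hyperparameter names cp and minsplit are illustrative values for an rpart learner and are not prescribed by the documentation above.

```r
n_rep <- 2
n_folds <- 3

# Outer list: one entry per repetition; inner list: one entry per fold.
# Each innermost element is a named list() of hyperparameters
# (cp and minsplit are illustrative assumptions).
fold_params <- lapply(seq_len(n_rep), function(r) {
  lapply(seq_len(n_folds), function(k) {
    list(cp = 0.01, minsplit = 20)
  })
})

# Such a list would then be passed with set_fold_specific = TRUE, e.g.
# (learner and treat_var values are placeholders, see params_names()):
# dml_obj$set_ml_nuisance_params(learner = "ml_m", treat_var = "d",
#   params = fold_params, set_fold_specific = TRUE)
```

With set_fold_specific = FALSE, a single named list() such as list(cp = 0.01, minsplit = 20) would be passed instead and used for all folds.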
p_adjust()
Multiple testing adjustment for DoubleML models.
DoubleML$p_adjust(method = "romano-wolf", return_matrix = TRUE)
method
(character(1)
)
A character(1)
("romano-wolf"
, "bonferroni"
, "holm"
, etc.)
specifying the adjustment method. In addition to "romano-wolf"
,
all methods implemented in p.adjust() can be
applied. Default is "romano-wolf"
.
return_matrix
(logical(1)
)
Indicates if the output is returned as a matrix with corresponding
coefficient names.
numeric()
with adjusted p-values. If return_matrix = TRUE
,
a matrix()
with adjusted p-values.
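For the methods that are forwarded to p.adjust(), the adjustment itself can be reproduced in base R. The p-values and the names d1, d2, d3 below are made up for illustration; "romano-wolf" is not part of p.adjust() and is therefore not covered by this sketch.

```r
# Hypothetical unadjusted p-values for three treatment variables
pvals <- c(d1 = 0.01, d2 = 0.04, d3 = 0.03)

p_bonf <- p.adjust(pvals, method = "bonferroni")
p_holm <- p.adjust(pvals, method = "holm")

p_bonf
#   d1   d2   d3
# 0.03 0.12 0.09
p_holm
#   d1   d2   d3
# 0.03 0.06 0.06
```

Bonferroni multiplies every p-value by the number of hypotheses, while Holm applies decreasing multipliers to the ordered p-values and enforces monotonicity, which is why d2 and d3 share the same adjusted value.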
get_params()
Get hyperparameters for the nuisance model of DoubleML models.
DoubleML$get_params(learner)
learner
(character(1)
)
The nuisance model/learner (see method params_names()
)
named list()
with parameters for the nuisance model/learner.
clone()
The objects of this class are cloneable with this method.
DoubleML$clone(deep = FALSE)
deep
Whether to make a deep clone.
Other DoubleML:
DoubleMLIIVM
,
DoubleMLIRM
,
DoubleMLPLIV
,
DoubleMLPLR
## ------------------------------------------------
## Method `DoubleML$set_sample_splitting`
## ------------------------------------------------
library(DoubleML)
library(mlr3)
set.seed(2)
obj_dml_data = make_plr_CCDDHNR2018(n_obs=10)
dml_plr_obj = DoubleMLPLR$new(obj_dml_data,
  lrn("regr.rpart"), lrn("regr.rpart"))

# simple sample splitting with two folds and without cross-fitting
smpls = list(list(train_ids = list(c(1, 2, 3, 4, 5)),
  test_ids = list(c(6, 7, 8, 9, 10))))
dml_plr_obj$set_sample_splitting(smpls)

# sample splitting with two folds and cross-fitting but no repeated cross-fitting
smpls = list(list(train_ids = list(c(1, 2, 3, 4, 5), c(6, 7, 8, 9, 10)),
  test_ids = list(c(6, 7, 8, 9, 10), c(1, 2, 3, 4, 5))))
dml_plr_obj$set_sample_splitting(smpls)

# sample splitting with two folds and repeated cross-fitting with n_rep = 2
smpls = list(list(train_ids = list(c(1, 2, 3, 4, 5), c(6, 7, 8, 9, 10)),
  test_ids = list(c(6, 7, 8, 9, 10), c(1, 2, 3, 4, 5))),
list(train_ids = list(c(1, 3, 5, 7, 9), c(2, 4, 6, 8, 10)),
  test_ids = list(c(2, 4, 6, 8, 10), c(1, 3, 5, 7, 9))))
dml_plr_obj$set_sample_splitting(smpls)