propensity_score: Propensity score estimation

propensity_score    R Documentation

Propensity score estimation

Description

Estimates the propensity scores Pr[D = 1 | Z] for binary treatment assignment D and covariates Z. Estimation is done either by taking the empirical mean of D (which should be roughly 0.5, since we assume a randomized experiment) or by direct machine learning estimation.
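As a rough illustration of the empirical-mean option, the following base R sketch assumes that every observation simply receives mean(D) as its propensity score. The variable name ps_constant is hypothetical and not part of GenericML; the package's internal implementation may differ.

```r
# Sketch only: assumes the empirical-mean estimator assigns mean(D) to every unit.
set.seed(1)
n <- 100                         # number of observations
D <- rbinom(n, 1, 0.5)           # randomized binary treatment assignment

# Hypothetical name: one identical estimate per observation
ps_constant <- rep(mean(D), n)

head(ps_constant)
```

Under randomization with assignment probability 0.5, the common value should lie close to 0.5, which is the premise of this estimator.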

Usage

propensity_score(Z, D, estimator = "constant")

Arguments

Z

A numeric design matrix that holds the covariates in its columns.

D

A binary vector of treatment assignment. A value of one denotes assignment to the treatment group and a value of zero denotes assignment to the control group.

estimator

Character string specifying the estimator. Must be one of 'constant' (estimates the propensity scores by mean(D)), 'lasso', 'random_forest', 'tree', or a string in mlr3 syntax. When using mlr3 syntax, do not specify whether the learner is a regression learner or a classification learner, that is, omit the classif. and regr. keywords. Example: 'mlr3::lrn("ranger", num.trees = 500)' for a random forest learner. Note that the specification is passed as a string. See https://mlr3learners.mlr-org.com for a list of mlr3 learners.

Details

The specifications "lasso", "random_forest", and "tree" in estimator correspond to the following mlr3 specifications (we omit the keywords classif. and regr.). "lasso" is a cross-validated Lasso estimator, which corresponds to 'mlr3::lrn("cv_glmnet", s = "lambda.min", alpha = 1)'. "random_forest" is a random forest with 500 trees, which corresponds to 'mlr3::lrn("ranger", num.trees = 500)'. "tree" is a tree learner, which corresponds to 'mlr3::lrn("rpart")'.
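The correspondence above can be written down as a small lookup table. This is a sketch in base R only: the strings are copied from the text above, while shortcuts and resolve_estimator() are hypothetical names introduced for illustration and are not part of GenericML.

```r
# Shortcut -> mlr3 specification string, copied from the Details text.
shortcuts <- c(
  lasso         = 'mlr3::lrn("cv_glmnet", s = "lambda.min", alpha = 1)',
  random_forest = 'mlr3::lrn("ranger", num.trees = 500)',
  tree          = 'mlr3::lrn("rpart")'
)

# Hypothetical helper: expand a known shortcut and pass any other
# string (e.g. "constant" or custom mlr3 syntax) through unchanged.
resolve_estimator <- function(estimator) {
  if (estimator %in% names(shortcuts)) shortcuts[[estimator]] else estimator
}

resolve_estimator("tree")
resolve_estimator('mlr3::lrn("svm", cachesize = 40)')
```

Because each shortcut maps to a string without the classif. or regr. keyword, a custom specification written in the same form should be interchangeable with the shortcuts.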

Value

An object of class "propensity_score", consisting of the following components:

estimates

A numeric vector of propensity score estimates.

mlr3_objects

"mlr3" objects used for estimation. Only non-empty if mlr3 was used.
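A short sketch of how the returned components might be inspected. It assumes the object behaves like a list whose components can be accessed with $, and it is guarded so that ps is NULL when GenericML is not installed.

```r
set.seed(1)
n <- 100
D <- rbinom(n, 1, 0.5)             # random treatment assignment
Z <- matrix(runif(n * 5), n, 5)    # design matrix with 5 covariates

# Only call the package function if GenericML is available
ps <- if (requireNamespace("GenericML", quietly = TRUE)) {
  GenericML::propensity_score(Z, D, estimator = "constant")
} else {
  NULL
}

if (!is.null(ps)) {
  print(head(ps$estimates))        # numeric vector of propensity score estimates
  print(length(ps$mlr3_objects))   # described above as empty unless mlr3 was used
}
```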

References

Rosenbaum P.R., Rubin D.B. (1983). “The Central Role of the Propensity Score in Observational Studies for Causal Effects.” Biometrika, 70(1), 41–55. doi: 10.1093/biomet/70.1.41.

Lang M., Binder M., Richter J., Schratz P., Pfisterer F., Coors S., Au Q., Casalicchio G., Kotthoff L., Bischl B. (2019). “mlr3: A Modern Object-Oriented Machine Learning Framework in R.” Journal of Open Source Software, 4(44), 1903. doi: 10.21105/joss.01903.

Examples

## generate data
set.seed(1)
n  <- 100                        # number of observations
p  <- 5                          # number of covariates
D  <- rbinom(n, 1, 0.5)          # random treatment assignment
Z  <- matrix(runif(n*p), n, p)   # design matrix

## estimate propensity scores via mean(D)...
propensity_score(Z, D, estimator = "constant")

## ... and via SVM with cache size 40
if(require("e1071")){
  propensity_score(Z, D,
   estimator = 'mlr3::lrn("svm", cachesize = 40)')
}


GenericML documentation built on June 18, 2022, 9:09 a.m.