spglm: spglm Semiparametric generalized linear models for causal...
In Larsvanderlaan/causalGLM: Interpretable and robust causal inference for heterogeneous treatment effects using generalized linear models with targeted machine-learning.

spglm

R Documentation

Semiparametric generalized linear models for causal inference

Description

`spglm` fits semiparametric generalized linear models for causal inference. It supports flexible semiparametric estimation of the conditional average treatment effect (CATE), the conditional odds ratio (OR), and the conditional relative risk (RR). The Highly Adaptive Lasso (HAL; see `fit_hal`), a flexible and adaptive spline regression estimator, is the recommended learner for medium-small to large sample sizes.
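
For reference, the three estimands can be written as follows, where V(W) denotes the design matrix that `formula` generates from the covariates. The definitions of the CATE, OR, and RR are the standard ones. The linear and log-linear forms in β are an assumption about how "satisfies the parametric model `formula`" is parameterized, and β is notation introduced here for illustration.

```latex
\begin{aligned}
\mathrm{CATE}(W) &= E[Y \mid A=1, W] - E[Y \mid A=0, W] = \beta^\top V(W),\\
\log \mathrm{OR}(W) &= \log \frac{P(Y=1 \mid A=1, W)\,/\,P(Y=0 \mid A=1, W)}{P(Y=1 \mid A=0, W)\,/\,P(Y=0 \mid A=0, W)} = \beta^\top V(W),\\
\log \mathrm{RR}(W) &= \log \frac{E[Y \mid A=1, W]}{E[Y \mid A=0, W]} = \beta^\top V(W).
\end{aligned}
```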

Usage

spglm(
  formula,
  data,
  W,
  A,
  Y,
  estimand = c("CATE", "OR", "RR"),
  learning_method = c("HAL", "SuperLearner", "glm", "glmnet", "gam", "mars", "ranger",
    "xgboost"),
  append_interaction_matrix = TRUE,
  cross_fit = FALSE,
  sl3_Learner_A = NULL,
  sl3_Learner_Y = NULL,
  wrap_in_Lrnr_glm_sp = TRUE,
  HAL_args_Y0W = list(smoothness_orders = 1, max_degree = 1, num_knots = c(10, 5, 1)),
  HAL_fit_control = list(parallel = FALSE),
  sl3_Learner_var_Y = Lrnr_glmnet$new(family = "poisson"),
  delta_epsilon = 0.1,
  verbose = TRUE,
  warn = TRUE,
  ...
)
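
A minimal sketch of a call, assuming the package is installed and that `formula` is given as a one-sided formula over the covariates. The simulated data, the column names `W1`, `W2`, `A`, `Y`, and the object name `sim_data` are illustrative and not part of the package. Only arguments from the signature above are used.

```r
# Simulate a small data set with two covariates, a binary treatment,
# and a continuous outcome whose treatment effect is 1 + 0.5 * W1.
set.seed(1)
n  <- 500
W1 <- runif(n, -1, 1)
W2 <- runif(n, -1, 1)
A  <- rbinom(n, 1, plogis(0.5 * W1))
Y  <- rnorm(n, mean = W1 + W2 + A * (1 + 0.5 * W1), sd = 0.3)
sim_data <- data.frame(W1, W2, A, Y)

# Attempt the fit only when causalGLM is available.
if (requireNamespace("causalGLM", quietly = TRUE)) {
  fit <- causalGLM::spglm(
    formula = ~ 1 + W1,        # assumed parametric form of the CATE
    data = sim_data,
    W = c("W1", "W2"),
    A = "A",
    Y = "Y",
    estimand = "CATE",
    learning_method = "glm"    # parametric nuisance fits, the fastest option
  )
}
```

Swapping `learning_method` for `"HAL"` follows the recommendation in the description, at the cost of longer run times.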

Arguments

`formula`	An R formula object specifying the parametric form of the CATE, OR, or RR (depending on `estimand`).
`data`	A data.frame or matrix containing the numeric values corresponding to the nodes `W`, `A` and `Y`. Alternatively, a `spglm` fit object, in which case the previous ML fits are reused in the computation. Only pass in a previous fit object for the same estimand and for subformulas (see the vignette).
`W`	A character vector of covariates contained in `data`.
`A`	A character name for the treatment assignment variable contained in `data`.
`Y`	A character name for the outcome variable contained in `data` (the outcome can be continuous, nonnegative or binary depending on `estimand`).
`estimand`	Estimand/parameter to estimate. Choices are: 'CATE': Estimate conditional average treatment effect with `Param_spCATE` assuming it satisfies parametric model `formula`. 'OR': Estimate conditional odds ratio with `Param_spOR` assuming it satisfies parametric model `formula`. 'RR': Estimate conditional relative risk with `Param_spRR` assuming it satisfies parametric model `formula`.
`learning_method`	Machine-learning method to use. This is overridden if argument `sl3_Learner_A` or `sl3_Learner_Y` is provided. Options are: "SuperLearner": A stacked ensemble of all of the below that utilizes cross-validation to adaptively choose the best learner. "HAL": Adaptive robust automatic machine-learning using the Highly Adaptive Lasso `hal9001`. See argument `HAL_args_Y0W`. "glm": Fit nuisances with a parametric model. "glmnet": Learn using lasso with glmnet. "gam": Learn using generalized additive models with mgcv. "mars": Multivariate adaptive regression splines with `earth`. "ranger": Robust random forests with the package `ranger`. "xgboost": Learn using a default cross-validation tuned xgboost library with max_depths 3 to 7. Note that speed can vary significantly depending on learner choice!
`append_interaction_matrix`	Default: TRUE. This argument is passed to `Lrnr_glm_semiparametric`. It is a boolean for whether to estimate the conditional mean/regression of Y by combining observations with A=0 and A=1 ('TRUE'), or to first estimate E[Y|A=0,W] nonparametrically with `sl3_Learner_Y` or `learning_method` and then learn the parametric component with an offset parametric regression ('FALSE'). If 'TRUE', the design matrix passed to the regression algorithm/learner for 'Y' is 'cbind(W,A*V)' where 'V = model.matrix(formula, as.data.frame(W))' is the design matrix specified by the argument `formula`. Therefore, it may not be necessary to use learners that model (treatment) interactions when this argument is TRUE. The resulting estimators are projected onto the semiparametric model, ensuring compatibility with the statistical model assumptions. In high dimensions, `append_interaction_matrix = FALSE` may be preferred to prevent dilution of the treatment interactions in the fitting.
`cross_fit`	Whether to cross-fit the initial estimator. This is always set to FALSE if argument `sl3_Learner_A` or `sl3_Learner_Y` is provided. learning_method = 'SuperLearner' is always cross-fitted (default). learning_method = 'xgboost' and 'ranger' are always cross-fitted regardless of the value of `cross_fit`. All other learning methods are only cross-fitted if 'cross_fit=TRUE'. Note, it is not necessary to cross-fit glm, glmnet, gam or mars as long as the dimension of W is not too high. In smaller samples and lower dimensions, cross-fitting may in fact hurt.
`sl3_Learner_A`	An `sl3` Learner object used to estimate the nuisance function P(A=1|W) with machine-learning. Note, `cross_fit` is automatically set to FALSE if this argument is provided. If you wish to cross-fit the learner, wrap it first: `sl3_Learner_A <- Lrnr_cv$new(sl3_Learner_A)`. Cross-fitting is recommended for all tree-based algorithms like random forests and gradient boosting.
`sl3_Learner_Y`	An `sl3` Learner object used to estimate the nuisance functions E[Y|A=1,W] and E[Y|A=0,W] (depending on method) with machine-learning. Note, `cross_fit` is automatically set to FALSE if this argument is provided. Keep in mind the value of the argument `append_interaction_matrix`. If FALSE, then E[Y|A=0,W] is estimated by itself, so it may not be necessary to add interactions, since treatment interactions are automatic by stratification. If TRUE, the design matrix passed to the pooled learner contains A*V, where V is the design matrix obtained from `formula`. For some learners, it may also be unnecessary to include interactions in this case. If you wish to cross-fit the learner, wrap it first: `sl3_Learner_Y <- Lrnr_cv$new(sl3_Learner_Y)`. Cross-fitting is recommended for all tree-based algorithms like random forests and gradient boosting.
`wrap_in_Lrnr_glm_sp`	Mostly for internal use (should be TRUE usually). Whether `sl3_Learner_Y` should be wrapped in a `Lrnr_glm_semiparametric` object.
`HAL_args_Y0W`	A list of parameters for the semiparametric Highly Adaptive Lasso estimator for E[Y|A=0,W]. Possible parameters are: 1. 'smoothness_orders': Smoothness order for the HAL estimator of E[Y|A=0,W] (see `fit_hal`). 'smoothness_orders = 1' is piecewise linear; 'smoothness_orders = 0' is piecewise constant. 2. 'max_degree': Maximum interaction degree for the HAL estimator of E[Y|A=0,W] (see `fit_hal`). 3. 'num_knots': A vector of the number of knots by interaction degree for the HAL estimator of E[Y|A=0,W] (see `fit_hal`). Used to generate spline basis functions.
`HAL_fit_control`	See the argument 'fit_control' of `fit_hal`.
`sl3_Learner_var_Y`	A `sl3`-Learner for the conditional variance of 'Y'. Only used if 'estimand = "CATE"' and by default is estimated using Poisson-link LASSO regression with 'Lrnr_glmnet'. If conditional variance is constant, set 'sl3_Learner_var_Y = Lrnr_mean$new()'.
`delta_epsilon`	Step size of the iterative targeted maximum likelihood estimator. 'delta_epsilon = 1' leads to large step sizes and fast convergence. 'delta_epsilon = 0.005' leads to slower convergence but possibly better performance. A large value is useful in high dimensions.
`...`	Not used
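
The pooled design matrix described under `append_interaction_matrix` can be sketched in base R as follows. This is an illustration of the `cbind(W, A*V)` construction only, not the package's internal code. The toy data, the formula `~ 1 + W1`, and the names `AV` and `X_pooled` are assumptions made for the example.

```r
# Toy covariates and a binary treatment indicator.
set.seed(2)
n <- 6
W <- data.frame(W1 = rnorm(n), W2 = rnorm(n))
A <- rbinom(n, 1, 0.5)

# V is the design matrix specified by the formula; columns: (Intercept), W1.
formula <- ~ 1 + W1
V <- model.matrix(formula, as.data.frame(W))

# A has length nrow(V), so A * V scales row i of V by A[i]:
# rows of untreated observations become all zeros.
AV <- A * V
colnames(AV) <- paste0("A:", colnames(V))

# Pooled design matrix: covariates plus the treatment interaction columns.
X_pooled <- cbind(as.matrix(W), AV)
print(X_pooled)
```

Because the interaction columns `A:(Intercept)` and `A:W1` are appended explicitly, a learner fitted on `X_pooled` sees the treatment terms directly, which is why the text above notes that learners modelling interactions themselves may be unnecessary in the pooled setting.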

Larsvanderlaan/causalGLM documentation built on April 14, 2022, 12:51 a.m.
