Lrnr_glm_semiparametric | R Documentation
This learner provides fitting procedures for semiparametric generalized linear models using a specified baseline learner and glm.fit. Models of the form linkfun(E[Y|A,W]) = linkfun(E[Y|A=0,W]) + A * f(W) are supported, where A is a binary or continuous interaction variable, W comprises all covariates in the task excluding the interaction variable, and f(W) is a user-specified parametric function of the non-interaction-variable covariates (e.g., f(W) = model.matrix(formula_sp, W)). The baseline function E[Y|A=0,W] is fit using a user-specified learner, possibly pooled over values of the interaction variable A, and then projected onto the semiparametric model.
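To make the model form concrete, here is a minimal base-R sketch (no sl3 required) of how f(W) arises from a parametric formula; the coefficient vector beta and the baseline values below are hypothetical stand-ins, not quantities produced by the learner:

```r
# With family = gaussian(), linkfun is the identity, so the model is
# E[Y|A,W] = E[Y|A=0,W] + A * f(W), where f(W) = V %*% beta and
# V = model.matrix(formula_sp, W).
set.seed(1)
n <- 5
W <- data.frame(W = runif(n, -1, 1))
A <- rbinom(n, 1, 0.5)
formula_sp <- ~ 1 + W
V <- model.matrix(formula_sp, W) # columns: (Intercept), W
beta <- c(1, 2)                  # hypothetical coefficients of f(W)
baseline <- 0.5 * W$W            # hypothetical E[Y|A=0,W]
EY <- baseline + A * drop(V %*% beta)
```

Note that when A = 0 the interaction term vanishes, so E[Y|A,W] reduces exactly to the baseline function.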
An R6Class object inheriting from Lrnr_base.

A learner object inheriting from Lrnr_base with methods for training and prediction. For a full list of learner functionality, see the complete documentation of Lrnr_base.
formula_sp = NULL: A formula object specifying the parametric function of the non-interaction-variable covariates.
lrnr_baseline: A baseline learner for estimation of the nonparametric component. This can be pooled or unpooled by specifying return_matrix_predictions.
interaction_variable = NULL: The name of an interaction variable present in the task's data, which will be multiplied by the design matrix generated by formula_sp. If NULL (default), the interaction variable is treated as identically 1. When this learner is used for estimation of the outcome regression in an effect estimation procedure (e.g., when using sl3 within the tmle3 package), it is recommended that interaction_variable be set to the name of the treatment variable.
family = NULL: A family object whose link function specifies the type of semiparametric model. For partially-linear least-squares regression, partially-linear logistic regression, and partially-linear log-linear regression, family should be set to gaussian(), binomial(), and poisson(), respectively.
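As a quick base-R illustration (not part of the learner itself), the link function carried by each family object determines the semiparametric model:

```r
# identity link -> partially-linear least-squares regression
# logit link    -> partially-linear logistic regression
# log link      -> partially-linear log-linear (relative-risk) regression
links <- c(gaussian()$link, binomial()$link, poisson()$link)
```

The learner applies the corresponding linkfun to both E[Y|A,W] and E[Y|A=0,W] in the model equation above.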
append_interaction_matrix = TRUE: Whether lrnr_baseline should be fit on cbind(task$X, A*V), where A is the interaction_variable and V is the design matrix obtained from formula_sp. Note that if TRUE (default), the resulting estimator will be projected onto the semiparametric model using glm.fit. If FALSE and interaction_variable is binary, the semiparametric model is learned by stratifying on interaction_variable; specifically, lrnr_baseline is used to estimate E[Y|A=0,W] by subsetting to only observations with A = 0 (i.e., observations with interaction_variable = 0), where W are the other covariates in the task that are not the interaction_variable. In the binary interaction_variable case, setting append_interaction_matrix = TRUE allows one to pool the learning across treatment arms and can enhance the performance of additive models.
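The stratified strategy used when append_interaction_matrix = FALSE can be sketched in base R, substituting glm.fit for an arbitrary lrnr_baseline; the data below are simulated stand-ins:

```r
# Hedged sketch of the stratified baseline fit: E[Y|A=0,W] is estimated
# using only the observations with A == 0, then evaluated on all rows.
set.seed(2)
n <- 500
W <- runif(n, -1, 1)
A <- rbinom(n, 1, 0.5)
Y <- rnorm(n, mean = A + W)
X <- cbind(1, W)                     # baseline design with intercept
fit0 <- glm.fit(X[A == 0, , drop = FALSE], Y[A == 0], family = gaussian())
EY0 <- drop(X %*% fit0$coefficients) # baseline predictions for all rows
```

The semiparametric coefficients are then obtained by regressing Y on A*V with EY0 supplied as an offset, as shown in the examples.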
return_matrix_predictions = FALSE: Whether to return, in the learner's fit_object, a matrix output with three columns: E[Y|A=0,W], E[Y|A=1,W], and E[Y|A,W], where A is the interaction_variable and W are the other covariates in the task that are not the interaction_variable. Only used if the interaction_variable is binary.
...: Any additional parameters that can be considered by Lrnr_base.
Other Learners: Custom_chain, Lrnr_HarmonicReg, Lrnr_arima, Lrnr_bartMachine, Lrnr_base, Lrnr_bayesglm, Lrnr_caret, Lrnr_cv_selector, Lrnr_cv, Lrnr_dbarts, Lrnr_define_interactions, Lrnr_density_discretize, Lrnr_density_hse, Lrnr_density_semiparametric, Lrnr_earth, Lrnr_expSmooth, Lrnr_gam, Lrnr_ga, Lrnr_gbm, Lrnr_glm_fast, Lrnr_glmnet, Lrnr_glmtree, Lrnr_glm, Lrnr_grfcate, Lrnr_grf, Lrnr_gru_keras, Lrnr_gts, Lrnr_h2o_grid, Lrnr_hal9001, Lrnr_haldensify, Lrnr_hts, Lrnr_independent_binomial, Lrnr_lightgbm, Lrnr_lstm_keras, Lrnr_mean, Lrnr_multiple_ts, Lrnr_multivariate, Lrnr_nnet, Lrnr_nnls, Lrnr_optim, Lrnr_pca, Lrnr_pkg_SuperLearner, Lrnr_polspline, Lrnr_pooled_hazards, Lrnr_randomForest, Lrnr_ranger, Lrnr_revere_task, Lrnr_rpart, Lrnr_rugarch, Lrnr_screener_augment, Lrnr_screener_coefs, Lrnr_screener_correlation, Lrnr_screener_importance, Lrnr_sl, Lrnr_solnp_density, Lrnr_solnp, Lrnr_stratified, Lrnr_subset_covariates, Lrnr_svm, Lrnr_tsDyn, Lrnr_ts_weights, Lrnr_xgboost, Pipeline, Stack, define_h2o_X(), undocumented_learner
## Not run:
# simulate some data
set.seed(459)
n <- 200
W <- runif(n, -1, 1)
A <- rbinom(n, 1, plogis(W))
Y_continuous <- rnorm(n, mean = A + W, sd = 0.3)
Y_binary <- rbinom(n, 1, plogis(A + W))
Y_count <- rpois(n, exp(A + W))
data <- data.table::data.table(W, A, Y_continuous, Y_binary, Y_count)
# Make tasks
task_continuous <- sl3_Task$new(
data,
covariates = c("A", "W"), outcome = "Y_continuous"
)
task_binary <- sl3_Task$new(
data,
covariates = c("A", "W"), outcome = "Y_binary"
)
task_count <- sl3_Task$new(
data,
covariates = c("A", "W"), outcome = "Y_count",
outcome_type = "continuous"
)
formula_sp <- ~ 1 + W
# fit partially-linear regression with append_interaction_matrix = TRUE
set.seed(100)
lrnr_glm_sp_gaussian <- Lrnr_glm_semiparametric$new(
formula_sp = formula_sp, family = gaussian(),
lrnr_baseline = Lrnr_glm$new(),
interaction_variable = "A", append_interaction_matrix = TRUE
)
lrnr_glm_sp_gaussian <- lrnr_glm_sp_gaussian$train(task_continuous)
preds <- lrnr_glm_sp_gaussian$predict(task_continuous)
beta <- lrnr_glm_sp_gaussian$fit_object$coefficients
# in this case, since append_interaction_matrix = TRUE, it is equivalent to:
V <- model.matrix(formula_sp, task_continuous$data)
X <- cbind(rep(1, n), task_continuous$data[["W"]], task_continuous$data[["A"]] * V)
X0 <- cbind(rep(1, n), task_continuous$data[["W"]], 0 * V)
colnames(X) <- c("intercept", "W", "A", "A*W")
Y <- task_continuous$Y
set.seed(100)
beta_equiv <- coef(glm.fit(X, Y, family = gaussian()))[c(3, 4)]
# the glm fit is then projected onto the semiparametric model with glm.fit,
# which has no effect in this case
print(beta - beta_equiv)
# fit partially-linear regression with append_interaction_matrix = FALSE
set.seed(100)
lrnr_glm_sp_gaussian <- Lrnr_glm_semiparametric$new(
formula_sp = formula_sp, family = gaussian(),
lrnr_baseline = Lrnr_glm$new(family = gaussian()),
interaction_variable = "A",
append_interaction_matrix = FALSE
)
lrnr_glm_sp_gaussian <- lrnr_glm_sp_gaussian$train(task_continuous)
preds <- lrnr_glm_sp_gaussian$predict(task_continuous)
beta <- lrnr_glm_sp_gaussian$fit_object$coefficients
# in this case, since append_interaction_matrix = FALSE, it is equivalent to
# the following
cntrls <- task_continuous$data[["A"]] == 0 # subset to control arm
V <- model.matrix(formula_sp, task_continuous$data)
X <- cbind(rep(1, n), task_continuous$data[["W"]])
Y <- task_continuous$Y
set.seed(100)
beta_Y0W <- lrnr_glm_sp_gaussian$fit_object$lrnr_baseline$fit_object$coefficients
# subset to control arm
beta_Y0W_equiv <- coef(
glm.fit(X[cntrls, , drop = FALSE], Y[cntrls], family = gaussian())
)
EY0 <- X %*% beta_Y0W
beta_equiv <- coef(glm.fit(A * V, Y, offset = EY0, family = gaussian()))
print(beta_Y0W - beta_Y0W_equiv)
print(beta - beta_equiv)
# fit partially-linear logistic regression
lrnr_glm_sp_binomial <- Lrnr_glm_semiparametric$new(
formula_sp = formula_sp, family = binomial(),
lrnr_baseline = Lrnr_glm$new(), interaction_variable = "A",
append_interaction_matrix = TRUE
)
lrnr_glm_sp_binomial <- lrnr_glm_sp_binomial$train(task_binary)
preds <- lrnr_glm_sp_binomial$predict(task_binary)
beta <- lrnr_glm_sp_binomial$fit_object$coefficients
# fit partially-linear log-link (relative-risk) regression
# the family = poisson() setting requires that lrnr_baseline predicts
# nonnegative values; it is recommended to use Poisson-regression-based
# learners, e.g., Lrnr_glm$new(family = "poisson")
lrnr_glm_sp_poisson <- Lrnr_glm_semiparametric$new(
formula_sp = formula_sp, family = poisson(),
lrnr_baseline = Lrnr_glm$new(family = "poisson"),
interaction_variable = "A",
append_interaction_matrix = TRUE
)
lrnr_glm_sp_poisson <- lrnr_glm_sp_poisson$train(task_count)
preds <- lrnr_glm_sp_poisson$predict(task_count)
beta <- lrnr_glm_sp_poisson$fit_object$coefficients
## End(Not run)