mlr_graphs_survtoregr_pem | R Documentation |
Wrapper around multiple PipeOps to help in creation of complex survival reduction methods.
pipeline_survtoregr_pem(
learner,
cut = NULL,
max_time = NULL,
graph_learner = FALSE
)
learner |
LearnerRegr |
cut |
|
max_time |
|
graph_learner |
|
A brief mathematical summary of PEMs (see referenced article for more detail):
PED Transformation:
Survival data is converted into piece-wise exponential data (PED) format. Key elements are:
Continuous time is divided into $j = 1, \ldots, J$ intervals for each subject $i = 1, \ldots, n$.
A status variable in each entry indicates whether an event or censoring occurred during that interval. For any subject, data entries are created only up until the interval including the event time.
An offset column is introduced and represents the logarithm of the time a subject spent in any given interval.
For more details, see pammtools::as_ped().
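To make the data layout concrete, the following is a minimal base R sketch of such a transformation. It is an illustration only, not the implementation used by pammtools::as_ped() or by the pipeline; the helper `to_ped()`, the toy data, and the column names `ped_status` and `offset` are hypothetical.

```r
# Toy survival data: `time` is the observed time, `status` is 1 for an event
# and 0 for censoring (made-up values for illustration)
surv <- data.frame(id = 1:3, time = c(2.5, 4.0, 1.2), status = c(1, 0, 1))

# Assumed interval boundaries t_0 < t_1 < ... < t_J
cut <- c(0, 1, 2, 3, 4)

# Hypothetical helper: one row per subject and interval the subject entered
to_ped <- function(surv, cut) {
  rows <- lapply(seq_len(nrow(surv)), function(i) {
    tstart <- head(cut, -1)
    tend <- tail(cut, -1)
    # keep only intervals that start before the subject's observed time
    keep <- tstart < surv$time[i]
    tstart <- tstart[keep]
    tend <- tend[keep]
    # time the subject spent in each retained interval
    exposure <- pmin(tend, surv$time[i]) - tstart
    # status is 0 everywhere except possibly in the last retained interval
    ped_status <- rep(0, length(tstart))
    ped_status[length(tstart)] <- surv$status[i]
    data.frame(id = surv$id[i], tstart, tend, ped_status, offset = log(exposure))
  })
  do.call(rbind, rows)
}

ped <- to_ped(surv, cut)
print(ped)
```

Subject 1 (event at 2.5) contributes three rows, with the event flagged only in the third interval and an exposure of 0.5 there; the censored subject 2 contributes four rows with status 0 throughout.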
Hazard Estimation with PEM: The PED transformation combined with the working assumption
$$\delta_{ij} \stackrel{\text{iid}}{\sim} \text{Poisson} \left( \mu_{ij} = \lambda_{ij} t_{ij} \right),$$
where $\delta_{ij}$ denotes the event or censoring indicator and $t_{ij}$ the time subject $i$ spent in interval $j$, allows framing the problem of piecewise constant hazard estimation as a Poisson regression with offset.
Specifically, we want to estimate
$$\lambda(t \mid \mathbf{x}_i) := \exp(g(\mathbf{x}_{i}, t_{j})), \quad \forall t \in [t_{j-1}, t_{j}), \quad i = 1, \dots, n.$$
$g(\mathbf{x}_{i}, t_{j})$ is a general function of features $\mathbf{x}$ and $t$, i.e. a learner, and may include non-linearity and complex feature interactions.
Two important prerequisites of the learner are its capacity to model a Poisson likelihood and accommodate the offset.
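As a rough illustration of the Poisson-with-offset idea, the sketch below fits a plain `stats::glm()` to simulated PED data. It is not the pipeline's code: the simulated data, the cut points, and the column names `interval`, `ped_status` and `log_exposure` are assumptions made for this example. With the interval as the only covariate, the fitted hazards should coincide with the number of events divided by the total time at risk in each interval.

```r
set.seed(1)

# Simulated data: exponential event times, administrative censoring at t = 3
n <- 200
time <- rexp(n, rate = 0.5)
status <- as.integer(time <= 3)
time <- pmin(time, 3)
cut <- c(0, 1, 2, 3)

# Build PED rows: one row per subject and interval the subject entered
ped <- do.call(rbind, lapply(seq_len(n), function(i) {
  tstart <- head(cut, -1)
  tend <- tail(cut, -1)
  keep <- tstart < time[i]
  k <- sum(keep)
  exposure <- pmin(tend[keep], time[i]) - tstart[keep]
  data.frame(
    interval = factor(tend[keep], levels = tail(cut, -1)),
    ped_status = c(rep(0, k - 1), status[i]),
    log_exposure = log(exposure)
  )
}))

# Poisson regression with log-exposure offset; one coefficient per interval
fit <- glm(ped_status ~ 0 + interval, family = poisson(), data = ped,
  offset = log_exposure)
hazard_glm <- unname(exp(coef(fit)))

# Closed-form piecewise constant hazard: events / time at risk per interval
hazard_direct <- as.numeric(
  tapply(ped$ped_status, ped$interval, sum) /
    tapply(exp(ped$log_exposure), ped$interval, sum)
)

print(cbind(hazard_glm, hazard_direct))
```

Replacing the interval factor with a flexible function of time and features is what a learner such as a GAM or a boosted model would contribute.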
From Piecewise Hazards to Survival Probabilities: Lastly, the computed hazards are back-transformed to survival probabilities via the following identity
$$S(t \mid \mathbf{x}) = \exp \left( - \int_{0}^{t} \lambda(s \mid \mathbf{x}) \, ds \right) = \exp \left( - \sum_{j = 1}^{J} \lambda(j \mid \mathbf{x}) \, d_j \right),$$
where $d_j$ specifies the duration of interval $j$.
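The identity can be checked numerically with a few lines of base R. This is a sketch with made-up hazard values and cut points; the helper `surv_at()` is hypothetical and is not part of the package. For a time point inside an interval, only the time actually spent in each interval up to that point is assumed to contribute to the sum.

```r
# Assumed interval boundaries and piecewise constant hazards (made-up values)
cut <- c(0, 1, 2, 3)
hazard <- c(0.2, 0.5, 0.1)

# Survival probabilities at the interval end points:
# cumulative hazard is the running sum of hazard times interval duration d_j
d <- diff(cut)
surv_grid <- exp(-cumsum(hazard * d))

# Hypothetical helper for an arbitrary time point t
surv_at <- function(t, cut, hazard) {
  tstart <- head(cut, -1)
  tend <- tail(cut, -1)
  # time spent in each interval up to t (zero for intervals starting after t)
  exposure <- pmax(0, pmin(t, tend) - tstart)
  exp(-sum(hazard * exposure))
}

print(surv_grid)
print(surv_at(2.5, cut, hazard))
```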
The previous considerations are reflected in the pipeline, which consists of the following steps:
PipeOpTaskSurvRegrPEM converts TaskSurv to a TaskRegr.
A LearnerRegr is fit and predicted on the new TaskRegr.
PipeOpPredRegrSurvPEM transforms the resulting PredictionRegr to PredictionSurv.
mlr3pipelines::Graph or mlr3pipelines::GraphLearner
Bender, Andreas, Groll, Andreas, Scheipl, Fabian (2018). “A generalized additive model approach to time-to-event analysis.” Statistical Modelling, 18(3-4), 299–321. https://doi.org/10.1177/1471082X17748083.
Other pipelines: mlr_graphs_crankcompositor, mlr_graphs_distrcompositor, mlr_graphs_probregr, mlr_graphs_responsecompositor, mlr_graphs_survaverager, mlr_graphs_survbagging, mlr_graphs_survtoclassif_IPCW, mlr_graphs_survtoclassif_disctime
## Not run:
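library(mlr3)
library(mlr3proba)         # survival tasks such as tsk("lung") and the "survtoregr_pem" pipeline
library(mlr3learners)      # regr.glmnet, regr.xgboost
library(mlr3extralearners) # regr.gam
library(mlr3pipelines)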
task = tsk("lung")
part = partition(task)
# typically the model formula and feature types are extracted from the task
learner = lrn("regr.gam", family = "poisson")
grlrn = ppl(
  "survtoregr_pem",
  learner = learner,
  graph_learner = TRUE
)
grlrn$train(task, row_ids = part$train)
grlrn$predict(task, row_ids = part$test)
# In some instances special formulas can be specified in the learner
learner = lrn("regr.gam", family = "poisson",
  formula = pem_status ~ s(tend) + s(age) + meal.cal)
grlrn = ppl(
  "survtoregr_pem",
  learner = learner,
  graph_learner = TRUE
)
grlrn$train(task, row_ids = part$train)
grlrn$predict(task, row_ids = part$test)
# if necessary, encode data before passing to the learner with e.g. po("encode"), po("modelmatrix"), etc.
# with po("modelmatrix"), feature types and formula can be adjusted at the same time
cut = round(seq(0, max(task$data()$time), length.out = 20))
learner = as_learner(
  po("modelmatrix", formula = ~ as.factor(tend) + .) %>>%
    lrn("regr.glmnet", family = "poisson", lambda = 0)
)
grlrn = ppl(
  "survtoregr_pem",
  learner = learner,
  cut = cut,
  graph_learner = TRUE
)
grlrn$train(task, row_ids = part$train)
grlrn$predict(task, row_ids = part$test)
# xgboost regression learner
learner = as_learner(
  po("modelmatrix", formula = ~ .) %>>%
    lrn("regr.xgboost", objective = "count:poisson", nrounds = 100, eta = 0.1)
)
grlrn = ppl(
  "survtoregr_pem",
  learner = learner,
  graph_learner = TRUE
)
grlrn$train(task, row_ids = part$train)
grlrn$predict(task, row_ids = part$test)
## End(Not run)