mlr_pipeops_trafotask_survclassif_disctime | R Documentation |
Transform TaskSurv to TaskClassif by dividing continuous
time into multiple time intervals for each observation.
This transformation creates a new target variable disc_status that indicates
whether an event occurred within each time interval.
This approach facilitates survival analysis within a classification framework
using discrete time intervals (Tutz and Schmid 2016).
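The expansion described above can be sketched in base R. This is an illustration only, not the package's internal code: the helper expand_disctime() is hypothetical, and the interval convention (left-open, right-closed intervals starting at 0) is an assumption.

```r
# Hypothetical helper, not part of mlr3proba: expand right-censored data
# into one row per (observation, interval) pair.
# Assumption: intervals are (tstart, tend] and the first one starts at 0.
expand_disctime = function(time, status, cut) {
  cut = sort(unique(cut))
  tstart = c(0, head(cut, -1))
  rows = lapply(seq_along(time), function(i) {
    # intervals the observation enters before its observed time
    at_risk = which(tstart < time[i])
    # an event is attributed to the interval that contains the observed time
    event = as.integer(status[i] == 1 & time[i] <= cut[at_risk])
    data.frame(
      id = rep(i, length(at_risk)),
      tend = cut[at_risk],
      disc_status = event
    )
  })
  do.call(rbind, rows)
}

# three toy observations: event at 2, censored at 5, event at 3
toy = expand_disctime(time = c(2, 5, 3), status = c(1, 0, 1), cut = c(2, 3, 5))
print(toy)
```

Each observation contributes one row per interval it was at risk in, so a classifier fitted on disc_status with tend as a feature models the per-interval event probability.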
This PipeOp can be instantiated via the dictionary mlr3pipelines::mlr_pipeops
or with the associated sugar function mlr3pipelines::po():
PipeOpTaskSurvClassifDiscTime$new()
mlr_pipeops$get("trafotask_survclassif_disctime")
po("trafotask_survclassif_disctime")
PipeOpTaskSurvClassifDiscTime has one input channel named "input", and two output channels, one named "output" and the other "transformed_data".
During training, the "output" is the "input" TaskSurv transformed to a TaskClassif.
The target column is named "disc_status" and indicates whether an event occurred
in each time interval.
An additional numeric feature named "tend" contains the end time point of each interval.
Lastly, the "output" task has a column with the original observation ids,
under the role "original_ids".
The "transformed_data" is an empty data.table.
During prediction, the "input" TaskSurv is transformed to the "output"
TaskClassif with "disc_status" as target and the "tend" feature included.
The "transformed_data" is a data.table with the following columns:
"disc_status" (the target of the "output" task), "id" (original observation ids),
"obs_times" (observed times per "id") and "tend" (end time of each interval).
This "transformed_data" is only meant to be used with PipeOpPredClassifSurvDiscTime.
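This page does not describe how PipeOpPredClassifSurvDiscTime consumes the table. As background only, a standard discrete-time identity (see Tutz and Schmid 2016) turns per-interval hazards h_k into survival probabilities via S(t_j) = prod_{k <= j} (1 - h_k). The sketch below applies it per "id" in base R; the column name hazard and the toy values are hypothetical and not produced by the package.

```r
# Toy long-format predictions: one row per (id, interval).
# `hazard` is a hypothetical column holding a classifier's predicted
# probability of an event in the interval ending at `tend`.
pred = data.frame(
  id = c(1, 1, 1, 2, 2, 2),
  tend = c(2, 3, 5, 2, 3, 5),
  hazard = c(0.1, 0.2, 0.5, 0.3, 0.3, 0.3)
)

# survival up to each interval end: cumulative product of (1 - hazard) within id
pred$surv = ave(1 - pred$hazard, pred$id, FUN = cumprod)
print(pred)
```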
The $state contains information about the cut parameter used.
The parameters are:
cut :: numeric()
Split points, used to partition the data into intervals based on the time column.
If unspecified, all unique event times will be used.
If cut is a single integer, it will be interpreted as the number of equidistant
intervals from 0 until the maximum event time.
max_time :: numeric(1)
If cut is unspecified, this will be the last possible event time.
All event times after max_time will be administratively censored at max_time.
Needs to be greater than the minimum event time in the given task.
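The rules for cut and max_time can be sketched in base R as follows. The helper resolve_cut() is hypothetical and returns interval end points. Treating max_time as the end of the final interval is an assumption based on the phrase "last possible event time", not confirmed package behaviour.

```r
# Hypothetical helper illustrating the documented rules; not package code.
resolve_cut = function(time, status, cut = NULL, max_time = NULL) {
  event_times = time[status == 1]
  if (is.null(cut)) {
    # unspecified: all unique event times
    cut = unique(event_times)
    if (!is.null(max_time)) {
      # documented constraint: max_time must exceed the minimum event time
      stopifnot(max_time > min(event_times))
      # assumption: later event times are dropped and max_time closes the grid
      cut = c(cut[cut < max_time], max_time)
    }
  } else if (length(cut) == 1L) {
    # single integer: that many equidistant intervals from 0 to the max event time
    cut = seq(0, max(event_times), length.out = cut + 1)[-1L]
  }
  sort(unique(cut))
}

time = c(2, 5, 3, 8)
status = c(1, 0, 1, 1)
cut_default = resolve_cut(time, status)               # unique event times
cut_four = resolve_cut(time, status, cut = 4)         # four equidistant intervals
cut_capped = resolve_cut(time, status, max_time = 5)  # grid ends at max_time
cut_manual = resolve_cut(time, status, cut = c(6, 1, 4))
```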
Super class: mlr3pipelines::PipeOp -> PipeOpTaskSurvClassifDiscTime
new()
Creates a new instance of this R6 class.
PipeOpTaskSurvClassifDiscTime$new(id = "trafotask_survclassif_disctime")
id (character(1))
Identifier of the resulting object.
clone()
The objects of this class are cloneable with this method.
PipeOpTaskSurvClassifDiscTime$clone(deep = FALSE)
deep
Whether to make a deep clone.
Tutz, Gerhard, Schmid, Matthias (2016). Modeling Discrete Time-to-Event Data, series Springer Series in Statistics. Springer International Publishing. ISBN 978-3-319-28156-8 978-3-319-28158-2, http://link.springer.com/10.1007/978-3-319-28158-2.
See also: pipeline_survtoclassif_disctime
Other Transformation PipeOps:
mlr_pipeops_trafopred_classifsurv_IPCW,
mlr_pipeops_trafopred_classifsurv_disctime,
mlr_pipeops_trafopred_regrsurv_pem,
mlr_pipeops_trafotask_survclassif_IPCW,
mlr_pipeops_trafotask_survregr_pem
## Not run:
library(mlr3)
library(mlr3proba) # provides TaskSurv, tsk("lung") and this PipeOp
library(mlr3learners)
library(mlr3pipelines)
task = tsk("lung")
# transform the survival task to a classification task
# all unique event times are used as cutpoints
po_disc = po("trafotask_survclassif_disctime")
task_classif = po_disc$train(list(task))[[1L]]
# the end time points of the discrete time intervals
unique(task_classif$data(cols = "tend"))[[1L]]
# train a classification learner
learner = lrn("classif.log_reg", predict_type = "prob")
learner$train(task_classif)
## End(Not run)