TaskSurv: Survival Task
In mlr-org/mlr3proba: Probabilistic Supervised Learning for 'mlr3'

TaskSurv

R Documentation

Survival Task

Description

This task specializes mlr3::Task and mlr3::TaskSupervised for possibly-censored survival problems. The target comprises survival times and an event indicator. Predefined tasks are stored in mlr3::mlr_tasks.

The task_type is set to "surv".

Super classes

mlr3::Task -> mlr3::TaskSupervised -> TaskSurv

Active bindings

censtype: (character(1))
Returns the type of censoring, one of "right", "left", "counting", "interval", "interval2" or "mstate". Currently, only the "right"-censoring type is fully supported; the other types are experimental and their API will change in the future.

Methods


Method `new()`

Creates a new instance of this R6 class.

Usage

TaskSurv$new(
  id,
  backend,
  time = "time",
  event = "event",
  time2,
  type = c("right", "left", "interval", "counting", "interval2", "mstate"),
  label = NA_character_
)

Arguments

id: (character(1))
Identifier for the new instance.
backend: (DataBackend)
Either a DataBackend, or any object which is convertible to a DataBackend with as_data_backend(). E.g., a data.frame() will be converted to a DataBackendDataTable.
time: (character(1))
Name of the column holding the event time if the data is right censored, or the starting time if the data is interval censored.
event: (character(1))
Name of the column giving the event indicator. If data is right censored then "0"/FALSE means alive (no event), "1"/TRUE means dead (event). If type is "interval" then "0" means right censored, "1" means dead (event), "2" means left censored, and "3" means interval censored. If type is "interval2" then event is ignored.
time2: (character(1))
Name of the column for ending time of the interval for interval censored or counting process data, otherwise ignored.
type: (character(1))
Type of censoring, one of "right", "left", "interval", "counting", "interval2" or "mstate". Default is "right".
label: (character(1))
Label for the new instance.

Details

Depending on the censoring type (type), a survival task's $target_names is a character() vector holding the names of the columns passed to the initialization arguments above. The names are returned in the following order:

For type = "right", "left" or "mstate": ("time", "event")
For type = "interval" or "counting": ("time", "time2", "event")
For type = "interval2": ("time", "time2")
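
As a rough illustration of the right-censored case, the sketch below builds a task from a small made-up data.frame. The column names (time, status, age) and the id "demo" are invented for this example; only the TaskSurv$new() arguments listed in the Usage above are relied on.

```r
library(mlr3)
library(mlr3proba)

# Toy data, made up for illustration: status is 1 = event, 0 = censored
df <- data.frame(
  time   = c(5, 8, 12, 3, 9),
  status = c(1L, 0L, 1L, 0L, 1L),
  age    = c(61, 70, 55, 48, 66)
)

# Right censoring is the default type, so time2 is not needed
task <- TaskSurv$new(id = "demo", backend = df, time = "time", event = "status")

task$target_names   # names of the time and event columns, in that order
task$feature_names  # the remaining column(s)
```

Because the column names are passed explicitly, the data does not have to use the default names "time" and "event".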

Method `truth()`

True response for specified row_ids. This is the survival outcome using the Surv format and depends on the censoring type. Defaults to all rows with role "use".

Usage

TaskSurv$truth(rows = NULL)

Arguments

rows: (integer())
Row indices.

Returns

survival::Surv().
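
To give a feel for the returned format, here is a minimal sketch of a right-censored survival::Surv() object built directly with the survival package. The vectors are made up; this shows the kind of object truth() returns, not how the method is implemented.

```r
library(survival)

# Made-up outcome: 1 = event observed, 0 = censored
time  <- c(5, 8, 12, 3)
event <- c(1, 0, 1, 0)

y <- Surv(time, event)  # right censoring is the default for two arguments

print(y)         # censored observations are marked with a trailing "+"
attr(y, "type")  # censoring type stored on the object
colnames(y)      # a Surv object is a matrix with named columns
```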

Method `formula()`

Creates a formula for survival models with survival::Surv() on the LHS (left hand side).

Usage

TaskSurv$formula(rhs = NULL, reverse = FALSE)

Arguments

rhs: Right-hand side (RHS) of the formula. If NULL, the RHS is "."; otherwise it is the value of rhs.
reverse: If TRUE, the formula is built with 1 - status as the event indicator.

Returns

stats::formula().
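
The behaviour of rhs and reverse can be sketched with a small helper. surv_formula() is a hypothetical function written for this illustration and is not part of mlr3proba; it assumes right-censored data with one time and one event column.

```r
# Hypothetical helper mimicking the documented behaviour of rhs and reverse
surv_formula <- function(time, event, rhs = NULL, reverse = FALSE) {
  # With reverse = TRUE the event indicator is flipped to 1 - status
  status <- if (reverse) paste0("1 - ", event) else event
  lhs <- sprintf("Surv(%s, %s)", time, status)
  # NULL means "use all remaining columns", written as "."
  rhs <- if (is.null(rhs)) "." else rhs
  stats::as.formula(paste(lhs, "~", rhs))
}

f     <- surv_formula("time", "status")
f_rev <- surv_formula("time", "status", rhs = "age + sex", reverse = TRUE)

f      # Surv(time, status) ~ .
f_rev  # Surv(time, 1 - status) ~ age + sex
```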

Method `times()`

Returns the (unsorted) outcome times.

Usage

TaskSurv$times(rows = NULL)

Arguments

rows: (integer())
Row indices.

Returns

numeric()

Method `status()`

Returns the event indicator (aka censoring/survival indicator). If censtype is "right" or "left" then 1 is event and 0 is censored. If censtype is "mstate" then 0 is censored and all other values are different events. If censtype is "interval" then 0 is right-censored, 1 is event, 2 is left-censored, 3 is interval-censored. See survival::Surv().

Usage

TaskSurv$status(rows = NULL)

Arguments

rows: (integer())
Row indices.

Returns

integer()

Method `unique_times()`

Returns the sorted unique outcome times for "right", "left" and "mstate" types of censoring.

Usage

TaskSurv$unique_times(rows = NULL)

Arguments

rows: (integer())
Row indices.

Returns

numeric()

Method `unique_event_times()`

Returns the sorted unique event (or failure) outcome times for "right", "left" and "mstate" types of censoring.

Usage

TaskSurv$unique_event_times(rows = NULL)

Arguments

rows: (integer())
Row indices.

Returns

numeric()
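
For right-censored data, the difference between unique_times() and unique_event_times() can be sketched in base R. The vectors below are made up, and the code only mirrors the description above rather than the package internals.

```r
# Made-up outcome: 1 = event, 0 = censored
time   <- c(5, 8, 12, 3, 8)
status <- c(1, 0, 1, 0, 1)

# All distinct outcome times, sorted
unique_times <- sort(unique(time))

# Only the times at which an event (not a censoring) was observed
unique_event_times <- sort(unique(time[status == 1]))

unique_times        # 3 5 8 12
unique_event_times  # 5 8 12
```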

Method `risk_set()`

Returns the row_ids of the observations at risk (not dead or censored or had other events in case of multi-state tasks) at the specified time.

Only designed for "right", "left" and "mstate" types of censoring.

Usage

TaskSurv$risk_set(time = NULL)

Arguments

time: (numeric(1))
Time for which to return the risk set. If NULL, all row_ids are returned.

Returns

integer()
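
A minimal base R sketch of a risk set follows, assuming the common convention that an observation is at risk at time t when its observed time is greater than or equal to t. The helper name risk_set_sketch() and the data are made up for illustration.

```r
# Made-up observed times; row ids are simply 1..n here
time    <- c(5, 8, 12, 3, 8)
row_ids <- seq_along(time)

# Hypothetical helper: ids still under observation at time t
risk_set_sketch <- function(t = NULL) {
  if (is.null(t)) row_ids else row_ids[time >= t]
}

risk_set_sketch(8)  # 2 3 5
risk_set_sketch()   # 1 2 3 4 5
```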

Method `kaplan()`

Calls survival::survfit() to calculate the Kaplan-Meier estimator.

Usage

TaskSurv$kaplan(strata = NULL, rows = NULL, reverse = FALSE, ...)

Arguments

strata: (character())
Stratification variables to use.
rows: (integer())
Subset of row indices.
reverse: (logical())
If TRUE calculates Kaplan-Meier of censoring distribution (1-status). Default FALSE.
...: (any)
Additional arguments passed down to survival::survfit.formula().

Returns

survival::survfit.object.
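
The three kinds of fit described above (plain, stratified, and reversed) can be sketched by calling survival::survfit() directly. The data and the column name grp are made up; the formulas are an assumption about what a call with strata or reverse = TRUE roughly corresponds to.

```r
library(survival)

# Made-up right-censored data with one grouping variable
d <- data.frame(
  time   = c(1, 2, 3, 4),
  status = c(1, 0, 1, 1),
  grp    = c("a", "a", "b", "b")
)

# Kaplan-Meier estimate of the survival function
km <- survfit(Surv(time, status) ~ 1, data = d)

# Kaplan-Meier estimate of the censoring distribution (event indicator flipped)
km_cens <- survfit(Surv(time, 1 - status) ~ 1, data = d)

# Stratified estimate, one curve per level of grp
km_by <- survfit(Surv(time, status) ~ grp, data = d)

km$surv       # 0.75 0.75 0.375 0
km_cens$surv  # 1, then 2/3 from t = 2 onwards
```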

Method `reverse()`

Returns the same task with the status variable reversed, i.e., 1 - status. Only designed for "left" and "right" censoring.

Usage

TaskSurv$reverse()

Returns

TaskSurv.

Method `cens_prop()`

Returns the proportion of censoring for this survival task. By default, this is returned for all observations, otherwise only the specified ones (rows).

Only designed for "right" and "left" censoring.

Usage

TaskSurv$cens_prop(rows = NULL)

Arguments

rows: (integer())
Row indices.

Returns

numeric()

Method `admin_cens_prop()`

Returns an estimated proportion of administratively censored observations (i.e. censored at or after a user-specified time point). The main assumption is that in an administratively censored dataset the maximum censoring time is likely close to the maximum event time, so a higher proportion of censored subjects is expected near the study end date.

Only designed for "right" and "left" censoring.

Usage

TaskSurv$admin_cens_prop(rows = NULL, admin_time = NULL, quantile_prob = 0.99)

Arguments

rows: (integer())
Row indices.
admin_time: (numeric(1))
Administrative censoring time (in case it is known a priori).
quantile_prob: (numeric(1))
Quantile probability used to calculate the cutoff time for administrative censoring. Ignored if admin_time is given. By default, quantile_prob is equal to 0.99, which translates to a time point very close to the maximum outcome time in the dataset. A lower value results in an earlier time point and therefore in a more relaxed definition (i.e. higher proportion) of administrative censoring.

Returns

numeric()
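
The two censoring proportions can be sketched in base R as below. admin_cens_prop_sketch() is a hypothetical helper, and it assumes the administrative proportion is taken relative to all censored observations, which the description above does not spell out; the package may also treat the cutoff slightly differently.

```r
# Made-up outcome: 1 = event, 0 = censored; the study ends at t = 10
times  <- c(2, 4, 6, 8, 10, 10, 10, 10)
status <- c(1, 0, 1, 1, 0, 0, 1, 0)

# Overall proportion of censored observations
cens_prop <- mean(status == 0)

# Hypothetical helper: share of censored observations at or after the cutoff
admin_cens_prop_sketch <- function(times, status, admin_time = NULL,
                                   quantile_prob = 0.99) {
  cutoff <- if (is.null(admin_time)) {
    unname(stats::quantile(times, probs = quantile_prob))
  } else {
    admin_time
  }
  n_cens <- sum(status == 0)
  if (n_cens == 0) return(0)
  sum(status == 0 & times >= cutoff) / n_cens
}

cens_prop                                               # 0.5
admin_cens_prop_sketch(times, status, admin_time = 10)  # 0.75
```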

Method `dep_cens_prop()`

Returns the proportion of covariates (task features) that are found to be significantly associated with censoring. This function fits a logistic regression model via glm with the censoring status as the response and using all features as predictors. If a covariate is significantly associated with the censoring status, it suggests that censoring may be informative (dependent) rather than random (non-informative). This methodology is more suitable for low-dimensional datasets where the number of features is relatively small compared to the number of observations.

Only designed for "right" and "left" censoring.

Usage

TaskSurv$dep_cens_prop(rows = NULL, method = "holm", sign_level = 0.05)

Arguments

rows: (integer())
Row indices.
method: (character(1))
Method to adjust p-values for multiple comparisons, see p.adjust.methods. Default is "holm".
sign_level: (numeric(1))
Significance level for each coefficient's p-value from the logistic regression model. Default is 0.05.

Returns

numeric()
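
The procedure described above can be sketched with stats::glm() and stats::p.adjust() on simulated data. The variable names, the simulation, and the use of Wald p-values from summary() are assumptions made for this illustration, not a statement about the package internals.

```r
set.seed(1)
n <- 200

# Simulated features; censoring is made to depend on x1 but not on x2
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$cens <- rbinom(n, 1, plogis(2 * d$x1))

# Logistic regression with the censoring status as the response
fit <- glm(cens ~ ., data = d, family = binomial())

# Per-feature p-values (intercept dropped), adjusted for multiple comparisons
p     <- summary(fit)$coefficients[-1, "Pr(>|z|)"]
p_adj <- p.adjust(p, method = "holm")

# Proportion of features significantly associated with censoring
prop <- mean(p_adj < 0.05)
prop
```

A result above 0 hints that censoring may depend on the covariates, as is the case by construction for x1 here.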

Method `prop_haz()`

Checks whether the data satisfy the proportional hazards (PH) assumption using the Grambsch-Therneau test (Grambsch and Therneau, 1994), as implemented in survival::cox.zph(). This method should be used only for low-dimensional datasets where the number of features is relatively small compared to the number of observations.

Only designed for "right" and "left" censoring.

Usage

TaskSurv$prop_haz()

Returns

numeric()
If no errors, the p-value of the global chi-square test. A p-value < 0.05 is an indication of possible PH violation.
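
The global test can be sketched directly with the survival package, here on its veteran dataset. Fitting a Cox model on all columns is an assumption about what the method roughly does; the feature encoding of the corresponding mlr3proba task may differ.

```r
library(survival)

# Cox proportional hazards model using all remaining columns as predictors
fit <- coxph(Surv(time, status) ~ ., data = veteran)

# Grambsch-Therneau test; the "GLOBAL" row holds the overall chi-square test
zph <- cox.zph(fit)
p_global <- zph$table["GLOBAL", "p"]

p_global  # a value below 0.05 points to a possible PH violation
```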

Method `clone()`

The objects of this class are cloneable with this method.

Usage

TaskSurv$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

References

Grambsch, Patricia, Therneau, Terry (1994). “Proportional hazards tests and diagnostics based on weighted residuals.” Biometrika, 81(3), 515–526. https://doi.org/10.1093/biomet/81.3.515.

Examples

library(mlr3)
library(mlr3proba) # registers the survival tasks, e.g. tsk("lung")
task = tsk("lung")

# meta data
task$target_names # target is always (time, status) for right-censoring tasks
task$feature_names
task$formula()

# survival data
task$truth() # survival::Surv() object
task$times() # (unsorted) times
task$status() # event indicators (1 = death, 0 = censored)
task$unique_times() # sorted unique times
task$unique_event_times() # sorted unique event times
task$risk_set(time = 700) # observation ids that are not censored or dead at t = 700
task$kaplan(strata = "sex") # stratified Kaplan-Meier
task$kaplan(reverse = TRUE) # Kaplan-Meier of the censoring distribution

# proportion of censored observations across the whole dataset
task$cens_prop()
# proportion of censored observations at or after the 95% time quantile
task$admin_cens_prop(quantile_prob = 0.95)
# proportion of variables that are significantly associated with the
# censoring status via a logistic regression model
task$dep_cens_prop() # 0 indicates independent censoring
# data barely satisfies proportional hazards assumption (p > 0.05)
task$prop_haz()
# veteran data is definitely non-PH (p << 0.05)
tsk("veteran")$prop_haz()

mlr-org/mlr3proba documentation built on April 12, 2025, 4:38 p.m.
