orsf_ice_oob: ORSF Individual Conditional Expectations

orsf_ice_oob    R Documentation

Description

Compute individual conditional expectations for an ORSF model. Unlike partial dependence, which shows the expected prediction as a function of one or more predictors, individual conditional expectations (ICE) show the prediction for an individual observation as a function of a predictor. You can compute individual conditional expectations in three ways using a random forest:

  • using in-bag predictions for the training data

  • using out-of-bag predictions for the training data

  • using predictions for a new set of data

See the examples for more details.
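The difference between ICE and partial dependence can be sketched in base R. This is an illustration only: a linear model on the built-in mtcars data stands in for the forest, and the names fit_lm, wt_grid, ice and pd are hypothetical. Each observation gets its own prediction at every grid value (ICE), and averaging those predictions over observations gives the partial dependence.

```r
# Stand-in model (assumption): a linear model instead of an ORSF fit.
fit_lm <- lm(mpg ~ wt * hp, data = mtcars)

# Values of the predictor at which each observation is re-predicted.
wt_grid <- c(2, 3, 4)

# ICE: for every grid value, overwrite wt for all rows and predict.
ice <- do.call(rbind, lapply(wt_grid, function(w) {
  modified <- mtcars
  modified$wt <- w
  data.frame(
    id_row = seq_len(nrow(mtcars)),
    wt     = w,
    pred   = unname(predict(fit_lm, newdata = modified))
  )
}))

# Partial dependence: the average ICE value at each grid point.
pd <- tapply(ice$pred, ice$wt, mean)
```

The ice table has one row per observation per grid value, while pd has one value per grid value.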

Usage

orsf_ice_oob(
  object,
  pred_spec,
  pred_horizon = NULL,
  pred_type = "risk",
  expand_grid = TRUE,
  boundary_checks = TRUE,
  n_thread = 1,
  ...
)

orsf_ice_inb(
  object,
  pred_spec,
  pred_horizon = NULL,
  pred_type = "risk",
  expand_grid = TRUE,
  boundary_checks = TRUE,
  n_thread = 1,
  ...
)

orsf_ice_new(
  object,
  pred_spec,
  new_data,
  pred_horizon = NULL,
  pred_type = "risk",
  na_action = "fail",
  expand_grid = TRUE,
  boundary_checks = TRUE,
  n_thread = 1,
  ...
)

Arguments

object

(orsf_fit) a trained oblique random survival forest (see orsf).

pred_spec

(named list or data.frame).

  • If pred_spec is a named list, each item in the list should be a vector of values that will be used as points in the partial dependence function. The name of each item in the list should indicate which variable will be modified to take the corresponding values.

  • If pred_spec is a data.frame, columns will indicate variable names, values will indicate variable values, and partial dependence will be computed using the inputs on each row.
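The two accepted forms of pred_spec might look like the following sketch. The variable names bili and age are assumed to be predictors in the training data, and the values are arbitrary.

```r
# Named list: one vector of values per variable to modify.
pred_spec_list <- list(
  bili = c(1, 5, 10),
  age  = c(40, 60)
)

# data.frame: each row is one exact combination of values to evaluate.
pred_spec_df <- data.frame(
  bili = c(1, 10),
  age  = c(40, 60)
)

names(pred_spec_list)  # variables that will be modified
nrow(pred_spec_df)     # number of input rows to evaluate
```

With the data.frame form, only the listed rows are evaluated, here (bili = 1, age = 40) and (bili = 10, age = 60).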

pred_horizon

(double) a value or vector indicating the time(s) that predictions will be calibrated to. E.g., if you were predicting risk of incident heart failure within the next 10 years, then pred_horizon = 10. pred_horizon can be NULL if pred_type is 'mort', since mortality predictions are aggregated over all event times.

pred_type

(character) the type of predictions to compute. Valid options are

  • 'risk' : probability of having an event at or before pred_horizon.

  • 'surv' : 1 - risk.

  • 'chf' : cumulative hazard function.

  • 'mort' : mortality prediction.
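As a rough guide to how these scales relate, the sketch below converts a few made-up risk values. The surv line follows the definition above. The chf line uses the textbook identity H(t) = -log S(t), which is an assumption here: the package may aggregate cumulative hazards differently, so its 'chf' output need not match this exactly.

```r
# Made-up risk predictions at a single prediction horizon.
risk <- c(0.10, 0.25, 0.50)

# 'surv' is defined as 1 - risk.
surv <- 1 - risk

# Textbook relationship (assumption, not necessarily the package's method).
chf_approx <- -log(surv)

data.frame(risk = risk, surv = surv, chf_approx = chf_approx)
```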

expand_grid

(logical) if TRUE, partial dependence will be computed at all possible combinations of inputs in pred_spec. If FALSE, partial dependence will be computed for each variable in pred_spec, separately.
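The effect of expand_grid on the number of evaluation points can be sketched with base R. Here expand.grid() is only a stand-in for illustration and is not claimed to be what the package calls internally; the variable names and values are arbitrary.

```r
pred_spec <- list(
  bili = c(1, 5, 10),
  age  = c(40, 60)
)

# expand_grid = TRUE: every combination of the supplied values.
all_combos <- expand.grid(pred_spec)
n_combined <- nrow(all_combos)          # 3 * 2 = 6 combinations

# expand_grid = FALSE: each variable is varied on its own.
n_separate <- sum(lengths(pred_spec))   # 3 + 2 = 5 settings
```

The number of combinations grows multiplicatively with each added variable, so computing each variable separately can be much cheaper when pred_spec is large.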

boundary_checks

(logical) if TRUE, pred_spec will be checked to make sure the requested values are between the 10th and 90th percentiles in the object's training data. If FALSE, these checks are skipped.
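A check of this kind could be reproduced by hand as in the sketch below. The training values are made up, and the exact quantile method and error handling used by the package are not shown in this page, so treat this as an approximation.

```r
# Made-up training values for one predictor.
train_bili <- 1:101

# 10th and 90th percentiles of the training values.
limits <- unname(quantile(train_bili, probs = c(0.10, 0.90)))

# Values requested in pred_spec for that predictor.
requested <- c(5, 50, 95)

# Flag requested values that fall outside the percentile range.
out_of_bounds <- requested < limits[1] | requested > limits[2]
requested[out_of_bounds]
```

Running a check like this before calling the function shows which values would require boundary_checks = FALSE.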

n_thread

(integer) number of threads to use while computing predictions. Default is one thread. To use the maximum number of threads that your system provides for concurrent execution, set n_thread = 0.

...

Further arguments passed to or from other methods (not currently used).

new_data

a data.frame, tibble, or data.table to compute predictions in.

na_action

(character) what should happen when new_data contains missing values (i.e., NA values). Valid options are:

  • 'fail' : an error is thrown if new_data contains NA values.

  • 'omit' : rows in new_data with incomplete data will be dropped.
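The two behaviours can be mimicked in base R to preview what each option would do to a data set. The helper fail_on_na and the data below are hypothetical and only approximate the package's handling.

```r
# Made-up new data with missing values in two of three rows.
new_data <- data.frame(
  bili = c(1.2, NA, 3.4),
  age  = c(50, 61, NA)
)

# 'omit'-like behaviour: keep only rows with no missing values.
omitted <- new_data[complete.cases(new_data), , drop = FALSE]

# 'fail'-like behaviour: stop as soon as any NA is present.
fail_on_na <- function(data) {
  if (anyNA(data)) stop("new_data contains missing values")
  data
}

nrow(omitted)  # rows that would remain after dropping incomplete rows
```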

Value

a data.table containing individual conditional expectations for the specified variable(s) at the specified prediction horizon(s).

Examples

Begin by fitting an ORSF ensemble:

library(aorsf)

set.seed(329)

fit <- orsf(data = pbc_orsf, formula = Surv(time, status) ~ . - id)

fit
## ---------- Oblique random survival forest
## 
##      Linear combinations: Accelerated Cox regression
##           N observations: 276
##                 N events: 111
##                  N trees: 500
##       N predictors total: 17
##    N predictors per node: 5
##  Average leaves per tree: 21
## Min observations in leaf: 5
##       Min events in leaf: 1
##           OOB stat value: 0.84
##            OOB stat type: Harrell's C-statistic
##      Variable importance: anova
## 
## -----------------------------------------

Use the ensemble to compute ICE values using out-of-bag predictions:

pred_spec <- list(bili = seq(1, 10, length.out = 25))

ice_oob <- orsf_ice_oob(fit, pred_spec, boundary_checks = FALSE)

ice_oob
##       id_variable id_row pred_horizon bili      pred
##    1:           1      1         1788    1 167.86459
##    2:           1      2         1788    1  21.77000
##    3:           1      3         1788    1 118.36972
##    4:           1      4         1788    1  63.11360
##    5:           1      5         1788    1  20.65211
##   ---                                               
## 6896:          25    272         1788   10  61.93365
## 6897:          25    273         1788   10  78.19472
## 6898:          25    274         1788   10  89.93071
## 6899:          25    275         1788   10  56.40274
## 6900:          25    276         1788   10  99.28738
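ICE values for a new set of data can be computed in the same way with orsf_ice_new. The sketch below is hedged: it reuses the first 10 rows of pbc_orsf as stand-in "new" data, fits a smaller forest (n_tree = 50) only to keep the example fast, and picks pred_horizon = 1000 arbitrarily. The object names fit_small, new_rows, ice_new and pd_from_ice are illustrative.

```r
library(aorsf)

set.seed(329)

# Smaller forest than the default, purely to shorten run time.
fit_small <- orsf(
  data    = pbc_orsf,
  formula = Surv(time, status) ~ . - id,
  n_tree  = 50
)

# Stand-in for new data: the first 10 training rows.
new_rows <- pbc_orsf[1:10, ]

ice_new <- orsf_ice_new(
  fit_small,
  pred_spec       = list(bili = c(1, 5, 10)),
  new_data        = new_rows,
  pred_horizon    = 1000,
  boundary_checks = FALSE
)

# Averaging ICE values over observations gives a partial dependence estimate.
pd_from_ice <- tapply(ice_new$pred, ice_new$bili, mean)
```

The result should contain one row per observation per value of bili, so 30 rows for this input.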

Much more detailed examples are given in the vignette.


aorsf documentation built on Oct. 26, 2023, 5:08 p.m.