knitr::opts_chunk$set( collapse = TRUE, comment = "#>", warning = FALSE, message = FALSE )
library(recforest) library(dplyr)
We use the built-in dataset bladder1_recforest
for this example. We build two subsamples of initial data for training and testing the model.
data("bladder1_recforest") id_individuals_bladder1_recforest <- unique(bladder1_recforest$id) train_ids <- sample(id_individuals_bladder1_recforest, size = 100, replace = FALSE) test_ids <- setdiff(id_individuals_bladder1_recforest, train_ids) train_bladder1_recforest <- bladder1_recforest %>% filter(id %in% train_ids) test_bladder1_recforest <- bladder1_recforest %>% filter(id %in% test_ids)
Hyperparameters are user-fixed (to be optimized in real-world settings). Considering the small number of predictors, mtry
was set to 2. For further details on hyperparameters, call ?train_forest
.
set.seed(1234) trained_forest <- train_forest( data = train_bladder1_recforest, id_var = "id", covariates = c("treatment", "number", "size"), time_vars = c("t.start", "t.stop"), death_var = "death", event = "event", n_trees = 3, n_bootstrap = round(2 * length(train_ids) / 3), mtry = 2, minsplit = 3, nodesize = 15, method = "NAa", min_score = 5, max_nodes = 20, seed = 111, parallel = FALSE, verbose = FALSE )
Predictions from recforest model are the expected mean cumulative number of recurrent events for each individual at the end of follow-up. Evaluations on new data based on the 3 metrics (C-index for recurrent events, Integrated MSE for recurrent events and Integrated Score for recurrent events) will be available soon.
predictions <- predict( trained_forest, newdata = test_bladder1_recforest, id_var = "id", covariates = c("treatment", "number", "size"), time_vars = c("t.start", "t.stop"), death_var = "death" )
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.