hdlandmark: Compute individual dynamic prediction of clinical endpoint...

hdlandmark    R Documentation

Compute individual dynamic prediction of clinical endpoint using large dimensional longitudinal biomarker history

Description

hdlandmark provides individual survival probabilities using covariates and summaries built from longitudinal biomarker data collected over time. For each biomarker, a set of predictive summaries is computed at the user-specified landmark time tLM, for instance random effects, level, slope and cumulative level. These summaries and the covariates are then used as input to several survival prediction methods: the Cox model (and its penalized extension), sparse Partial Least Squares for survival data, and random survival forests. For each survival prediction method, individual predictions are provided at horizon time tHor.
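To make the summaries more concrete, here is a minimal base R sketch of three of them (level, slope and cumulative level) for a single subject with a linear predicted trajectory. The helper `traj_summaries()`, the coefficients `b0` and `b1`, and the reading of "cumulative level" as the area under the trajectory between time 0 and tLM are assumptions made for illustration; they are not taken from the hdlandmark source.

```r
# Hypothetical helper (not part of hdlandmark): summaries of a linear
# subject-specific trajectory m(t) = b0 + b1 * t at landmark time tLM.
traj_summaries <- function(b0, b1, tLM) {
  m <- function(t) b0 + b1 * t
  c(
    level      = m(tLM),  # predicted marker value at tLM
    slope      = b1,      # derivative of m(t), constant for a linear trend
    # assumed definition: area under m(t) on [0, tLM]
    cumulative = integrate(m, lower = 0, upper = tLM)$value
  )
}

s <- traj_summaries(b0 = 1, b1 = 0.5, tLM = 4)
print(s)
```

In the package itself, the trajectory comes from the fitted longitudinal model of each marker rather than from hand-picked coefficients.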

Usage

hdlandmark(
  data,
  data.pred = NULL,
  markers,
  tLMs,
  tHors,
  subject,
  time,
  time.event,
  event,
  long.method = c("combine", "GLMM", "MFPC"),
  surv.covar = c("baseline", "LOtLM"),
  cox.submodels = c("autoVar", "allVar"),
  coxnet.submodels = c("opt", "lasso", "ridge"),
  penaFG.submodels = c("GCV", "BIC"),
  spls.submodels = c("opt", "nosparse", "maxsparse"),
  rsf.submodels = c("opt", "noVS", "default"),
  rsf.split = c("logrank", "bs.gradient"),
  cause = 1,
  HW = NULL,
  summaries = c("RE", "score", "pred", "slope", "cumulative"),
  kfolds = 1,
  seed = 1234,
  scaling = FALSE,
  SL.weights = NULL,
  nodesize.grid = NULL,
  mtry.grid = NULL
)

Arguments

data

data.frame object containing longitudinal and survival data

data.pred

(optional) data.frame object for predictions. If missing, predictions are made on data using kfolds cross-validation

markers

list containing the modeling of repeated measures for each marker

tLMs

numeric vector of landmark times

tHors

numeric vector of horizon times

subject

variable name in data (and data.pred) that identifies the different subjects

time

variable name in data (and data.pred) which contains time measurements

time.event

variable name in data (and data.pred) which contains time-to-event

event

variable name in data (and data.pred) which contains the event indicator

long.method

character that specifies how to model the longitudinal data. Choices are GLMM for generalized linear mixed models \insertCite{laird_random-effects_1982}{hdlandmark}, MFPC for multivariate functional principal components \insertCite{yao_functional_2005}{hdlandmark} (works only on continuous markers) or combine for both.

surv.covar

character that specifies when the covariates are measured: baseline for baseline values or LOtLM for the last observation before the landmark time

cox.submodels

a character vector containing Cox submodels \insertCite{cox_regression_1972}{hdlandmark}. autoVar for Cox with backward variable selection, allVar for Cox with all variables.

coxnet.submodels

a character vector containing penalized Cox submodels \insertCite{simon_regularization_2011}{hdlandmark}. opt for tuning the elastic net penalty parameter, lasso for lasso penalty and ridge for ridge penalty.

spls.submodels

a character vector containing deviance residuals sparse Partial Least Squares sub-methods \insertCite{bastien_deviance_2015}{hdlandmark}. opt for tuning the sparsity parameter η, nosparse for η = 0 and maxsparse for η = 0.9 \insertCite{@see also @chun_sparse_2010}{hdlandmark}.

rsf.submodels

a character vector containing random survival forests sub-methods \insertCite{ishwaran_random_2008}{hdlandmark}.

rsf.split

a character vector containing the split criterion for random survival forests sub-methods. logrank for log-rank splitting or bs.gradient for gradient-based Brier score splitting.

kfolds

number of folds in cross-validation

seed

(optional) seed number

scaling

boolean to scale summaries (default is FALSE)

SL.weights

(optional) numeric vector of weights, one per sub-method, used to compute individual probabilities from a super learner
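As an illustration of the idea behind SL.weights, the sketch below combines predictions from several sub-methods into one probability per subject by a weighted average. The prediction matrix, the column names and the normalisation of the weights are hypothetical; how hdlandmark applies the weights internally is not described on this page, so treat this as one plausible reading rather than the package's implementation.

```r
# Hypothetical predicted probabilities: one row per subject,
# one column per survival sub-method (names are illustrative only).
pred <- cbind(
  cox  = c(0.90, 0.60),
  rsf  = c(0.80, 0.70),
  spls = c(0.85, 0.50)
)

# Assumed weights, one per sub-method, rescaled to sum to 1.
w <- c(cox = 0.5, rsf = 0.3, spls = 0.2)
w <- w / sum(w)

# Weighted average across sub-methods, matching weights to columns by name.
sl_pred <- as.vector(pred %*% w[colnames(pred)])
print(sl_pred)
```

Matching weights to columns by name avoids silently pairing a weight with the wrong sub-method when the column order changes.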

Value

tLMs

landmark time(s)

tHors

horizon time(s)

models

a list for each landmark time(s):

  • data.surv input data in survival methods for training (only available on the last fold)

  • data.surv.pred input data in survival methods for predicting (only available on the last fold)

  • model.surv output object for the selected survival predictive methods (only available on the last fold)

  • pred.surv for each horizon time(s), containing the individual probabilities for the selected survival predictive methods

  • AUC list of horizon time(s) containing the area under the ROC curve (AUC) for each fold for the selected survival predictive methods

  • BS list of horizon time(s) containing the Brier score (BS) for each fold for the selected survival predictive methods

long.method

method(s) used to model the biomarkers

surv.methods

method(s) used to compute the individual survival prediction

models.name

name of survival prediction methods

kfolds

number of folds
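For readers unfamiliar with the BS entry returned in models, the following is a deliberately simplified sketch of a Brier score at one horizon time. It assumes every subject's status at the horizon is known, so it ignores censoring; the function name `brier_score()` and its inputs are hypothetical, and the estimator used by hdlandmark may differ (for example by weighting for censoring).

```r
# Simplified Brier score at a single horizon, ignoring censoring (assumption).
# surv_prob:  predicted probability of being event-free at the horizon
# event_free: 1 if the subject was observed event-free at the horizon, else 0
brier_score <- function(surv_prob, event_free) {
  stopifnot(length(surv_prob) == length(event_free))
  mean((event_free - surv_prob)^2)
}

bs <- brier_score(surv_prob = c(0.9, 0.2, 0.6), event_free = c(1, 0, 1))
print(bs)
```

Lower values indicate predictions closer to the observed outcomes; a constant prediction of 0.5 gives 0.25.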

Author(s)

Anthony Devaux (anthony.devaux@u-bordeaux.fr) (maintainer), Robin Genuer and Cécile Proust-Lima

References

\insertAllCited

Examples


## Not run: 

data(pbc2)

# Model specification for each biomarker (linear or quadratic trend in year),
# with the matching derivative formulas used for the slope summary
serBilir = list(model = list(fixed = serBilir ~ year,
                             random = ~ year,
                             subject = "id"),
                deriv = list(fixed = ~ 1,
                             indFixed = 2,
                             random = ~ 1,
                             indRandom = 2))

serChol = list(model = list(fixed = serChol ~ year + I(year^2),
                            random = ~ year + I(year^2),
                            subject = "id"),
               deriv = list(fixed = ~ I(2*year),
                            indFixed = c(2,3),
                            random = ~ I(2*year),
                            indRandom = c(2,3)))

albumin = list(model = list(fixed = albumin ~ year,
                            random = ~ year,
                            subject = "id"),
               deriv = list(fixed = ~ 1,
                            indFixed = 2,
                            random = ~ 1,
                            indRandom = 2))

alkaline = list(model = list(fixed = alkaline ~ year,
                             random = ~ year,
                             subject = "id"),
                deriv = list(fixed = ~ 1,
                             indFixed = 2,
                             random = ~ 1,
                             indRandom = 2))

SGOT = list(model = list(fixed = SGOT ~ year,
                         random = ~ year,
                         subject = "id"),
            deriv = list(fixed = ~ 1,
                         indFixed = 2,
                         random = ~ 1,
                         indRandom = 2))

platelets = list(model = list(fixed = platelets ~ year + I(year^2),
                              random = ~ year + I(year^2),
                              subject = "id"),
                 deriv = list(fixed = ~ I(2*year),
                              indFixed = c(2,3),
                              random = ~ I(2*year),
                              indRandom = c(2,3)))

prothrombin = list(model = list(fixed = prothrombin ~ year,
                                random = ~ year,
                                subject = "id"),
                   deriv = list(fixed = ~ 1,
                                indFixed = 2,
                                random = ~ 1,
                                indRandom = 2))

ascites = list(model = ascites ~ year + (1 + year|id),
               deriv = list(fixed = ~ 1,
                            indFixed = 2,
                            random = ~ 1,
                            indRandom = 2))

hepatomegaly = list(model = hepatomegaly ~ year + (1 + year|id),
                    deriv = list(fixed = ~ 1,
                                 indFixed = 2,
                                 random = ~ 1,
                                 indRandom = 2))

spiders = list(model = spiders ~ year + (1 + year|id),
               deriv = list(fixed = ~ 1,
                            indFixed = 2,
                            random = ~ 1,
                            indRandom = 2))

edema = list(model = edema ~ year + (1 + year|id),
             deriv = list(fixed = ~ 1,
                          indFixed = 2,
                          random = ~ 1,
                          indRandom = 2))

marker <- list(serBilir = serBilir, serChol = serChol, albumin = albumin,
               alkaline = alkaline, SGOT = SGOT, platelets = platelets,
               prothrombin = prothrombin, ascites = ascites,
               hepatomegaly = hepatomegaly, spiders = spiders,
               edema = edema)

# compute hdlandmark methodology
hdlandmark.res <- hdlandmark(data = pbc2, data.pred = pbc2, markers = marker,
                             tLMs = 4, tHors = 3,
                             subject = "id", time = "year", time.event = "years", event = "status2",
                             long.method = "GLMM", surv.covar = "baseline",
                             cox.submodels = "allVar",
                             coxnet.submodels = "lasso",
                             spls.submodels = "nosparse",
                             rsf.submodels = "default",
                             rsf.split = c("logrank"))

# get individual predictions for each method
hdlandmark.res$models[["4"]]$pred.surv$`3`

## End(Not run)


anthonydevaux/hdlandmark documentation built on Jan. 11, 2023, 8:01 a.m.