pec2: Clone of pec::pec function


Description

Slightly modified such that start/stop formulas are accepted and processed.

Usage

pec2(
  object,
  formula,
  data,
  traindata,
  times,
  cause,
  start,
  maxtime,
  exact = TRUE,
  exactness = 100,
  fillChar = NA,
  cens.model = "cox",
  ipcw.refit = FALSE,
  ipcw.args = NULL,
  splitMethod = "none",
  B,
  M,
  reference = TRUE,
  model.args = NULL,
  model.parms = NULL,
  keep.index = FALSE,
  keep.matrix = FALSE,
  keep.models = FALSE,
  keep.residuals = FALSE,
  keep.pvalues = FALSE,
  noinf.permute = FALSE,
  multiSplitTest = FALSE,
  testIBS,
  testTimes,
  confInt = FALSE,
  confLevel = 0.95,
  verbose = TRUE,
  savePath = NULL,
  slaveseed = NULL,
  na.action = na.fail,
  ...
)

Arguments

object

A named list of prediction models, where allowed entries are (1) R-objects for which a predictSurvProb method exists (see details), (2) a call that evaluates to such an R-object (see examples), (3) a matrix with predicted probabilities having as many rows as data and as many columns as times. For cross-validation all objects in this list must include their call.

formula

A survival formula as obtained either with prodlim::Hist or survival::Surv. The left hand side is used to find the status response variable in data. For right censored data, the right hand side of the formula is used to specify conditional censoring models. For example, set Surv(time,status)~x1+x2 and cens.model="cox". Then the weights are based on a Cox regression model for the censoring times with predictors x1 and x2. Note that the usual coding is assumed (status=0 for censored times) and that each variable name that appears in formula must be a column name in data. If there are no covariates, i.e. formula=Surv(time,status)~1, the cens.model is coerced to "marginal" and the Kaplan-Meier estimator for the censoring times is used to calculate the weights. If formula is missing, the function tries to extract a formula from the first element in object.

data

A data frame in which to validate the prediction models and to fit the censoring model. If data is missing, try to extract a data set from the first element in object.

traindata

A data frame in which the models are trained. This argument is used only in the absence of cross-validation, in which case it is passed to the predictHandler function predictSurvProb.

times

A vector of time points. At each time point the prediction error curves are estimated. If exact==TRUE the times are merged with all the unique values of the response variable. If times is missing and exact==TRUE all the unique values of the response variable are used. If times is missing and exact==FALSE an equidistant grid of values between start and maxtime is used. The distance between grid points is determined by exactness.
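The rules above for building the evaluation grid can be sketched in base R. This is only an illustration of the documented behaviour, not the package's internal code: make_time_grid and response_times are hypothetical names, and spacing the grid as exactness equal steps between start and maxtime is an assumption about how exactness determines the distance.

```r
# Hypothetical helper, not part of pec or ldatools.
make_time_grid <- function(response_times, times = NULL, exact = TRUE,
                           exactness = 100, start = 0,
                           maxtime = max(response_times)) {
  if (is.null(times)) {
    if (exact) {
      # times missing, exact = TRUE: all unique response values
      grid <- response_times
    } else {
      # times missing, exact = FALSE: equidistant grid (assumed spacing)
      grid <- seq(start, maxtime, length.out = exactness + 1)
    }
  } else {
    # times given: start is ignored; merge with response values if exact
    grid <- if (exact) c(times, response_times) else times
  }
  grid <- sort(unique(grid))
  grid[grid <= maxtime]
}

obs <- c(2, 3, 5, 7, 8)
grid_exact <- make_time_grid(obs, times = c(1, 5))
grid_equi  <- make_time_grid(obs, exact = FALSE, exactness = 4)
```

With the toy response times above, grid_exact contains the requested times merged with the observed values, and grid_equi is a five-point grid from 0 to 8.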

cause

For competing risks, the event of interest. Defaults to the first state of the response, which is obtained by evaluating the left hand side of formula in data.

start

Minimal time for estimating the prediction error curves. If missing and formula defines a Surv or Hist object then start defaults to 0, otherwise to the smallest observed value of the response variable. start is ignored if times are given.

maxtime

Maximal time for estimating the prediction error curves. If missing, the largest value of the response variable is used.

exact

Logical. If TRUE estimate the prediction error curves at all the unique values of the response variable. If times are given and exact=TRUE then the times are merged with the unique values of the response variable.

exactness

An integer that determines how many equidistant gridpoints are used between start and maxtime. The default is 100.

fillChar

Symbol used to fill-in places where the values of the prediction error curves are not available. The default is NA.

cens.model

Method for estimating inverse probability of censoring weights:

cox: A semi-parametric Cox proportional hazard model is fitted to the censoring times

marginal: The Kaplan-Meier estimator for the censoring times

nonpar: Nonparametric extension of the Kaplan-Meier for the censoring times using symmetric nearest neighborhoods – available for arbitrary many strata variables on the right hand side of argument formula but at most one continuous variable. See the documentation of the functions prodlim and neighborhood from the prodlim package.

aalen: The nonparametric Aalen additive model fitted to the censoring times. Requires the timereg package.
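To make the "marginal" option concrete, the following base R sketch computes a Kaplan-Meier estimate of the censoring distribution by treating censored observations (status=0) as the events. It is a minimal illustration under stated assumptions, not the code used by the package: km_censoring is a hypothetical helper, and the sketch assumes that no event time is tied with a censoring time.

```r
# Hypothetical helper, not part of pec or ldatools.
# Kaplan-Meier estimate of G(t) = P(censoring time > t).
# Assumes status == 0 marks censored observations and that event and
# censoring times are never tied.
km_censoring <- function(time, status) {
  cens_times <- sort(unique(time[status == 0]))
  factors <- vapply(cens_times, function(s) {
    at_risk <- sum(time >= s)                 # still under observation at s
    n_cens  <- sum(time == s & status == 0)   # censored exactly at s
    1 - n_cens / at_risk
  }, numeric(1))
  data.frame(time = cens_times, G = cumprod(factors))
}

time   <- c(2, 3, 5, 7, 8)
status <- c(1, 0, 1, 0, 1)
G_hat  <- km_censoring(time, status)
```

In this toy data set the estimate drops to 0.75 at time 3 and to 0.375 at time 7. Inverse probability of censoring weights are built from the reciprocal of such an estimate.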

ipcw.refit

If TRUE the inverse probability of censoring weights are estimated separately in each training set during cross-validation.

ipcw.args

List of arguments passed to function specified by argument cens.model.

splitMethod

Method for estimating the prediction error curves:

none/noPlan: Assess the models in the same data where they are fitted.

boot: DEPRECATED.

cvK: K-fold cross-validation, e.g. cv10 for 10-fold cross-validation. After splitting the data into K subsets, the prediction models (i.e. those specified in object) are fitted to the data omitting the Kth subset (training step). The prediction error is estimated with the Kth subset (validation step).

The random splitting is repeated B times and the estimated prediction error curves are obtained by averaging.

BootCv: Bootstrap cross-validation. The prediction models are trained on B bootstrap samples, which are drawn either with replacement and of the same size as the original data, or without replacement and of size M. The models are assessed in the observations that are NOT in the bootstrap sample.

Boot632: Linear combination of AppErr and BootCvErr using the constant weight .632.

Boot632plus: Linear combination of AppErr and BootCvErr using weights dependent on how the models perform in permuted data.

loocv: Leave one out cross-validation.

NoInf: Assess the models in permuted data.
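The cvK scheme described above, with the random splitting repeated B times, can be sketched in base R as follows. This is an illustration only: make_cv_folds and its seed argument are hypothetical and do not reflect how the package stores its split index.

```r
# Hypothetical helper, not part of pec or ldatools.
# Returns an n x B matrix; entry [i, b] is the fold of observation i
# in repetition b of the random K-fold split.
make_cv_folds <- function(n, K, B = 1, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  replicate(B, sample(rep(seq_len(K), length.out = n)))
}

folds <- make_cv_folds(n = 23, K = 10, B = 3, seed = 1)

# In repetition b, fold k is held out for validation and the
# remaining observations are used for training.
b <- 1
k <- 4
validation_rows <- which(folds[, b] == k)
training_rows   <- which(folds[, b] != k)
```

Every observation appears in exactly one validation fold per repetition, and fold sizes differ by at most one.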

B

Number of bootstrap samples. The default depends on argument splitMethod. When splitMethod is one of c("BootCv","Boot632","Boot632plus") the default is 100. For splitMethod="cvK", B is the number of cross-validation cycles and the default is 1. For splitMethod="none", B is the number of bootstrap simulations, e.g. to obtain bootstrap confidence limits, and the default is 0.

M

The size of the bootstrap samples for resampling without replacement. Ignored for resampling with replacement.
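The difference between resampling with replacement and subsampling of size M without replacement can be sketched in base R. The helper draw_bootcv_index is hypothetical, and using M = NULL to signal "resample with replacement" is an assumption made for this illustration. In both cases the observations outside the drawn sample form the test set.

```r
# Hypothetical helper, not part of pec or ldatools.
draw_bootcv_index <- function(n, B, M = NULL, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  lapply(seq_len(B), function(b) {
    in_bag <- if (is.null(M)) {
      sample.int(n, size = n, replace = TRUE)    # bootstrap, same size as data
    } else {
      sample.int(n, size = M, replace = FALSE)   # subsample of size M
    }
    # Observations NOT in the sample are used to assess the models.
    list(train = in_bag, test = setdiff(seq_len(n), in_bag))
  })
}

boot_splits <- draw_bootcv_index(n = 50, B = 5, seed = 42)
sub_splits  <- draw_bootcv_index(n = 50, B = 5, M = 30, seed = 42)
```

With subsampling, each training set has exactly M distinct observations and each test set has n - M; with the ordinary bootstrap, the test set size varies from sample to sample.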

reference

Logical. If TRUE add the marginal Kaplan-Meier prediction model as a reference to the list of models.

model.args

List of extra arguments that can be passed to the predictSurvProb methods. The list must have an entry for each entry in object.

model.parms

Experimental. List with exactly one entry for each entry in object. Each entry names parts of the value of the fitted models that should be extracted and added to the value.

keep.index

Logical. If FALSE remove the bootstrap or cross-validation index from the output list which otherwise is included in the splitMethod part of the output list.

keep.matrix

Logical. If TRUE add all B prediction error curves from bootstrapping or cross-validation to the output.

keep.models

Logical. If TRUE keep the models in object. Since fitted models can be large objects the default is FALSE.

keep.residuals

Logical. If TRUE keep the patient individual residuals at testTimes.

keep.pvalues

For multiSplitTest. If TRUE keep the p-values from the single splits.

noinf.permute

If TRUE the no-information error is approximated using permutation.

multiSplitTest

If TRUE the test proposed by van de Wiel et al. (2009) is applied. Requires subsampling bootstrap cross-validation, i.e. that splitMethod equals BootCv and that M is specified.

testIBS

A range of time points for testing differences between models in the integrated Brier scores.

testTimes

A vector of time points for testing differences between models in the time-point specific Brier scores.

confInt

Experimental.

confLevel

Experimental.

verbose

If TRUE report details of the progress, e.g. count the steps in cross-validation.

savePath

Place in your file system (i.e., a directory on your computer) where training models fitted during cross-validation are saved. If missing, training models are not saved.

slaveseed

Vector of seeds, as long as B, to be given to the slaves in parallel computing.

na.action

Passed immediately to model.frame. Defaults to na.fail. If set otherwise most prediction models will not work.

...

Not used.


adibender/ldatools documentation built on March 7, 2020, 5:30 a.m.