mipred: Prediction using multiple imputation

Calculates predictions from generalized linear models when multiple imputations are used to account for missing values in predictor data.

mipred(formula, family, data, newdata, nimp, folds = NULL,
  method = "averaging", mice.options = NULL)

`formula`	A formula object providing a symbolic description of the prediction model to be fitted.
`family`	Specification of an appropriate error distribution and link function.
`data`	A data.frame containing calibration data on `n` samples. Variables declared in `formula` must be found in `data`.
`newdata`	A data.frame containing the predictors for the `m` observations to be predicted. This must have the same structure and variables as `data`, except for the outcome variable, which is ignored when constructing the predictions and can therefore be excluded from the object.
`nimp`	Number of imputations used in the prediction of each observation.
`folds`	Number of fold-partitions defined within `newdata`. An integer from 1 to `m`. Defaults to NULL which internally sets `folds=m`, which puts each observation in `newdata` into its own singleton fold. The minimum value `folds=1` would predict the entire set `newdata` in a single step without partitioning.
`method`	Imputation combination method. This defaults to `"averaging"` for the prediction-averaging approach. The alternative `"rubin"` predicts from the model pooled using Rubin's rules.
`mice.options`	Optional list of arguments to be passed to `mice`. The following options may be specified: `method`, `predictorMatrix`, `blocks`, `visitSequence`, `formulas`, `blots`, `post`, `defaultMethod`, `maxit`, `printFlag`, `seed`, `data.init`; refer to the `mice` documentation for their description. To set the number of imputations, use `nimp` instead. `seed` may be a single numeric value, in which case every call to `mice` uses that same seed, or a numeric vector, in which case each successive call to `mice` uses the next value in the vector. The required vector length is `folds*nimp` when `method` is `"averaging"` and `folds` when `method` is `"rubin"`; a vector of insufficient length is recycled. By default, `defaultMethod` is `c("pmm", "logreg", "polyreg", "polr")`, `printFlag` is FALSE, and `maxit` is 50. All other options default to `NULL`.
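As an illustration of the seed handling described for `mice.options`, the following minimal sketch builds an options list with one seed per call to `mice`. The values of `nimp` and `folds`, the seed values, and the object names are hypothetical and chosen only for illustration.

```r
# Hypothetical settings, chosen only for illustration
nimp  <- 5   # imputations per observation
folds <- 4   # fold-partitions of newdata

# One seed per call to mice:
# folds * nimp values for method = "averaging", folds values for method = "rubin"
seeds_averaging <- seq_len(folds * nimp)
seeds_rubin     <- seq_len(folds)

# Options list for the prediction-averaging approach
mice_opts <- list(maxit = 10, printFlag = FALSE, seed = seeds_averaging)

length(mice_opts$seed)  # number of seeds supplied
```

The list would then be supplied through the `mice.options` argument of `mipred`, together with matching `nimp` and `folds` values.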

A list with 3 components: the first is the function call, and the other two are matrices of predictions, as follows.

pred: Matrix of predictions on the scale of the response variable of dimension m by nimp.
linpred: Matrix of predictions on the scale of the linear predictor of dimension m by nimp.
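Because both matrices hold one column per imputation, a single prediction per observation can be obtained by summarizing across columns. The sketch below uses a simulated matrix as a stand-in for `linpred`, and an inverse logit as would apply for a binomial family with the default link. The choice of mean or median as the summary is an assumption made for illustration, not something this page prescribes.

```r
set.seed(1)
m    <- 4   # number of observations in newdata (hypothetical)
nimp <- 5   # number of imputations (hypothetical)

# Simulated stand-in for the linpred component: m rows, nimp columns
linpred <- matrix(rnorm(m * nimp), nrow = m, ncol = nimp)

# Response-scale predictions via the inverse logit
pred <- plogis(linpred)

# One summary prediction per observation, taken across imputations
pred_mean   <- rowMeans(pred)
pred_median <- apply(pred, 1, median)
```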

Bart J A Mertens, b.mertens@lumc.nl

https://arxiv.org/abs/1810.05099

mice

library(mipred)  # provides mipred() and the cll data

# Generate a copy of the cll data and construct a binary outcome from survival information
cll_bin <- cll
cll_bin$srv5y_s[cll_bin$srv5y > 12] <- 0  # Apply administrative censoring at t=12 months
cll_bin$srv5y[cll_bin$srv5y > 12] <- 12
cll_bin$Status[cll_bin$srv5y_s == 1] <- 1  # Define the new binary "Status" outcome variable
cll_bin$Status[cll_bin$srv5y_s == 0] <- 0  # As numeric -> 1:Dead, 0:Alive
cll_bin$Censor <- NULL # Remove survival outcomes
cll_bin$srv5y <- NULL
cll_bin$srv5y_s <- NULL

# Predict observations 501 to 504 using the first 100 records to calibrate predictors
# Remove the identification variable before prediction calibration and imputation.
# Remove outcome for new observations
# Apply prediction-averaging using 5 imputations, set mice option maxit=5.
# Note these settings are only for illustration and should be set to higher values for
# practical use, particularly for nimp.
output <- mipred(Status ~ age10 + cyto, family = binomial, data = cll_bin[1:100, -1],
  newdata = cll_bin[501:504, c(-1, -10)], nimp = 5, mice.options = list(maxit = 5))