mipred: Prediction using multiple imputation



Description

Calculates predictions from generalized linear models when multiple imputations are used to account for missing values in predictor data.

Usage

mipred(formula, family, data, newdata, nimp, folds = NULL,
  method = "averaging", mice.options = NULL)

Arguments

formula

A formula object providing a symbolic description of the prediction model to be fitted.

family

Specification of an appropriate error distribution and link function.

data

A data.frame containing calibration data on n samples. Variables declared in formula must be found in data.

newdata

A data.frame containing the predictors for the m observations to be predicted. This must have the same structure and variables as data, except for the outcome variable, which is ignored in the construction of the predictions and can therefore be excluded from the object.

nimp

Number of imputations used in the prediction of each observation.

folds

Number of fold-partitions defined within newdata: an integer from 1 to m. The default NULL internally sets folds=m, which places each observation in newdata in its own singleton fold. The minimum value folds=1 predicts the entire set newdata in a single step without partitioning.
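The effect of folds can be pictured with a small base R sketch. The helper names resolve_folds and assign_folds are hypothetical and not part of mipred, and the cyclic assignment of observations to folds is only an assumption for illustration; this page does not state how mipred allocates observations to folds.

```r
# Hypothetical helpers, not part of mipred.
# resolve_folds() mimics the documented default: folds = NULL becomes folds = m.
resolve_folds <- function(folds, m) {
  if (is.null(folds)) folds <- m
  stopifnot(folds >= 1, folds <= m)
  folds
}

# Assumption: observations are assigned to folds cyclically (1, 2, ..., folds, 1, ...).
assign_folds <- function(m, folds = NULL) {
  rep_len(seq_len(resolve_folds(folds, m)), m)
}

m <- 7  # number of rows in a hypothetical newdata

singleton <- split(seq_len(m), assign_folds(m, NULL))  # default: one observation per fold
single    <- split(seq_len(m), assign_folds(m, 1))     # folds = 1: all observations together
three     <- split(seq_len(m), assign_folds(m, 3))     # intermediate choice

lengths(singleton)
lengths(single)
lengths(three)
```

With the default, every fold holds exactly one observation; with folds=1 a single fold holds all m observations.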

method

Imputation combination method. Defaults to "averaging", the prediction-averaging approach. The alternative, "rubin", applies the model pooled with Rubin's rules.

mice.options

Optional list of arguments to be passed to mice. The following options may be specified: method, predictorMatrix, blocks, visitSequence, formulas, blots, post, defaultMethod, maxit, printFlag, seed, data.init. Refer to the mice documentation for a description of these options. The number of imputations cannot be set here; use nimp instead.

seed may be a single numeric value or a numeric vector. A single value causes every call to mice to use that same seed. A vector causes each successive call to mice to use the next value in the vector. The required vector length is nimp*folds when method is "averaging" and folds when method is "rubin"; a vector of insufficient length is recycled.

By default, defaultMethod is c("pmm", "logreg", "polyreg", "polr"), printFlag is FALSE and maxit is 50. All other options default to NULL.
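As a minimal sketch of the seed lengths described above, the following base R code builds a seed vector of the required length and places it in a list suitable for mice.options. The helper required_seed_length is hypothetical and not part of mipred or mice; the values of nimp and folds are illustrative only.

```r
# Hypothetical helper, not part of mipred: seed vector length per the text above.
required_seed_length <- function(nimp, folds, method = c("averaging", "rubin")) {
  method <- match.arg(method)
  if (method == "averaging") nimp * folds else folds
}

nimp  <- 5
folds <- 4  # e.g. the default folds = NULL with m = 4 rows in newdata

# Draw one distinct seed for each expected call to mice
set.seed(2019)
seeds <- sample.int(1e6, required_seed_length(nimp, folds, "averaging"))

# List in the form accepted by the mice.options argument
mice_opts <- list(maxit = 5, seed = seeds)
length(mice_opts$seed)
```

The list could then be supplied as mice.options=mice_opts in a call to mipred that uses the same nimp and folds.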

Value

A list consisting of 3 components: the first is the call, and the last two are matrices of predictions, as follows.

pred

Matrix of dimension m by nimp containing predictions on the scale of the response variable.

linpred

Matrix of dimension m by nimp containing predictions on the scale of the linear predictor.
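Because each row of the returned matrices holds nimp predictions for one observation, a single prediction per observation can be obtained by summarizing across columns. The sketch below uses made-up stand-in matrices rather than real mipred output, and assumes a binomial family with logit link; averaging with rowMeans is one possible summary, not necessarily the one intended by the package author.

```r
# Stand-ins for output$pred and output$linpred: m = 4 observations, nimp = 5.
# The values are simulated for illustration only.
set.seed(1)
pred    <- matrix(runif(4 * 5, min = 0.05, max = 0.95), nrow = 4, ncol = 5)
linpred <- qlogis(pred)  # assumption: logit link, so linpred = logit(pred)

# One summary prediction per observation: average across the nimp columns
pred_avg    <- rowMeans(pred)     # averaged on the response (probability) scale
linpred_avg <- rowMeans(linpred)  # averaged on the linear predictor scale

# The two averages generally differ once mapped to the same scale
cbind(pred_avg, from_linpred = plogis(linpred_avg))
```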

Author(s)

Bart J A Mertens, b.mertens@lumc.nl

References

https://arxiv.org/abs/1810.05099

See Also

mice

Examples

library(mipred)  # provides mipred() and the cll example data

# Generate a copy of the cll data and construct binary outcome from survival information
cll_bin <- cll
cll_bin$srv5y_s[cll_bin$srv5y > 12] <- 0   # Apply administrative censoring at t=12 months
cll_bin$srv5y[cll_bin$srv5y > 12] <- 12
cll_bin$Status[cll_bin$srv5y_s == 1] <- 1  # Define the new binary "Status" outcome variable
cll_bin$Status[cll_bin$srv5y_s == 0] <- 0  # As numeric -> 1:Dead, 0:Alive
cll_bin$Censor <- NULL                     # Remove survival outcomes
cll_bin$srv5y <- NULL
cll_bin$srv5y_s <- NULL

# Predict observations 501 to 504 using the first 100 records to calibrate predictors.
# Remove the identification variable before prediction calibration and imputation.
# Remove the outcome for the new observations.
# Apply prediction-averaging using 5 imputations, set mice option maxit=5.
# Note these settings are only for illustration and should be set to higher values for
# practical use, particularly for nimp.
output <- mipred(Status ~ age10 + cyto, family = binomial, data = cll_bin[1:100, -1],
  newdata = cll_bin[501:504, c(-1, -10)], nimp = 5, mice.options = list(maxit = 5))

mipred documentation built on July 12, 2019, 5:04 p.m.