conditional_multiple_imputation: Conditional multiple imputation

View source: R/conditional_multiple_imputation.R

conditional_multiple_imputation    R Documentation

Conditional multiple imputation

Description

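Performs the first two steps of multiple imputation for a censored covariate: the censored values are imputed and the regression model is fitted on each imputed dataset. Returns the regression fits in a list that can be combined using pool() from the mice package.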

Usage

conditional_multiple_imputation(
  data,
  formula,
  regression_type = c("lm", "glm", "glmer"),
  mi_reps = 10,
  imputation_method = c("km", "km_exp", "km_wei", "km_os", "rs", "mrl", "cc", "pmm"),
  weights = NULL,
  contrasts = NULL,
  family = "binomial",
  id = NULL,
  verbose = FALSE,
  n_obs_min = 2
)

Arguments

data

'data.frame'

formula

The formula for fitting the regression model, with a special syntax for the censored covariate: e.g. 'y~Surv(x,I)' means 'y~x' with 'x' being censored and 'I' the event indicator (0=censored, 1=observed).
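To make the encoding concrete, the following is a minimal sketch of how a censored covariate and its event indicator might be constructed. The variable names 'x_true' and 'cens' and the simulation settings are assumptions for illustration only; they are not part of the package.

```r
set.seed(1)
n <- 8
x_true <- rexp(n, rate = 1)    # true covariate values (unknown in practice)
cens   <- rexp(n, rate = 0.5)  # hypothetical censoring times

dat <- data.frame(
  x = pmin(x_true, cens),          # recorded value: whichever comes first
  I = as.integer(x_true <= cens)   # 1 = observed, 0 = censored
)
dat$y <- 2 + 0.5 * x_true + rnorm(n, sd = 0.1)  # response depends on the true x

# With this encoding, the model 'y ~ x' is written as:
f <- y ~ Surv(x, I)
all.vars(f)  # variables referenced by the formula
```

For a censored row the recorded 'x' is only a lower bound on the true value, which is why the imputation methods described under Details replace it with a larger value.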

regression_type

Character. The regression type to be used: 'lm' for linear regression, 'glm' for generalized linear regression, 'glmer' for generalized linear mixed-effects models. Default: 'lm'

mi_reps

number of repetitions for multiple imputation. Default: 10

imputation_method

Which method to use in the imputation step. One of 'km', 'km_exp', 'km_wei', 'km_os', 'rs', 'mrl', 'cc', 'pmm'. See Details. Default: 'km'

weights

Weights to be used in fitting the regression model. Default = NULL

contrasts

Contrast vector to be used in testing the regression model. Default = NULL

family

The family to be used in the regression model. Default = "binomial". Ignored if a linear model ('lm') is used.

id

name of column containing id of sample

verbose

Logical. Default: FALSE

n_obs_min

Minimum number of observed events needed. Default: 2. An error is thrown if the number of observed events is lower than this value.

Details

Possible methods in 'imputation_method' are:

'km'

Kaplan-Meier imputation is similar to 'rs' (risk set imputation), but the random draw is made according to the survival function of the respective risk set.
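A minimal base R sketch of this idea follows. The helpers 'km_mass' and 'impute_km' are hypothetical and are not the package's implementation. The sketch assumes that the risk set of a censored value consists of all samples with a larger recorded value, and that the Kaplan-Meier probability masses are renormalised when the survival curve does not reach zero.

```r
# Hypothetical helper: Kaplan-Meier probability mass at each event time.
km_mass <- function(x, I) {
  t_ev <- sort(unique(x[I == 1]))
  surv <- 1
  mass <- numeric(length(t_ev))
  for (j in seq_along(t_ev)) {
    n_risk  <- sum(x >= t_ev[j])            # samples still at risk
    n_event <- sum(x == t_ev[j] & I == 1)   # events at this time
    surv_new <- surv * (1 - n_event / n_risk)
    mass[j] <- surv - surv_new              # size of the drop in S(t)
    surv <- surv_new
  }
  list(time = t_ev, mass = mass)
}

# Hypothetical helper: replace each censored value by a draw from the
# Kaplan-Meier distribution estimated on its risk set.
impute_km <- function(x, I) {
  x_imp <- x
  for (i in which(I == 0)) {
    in_rs <- x > x[i]                       # assumed risk set definition
    km <- km_mass(x[in_rs], I[in_rs])
    if (length(km$time) > 0) {
      # assumption: renormalise the masses so that they sum to one
      k <- sample.int(length(km$time), 1, prob = km$mass / sum(km$mass))
      x_imp[i] <- km$time[k]
    }                                       # otherwise the value is left unchanged
  }
  x_imp
}

set.seed(42)
x <- c(1.2, 0.7, 3.1, 2.4, 0.9, 1.8, 2.9)
I <- c(1,   0,   1,   0,   1,   1,   1)
x_km <- impute_km(x, I)
```

Observed values are left untouched, and every imputed value is an observed event time larger than the censored value it replaces.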

'km_exp'

The same as 'km', but if the largest value is censored, the tail of the survival function is modeled as an exponential distribution whose rate parameter is obtained by fixing the distribution to the last observed value. See (Moeschberger and Klein, 1985).
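One way to read "fixing the distribution to the last observed value" is to choose the rate so that the exponential survival function passes through the Kaplan-Meier estimate at the last observed event time. The sketch below follows that reading. The helper names and the numbers 't_last' and 's_last' are illustrative assumptions, not package output, and the package may anchor the tail differently.

```r
# Hypothetical helper: rate of an exponential survival function exp(-rate * t)
# that passes through the point (t_last, s_last), with 0 < s_last < 1.
exp_tail_rate <- function(t_last, s_last) -log(s_last) / t_last

t_last <- 3.1    # assumed last observed event time
s_last <- 0.25   # assumed Kaplan-Meier estimate at t_last
rate <- exp_tail_rate(t_last, s_last)

# Tail survival function; equals s_last at t_last by construction.
surv_tail <- function(t) exp(-rate * t)

# Draw values beyond t0 from the tail, using the memoryless property of the
# exponential distribution.
draw_tail <- function(n, t0) t0 + rexp(n, rate = rate)

set.seed(3)
tail_draws <- draw_tail(5, t0 = 3.5)
```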

'km_wei'

The same as 'km', but if the largest value is censored, the tail of the survival function is modeled as a Weibull distribution whose parameters are obtained by maximum likelihood (MLE) fitting on the whole data. See (Moeschberger and Klein, 1985).

'km_os'

The same as 'km', but if the largest value is censored, the tail of the survival function is modeled by order statistics. See (Moeschberger and Klein, 1985).

'rs'

Risk Set imputation replaces the censored values with a random draw from the risk set of the respective censored value.
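As a rough illustration, risk set imputation could be sketched in base R as follows. The helper 'impute_rs' is hypothetical. It assumes the risk set of a censored value contains every sample with a larger recorded value (observed or censored) and leaves a censored value unchanged when its risk set is empty; how the package treats these cases is not stated here.

```r
# Hypothetical helper, not part of the package API.
impute_rs <- function(x, I) {
  x_imp <- x
  for (i in which(I == 0)) {
    # assumed risk set: all samples whose recorded value exceeds the censored one
    risk_set <- which(x > x[i])
    if (length(risk_set) > 0) {
      # sample an index rather than a value, so a risk set of size one is safe
      x_imp[i] <- x[risk_set[sample.int(length(risk_set), 1)]]
    }
  }
  x_imp
}

set.seed(42)
x <- c(1.2, 0.7, 3.1, 2.4, 0.9, 1.8)
I <- c(1,   0,   1,   0,   1,   1)
x_rs <- impute_rs(x, I)
```

Repeating the draw for each of the 'mi_reps' imputations yields different completed datasets, which is what makes the procedure a multiple imputation.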

'mrl'

Mean Residual Life (Conditional single imputation from Atem et al. 2017) is a multiple imputation procedure that bootstraps the data and imputes the censored values by replacing them with their respective mean residual life.
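The imputation step can be sketched on a single dataset as below; the bootstrap step is omitted. The helpers 'km_stepfun' and 'mrl' are hypothetical. The sketch assumes the mean residual life at a value c is the area under the Kaplan-Meier curve from c up to the largest recorded value, divided by the survival probability at c, and that the imputed value is c plus this quantity. The estimator used by the package may differ.

```r
# Hypothetical helper: Kaplan-Meier estimate as a right-continuous step function.
km_stepfun <- function(x, I) {
  stopifnot(any(I == 1))  # at least one observed event is required
  t_ev <- sort(unique(x[I == 1]))
  surv <- cumprod(vapply(t_ev, function(t) {
    1 - sum(x == t & I == 1) / sum(x >= t)
  }, numeric(1)))
  stepfun(t_ev, c(1, surv))
}

# Hypothetical helper: mean residual life at c0, restricted to the largest value.
mrl <- function(c0, x, I) {
  S <- km_stepfun(x, I)
  grid <- sort(unique(c(c0, x[x > c0])))
  if (length(grid) < 2 || S(c0) == 0) return(0)
  area <- sum(S(head(grid, -1)) * diff(grid))  # exact, since S is a step function
  area / S(c0)
}

x <- c(1.2, 0.7, 3.1, 2.4, 0.9, 1.8)
I <- c(1,   0,   1,   0,   1,   1)

# Replace each censored value by itself plus its mean residual life.
x_mrl <- x
for (i in which(I == 0)) x_mrl[i] <- x[i] + mrl(x[i], x, I)
```

With fully observed data and c0 below all values, this quantity reduces to the sample mean minus c0, which gives a quick sanity check of the helper.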

'cc'

Complete case (listwise deletion) analysis removes incomplete samples.

'pmm'

Predictive mean matching treats censored values as missing and uses the predictive mean matching method from the mice package.

Value

A list with five elements:

'data'

The input data frame

'betasMean'

the mean regression coefficients

'betasVar'

the variances of the mean regression coefficients

'metadata'

a list of three elements ('mi_reps', 'betas' and 'vars'):

'mi_reps'

number of repetitions in multiple imputation

'betas'

all regression coefficients

'vars'

the variances of the regression coefficients

'fits'

list with all regression fits
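For orientation, per-imputation coefficients and variances such as those stored in 'betas' and 'vars' are commonly combined with Rubin's rules. The sketch below applies these rules to simulated stand-in matrices with one row per imputation; that layout is an assumption, and whether 'betasVar' corresponds to the total variance computed here is not stated in this documentation.

```r
set.seed(7)
m <- 10  # number of imputations, mirroring 'mi_reps'

# Hypothetical stand-ins for per-imputation estimates and their variances
# (rows = imputations, columns = coefficients).
betas <- matrix(rnorm(m * 2, mean = c(1, 0.5), sd = 0.05),
                nrow = m, byrow = TRUE,
                dimnames = list(NULL, c("(Intercept)", "x")))
vars <- matrix(0.01, nrow = m, ncol = 2, dimnames = dimnames(betas))

pooled_mean <- colMeans(betas)        # pooled point estimates
within_var  <- colMeans(vars)         # average within-imputation variance
between_var <- apply(betas, 2, var)   # between-imputation variance
total_var   <- within_var + (1 + 1 / m) * between_var  # Rubin's total variance
```

In practice, passing the 'fits' element to mice::pool() performs the pooling directly on the fitted models.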

References

A Comparison of Several Methods of Estimating the Survival Function When There is Extreme Right Censoring (M. L. Moeschberger and John P. Klein, 1985)

Examples

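 # load the package that provides the functions used below
 library(censcyt)
 # define association: Y depends on the censored covariate X (event indicator I) and on Z
 lm_formula <- formula(Y ~ Surv(X, I) + Z)
 # simulate data
 data <- simulate_singlecluster(100, lm_formula, type = "lm", n_levels_fixeff = 2)
 # run imputation and fitting with the default arguments
 cmi_out <- conditional_multiple_imputation(data, lm_formula)
 # pool fits
 comb_out <- mice::pool(cmi_out$fits)
 # extract the pooled p-values
 pvals <- summary(comb_out)$p.value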
 

retogerber/censcyt documentation built on Feb. 7, 2023, 9:56 a.m.