prop_usable_cases: Calculates variable-wise proportion of usable cases (missing...

View source: R/prop_usable_cases.R

prop_usable_cases    R Documentation

Calculates variable-wise proportion of usable cases (missing and observed)

Description

Calculates variable-wise proportion of usable cases (missing and observed) as in Molenberghs et al. (2014).

Usage

prop_usable_cases(data)

Arguments

data

dataframe to be imputed

Details

missForest builds a random forest model for each variable, using the observed values of that variable as the outcome. It then imputes the missing part of the variable using the learned model.

If all values of a predictor are missing among the observed values of the outcome, p_obs will be 1 and the model built will rely heavily on the initialized values. If all values of a predictor are observed among the observed values of the outcome, p_obs will be 0 and the model will rely on observed values. Low values of p_obs are preferred.

Similarly, if all values of a predictor are missing among the missing values of the outcome, p_miss will be 0 and the imputations (predictions) will rely heavily on the initialized values. If all values of a predictor are observed among the missing values of the outcome, p_miss will be 1 and the imputations (predictions) will rely on real values. High values of p_miss are preferred.

In both p_obs and p_miss, each row represents a variable to be imputed and each column a predictor.
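
The two proportions can be reproduced by hand on a small data frame. The following is a minimal base-R sketch written from the definitions on this page. It is not the package's own code: the name usable_cases_sketch and the toy data are hypothetical, and leaving NA where a variable has no observed (or no missing) values is an assumption of the sketch.

```r
# Sketch only: recomputes p_obs and p_miss from their definitions.
# usable_cases_sketch() is a hypothetical helper, not part of missForestPredict.
usable_cases_sketch <- function(data) {
  miss <- is.na(as.data.frame(data))   # logical matrix, TRUE where a value is missing
  vars <- colnames(miss)
  p <- length(vars)
  p_obs  <- matrix(NA_real_, p, p, dimnames = list(vars, vars))
  p_miss <- matrix(NA_real_, p, p, dimnames = list(vars, vars))
  for (j in vars) {                    # j: variable to be imputed (rows)
    obs_j  <- !miss[, j]
    miss_j <- miss[, j]
    for (k in vars) {                  # k: predictor (columns)
      # proportion of missing Y_k among observed Y_j
      if (any(obs_j))  p_obs[j, k]  <- mean(miss[obs_j, k])
      # proportion of observed Y_k among missing Y_j
      if (any(miss_j)) p_miss[j, k] <- mean(!miss[miss_j, k])
    }
  }
  list(p_obs = p_obs, p_miss = p_miss)
}

# Made-up toy data, for illustration only
toy <- data.frame(
  a = c(1, NA, 3, NA),
  b = c(NA, 2, 3, 4),
  c = c(1, 2, NA, 4)
)
res <- usable_cases_sketch(toy)
res$p_obs["a", "b"]    # b is missing in 1 of the 2 rows where a is observed
res$p_miss["a", "b"]   # b is observed in both rows where a is missing
```

For variable a with predictor b, the sketch gives p_obs = 0.5 and p_miss = 1: half of the rows used to train the model for a would rely on an initialized value of b, while every imputation of a would use a real value of b.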

Value

a list with two elements: p_obs and p_miss

p_obs

the proportion of missing Y_k among observed Y_j; j in rows, k in columns

p_miss

the proportion of observed Y_k among missing Y_j; j in rows, k in columns
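
As an illustration of how the two matrices might be read together, the sketch below lists the (imputed variable, predictor) pairs with a high p_obs or a low p_miss. The list res is a hand-built stand-in for the returned value, with made-up numbers, and the cut-offs max_p_obs and min_p_miss are hypothetical choices rather than recommendations from the package.

```r
# Hand-built stand-in for the list returned by prop_usable_cases();
# all numbers are illustrative only.
vars <- c("a", "b", "c")
res <- list(
  p_obs = matrix(c(0,   0.5, 0.9,
                   0.2, 0,   0.1,
                   0.3, 0.4, 0),
                 nrow = 3, byrow = TRUE, dimnames = list(vars, vars)),
  p_miss = matrix(c(0,   1,   0.1,
                    0.8, 0,   0.9,
                    0.7, 0.6, 0),
                  nrow = 3, byrow = TRUE, dimnames = list(vars, vars))
)

# Hypothetical cut-offs: low p_obs and high p_miss are preferred
max_p_obs  <- 0.8
min_p_miss <- 0.2

# Ignore the diagonal (a variable paired with itself)
off_diag <- row(res$p_obs) != col(res$p_obs)
weak <- off_diag & (res$p_obs > max_p_obs | res$p_miss < min_p_miss)

# Rows index the variable to be imputed, columns index the predictor
idx <- which(weak, arr.ind = TRUE)
flagged <- data.frame(
  imputed   = rownames(res$p_obs)[idx[, "row"]],
  predictor = colnames(res$p_obs)[idx[, "col"]],
  p_obs     = res$p_obs[idx],
  p_miss    = res$p_miss[idx]
)
flagged
```

With these made-up numbers, only the pair (imputed = a, predictor = c) is flagged, since c is mostly missing both where a is observed and where a is missing.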

References

  • Molenberghs, G., Fitzmaurice, G., Kenward, M. G., Tsiatis, A., & Verbeke, G. (Eds.). (2014). Handbook of missing data methodology. CRC Press. Chapter "Multiple Imputation"


missForestPredict documentation built on May 29, 2024, 7:26 a.m.