impute_df: Imputation

impute_df R Documentation

Imputation

Description

Impute NA values with the logmean or mean of the reference limits, or with the minimum or maximum reference value.

Usage

impute_df(x, limits, method = c("logmean", "mean", "min", "max"))

Arguments

x

data.frame, with the columns "age" (numeric), "sex" (factor), and further user-defined numeric columns that should be imputed.

limits

data.frame, reference table. It has to have the columns: "age", numeric (same units as age in x, e.g. days or years; an age of 0 matches all ages); "sex", factor (same levels for male and female as sex in x, plus a special level "both"); "param", character with the laboratory parameter names, which have to match the column names in x; "lower" and "upper", numeric for the lower and upper reference limits.
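
To make the matching rules more concrete, here is a minimal base R sketch of how a row of the reference table could be selected for one patient. The helper find_limits is hypothetical and not part of the package, the limits in ref are made-up values for illustration, and the sketch assumes that "age" in the reference table is the lower bound of an age range, which the description above suggests but does not state outright.

```r
# Hypothetical helper, not part of the zlog package.
# Assumption: limits$age is the lower bound of an age range, so the row with
# the largest age that does not exceed the patient's age is used.
find_limits <- function(age, sex, param, limits) {
    ok <- limits$param == param &
        limits$sex %in% c(sex, "both") &
        limits$age <= age
    candidates <- limits[ok, , drop = FALSE]
    if (!nrow(candidates))
        return(c(lower = NA_real_, upper = NA_real_))
    best <- candidates[which.max(candidates$age), ]
    c(lower = best$lower, upper = best$upper)
}

# Made-up reference table for illustration only.
ref <- data.frame(
    param = c("alb", "alb", "bili"),
    age = c(0, 18, 0),
    sex = c("both", "both", "both"),
    lower = c(30, 35, 2),
    upper = c(50, 52, 21)
)

# A 40 year old matches the "alb" row that starts at age 18.
find_limits(age = 40, sex = "female", param = "alb", limits = ref)
```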

method

character, imputation method. method = "logmean" (default) replaces each NA with the corresponding logged mean of the reference table limits (intended for subsequent use of the zlog score); use method = "mean" for subsequent z score calculation. For method = "min" or method = "max" the lower or the upper limits are used, respectively.
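
As a rough illustration of the four methods, the following base R sketch computes the replacement value for a single pair of reference limits. The helper impute_value is hypothetical, and the sketch assumes that "logmean" is the mean of the logged limits transformed back to the original scale; the package source remains the authority on the exact computation.

```r
# Hypothetical helper, not part of the zlog package.
# Assumption: "logmean" = exp(mean(log(limits))), i.e. the value whose
# zlog score would be zero.
impute_value <- function(lower, upper,
                         method = c("logmean", "mean", "min", "max")) {
    method <- match.arg(method)
    switch(method,
        logmean = exp((log(lower) + log(upper)) / 2),
        mean = (lower + upper) / 2,
        min = lower,
        max = upper
    )
}

# Limits for "alb" taken from the Examples section: lower = 35, upper = 52.
impute_value(35, 52)                  # logmean (default)
impute_value(35, 52, method = "mean") # 43.5
impute_value(35, 52, method = "min")  # 35
impute_value(35, 52, method = "max")  # 52
```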

Value

data.frame, the same as x but with missing values replaced by the corresponding logmean, mean, minimum or maximum reference values, depending on the chosen method.

Note

Imputation should be done prior to the z()/zlog() transformation. Afterwards, any NA could be replaced by zero (for mean-imputation) via d[is.na(d)] <- 0.
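
The relationship between mean-imputation and replacing NA by zero can be checked with a small base R sketch. It assumes a z score of the form (x - mean of the limits) scaled by 3.92 / (upper - lower), following Hoffmann et al. 2017; z_sketch is a hypothetical stand-in, not the package's z().

```r
# Hypothetical stand-in for a z score; the scaling by 3.92 is an assumption
# following Hoffmann et al. 2017 and does not affect the comparison below.
z_sketch <- function(x, lower, upper) {
    (x - (lower + upper) / 2) * 3.92 / (upper - lower)
}

alb <- c(42, NA, 38, NA, 50)

# Route 1: impute the mean of the limits first, then transform.
imputed <- alb
imputed[is.na(imputed)] <- (35 + 52) / 2
z1 <- z_sketch(imputed, 35, 52)

# Route 2: transform first, then replace the remaining NA by zero.
z2 <- z_sketch(alb, 35, 52)
z2[is.na(z2)] <- 0

all.equal(z1, z2) # TRUE
```

Both routes give a score of zero at the imputed positions, which is why the order matters only when a method other than the mean is wanted.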

Author(s)

Sebastian Gibb

Examples

library("zlog")

l <- data.frame(
    param = c("alb", "bili"),
    age = c(0, 0),
    sex = c("both", "both"),
    units = c("mg/l", "µmol/l"),
    lower = c(35, 2),
    upper = c(52, 21)
)
x <- data.frame(
    age = 40:48,
    sex = rep(c("female", "male"), c(5, 4)),
    # from Hoffmann et al. 2017
    alb = c(42, NA, 38, NA, 50, 42, 27, 31, 24),
    bili = c(11, 9, NA, NA, 22, 42, NA, 200, 20)
)
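# impute NA with the default method ("logmean")
impute_df(x, l)
# impute NA with the lower reference limit
impute_df(x, l, method = "min")
# impute first, then calculate the zlog scores
zlog_df(impute_df(x, l), l)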

zlog documentation built on Jan. 6, 2023, 1:25 a.m.