Z(log) Transformation for Laboratory Measurements

Introduction

The zlog package offers functions to transform laboratory measurements into standardised $z$ or $z(log)$-values as suggested in @hoffmann2017. Therefore the lower and upper reference limits are needed. If these are not known they could estimated from a given sample.

Z or Z(log) Transformation

@hoffmann2017 define $z$ as follows:

$$z = (x - (limits_1 + limits_2 )/2) * 3.92/(limits_2 - limits_1)$$

Consequently the $z(log)$ is defined as:

$$zlog = (\log(x) - (\log(limits_1) + \log(limits_2))/2) * 3.92/(\log(limits_2) - \log(limits_1))$$

Where $x$ is the measured laboratory value and $limits_1$ and $limits_2$ are the lower and upper reference limit, respectively.

Example data and reference limits are taken from @hoffmann2017, Table 2.

library("zlog")

albumin <- c(42, 34, 38, 43, 50, 42, 27, 31, 24)
z(albumin, limits = c(35, 52))
zlog(albumin, limits = c(35, 52))

Inverse Z or Z(log) Transformation/Undo Transformation

izlog(zlog(albumin, limits = c(35, 52)), limits = c(35, 52))

Z(log) Dependent Colour Gradient

@hoffmann2017 suggested a colour gradient to visualise laboratory measurements for the user.

z <- -10:10
oldpar <- par(mar = c(0, 0, 0, 0), oma = c(0, 0, 0, 0))
image(matrix(z, ncol = 1), col = zcol(z), axes = FALSE)
text(seq(0, 1, length.out=length(z)), 0, label = z)
par(oldpar)

It could be used to highlight the values in a table:

bilirubin <- c(11, 9, 2, 5, 22, 42, 37, 200, 20)
zloga <- zlog(albumin, limits = c(35, 52))
zlogb <- zlog(bilirubin, limits = c(2, 21))
d <- data.frame(
    Category = c(
        rep(c(
            "blood donor",
            "hepatitis without cirrhosis",
            "hepatitis with cirrhosis"
            ),
            each = 3
        )
    ),
    albumin = albumin,
    zloga = zloga,
    bilirubin = bilirubin,
    zlogb = zlogb
)
d$albumin <- kableExtra::cell_spec(
    d$albumin, background = zcol(zloga), align = "right"
)
d$bilirubin <- kableExtra::cell_spec(
    d$bilirubin, background = zcol(zlogb), align = "right"
)
kableExtra::kable_classic(
    kableExtra::kbl(
        d,
        col.names = c(
            "Category",
            "albumin", "zlog(albumin)", "bilirubin", "zlog(bilirubin)"
        ),
        digits = 2,
        escape = FALSE,
        caption = paste0(
            "Table reproduced from @hoffmann2017, Table 2, limits used: ",
            "albumin 35-52 g/l, bilirubin 2-21 µmol/l."
        )
    ),
    "basic"
)

Estimate Reference Limits

The reference_limits functions calculates the lower and upper 2.5 or 97.5 (or a user given probability) quantiles:

reference_limits(albumin)
reference_limits(albumin, probs = c(0.05, 0.95))

exp(reference_limits(log(albumin)))

Working with Reference Tables

Most laboratories use their own age- and sex-specific reference limits. The lookup_limits function could be used to find the correct reference limit.

# toy example
reference <- data.frame(
    param = c("albumin", rep("bilirubin", 4)),
    age = c(0, 1, 2, 3, 7),     # days
    sex = "both",
    units = c("g/l", rep("µmol/l", 4)), # ignored
    lower = c(35, rep(NA, 4)),  # no real reference values
    upper = c(52, 5, 8, 13, 18) # no real reference values
)
knitr::kable(reference)

# lookup albumin reference values for 18 year old woman
lookup_limits(
    age = 18 * 365.25,
    sex = "female",
    table = reference[reference$param == "albumin",]
)

# lookup albumin and bilirubin values for 18 year old woman
lookup_limits(
    age = 18 * 365.25,
    sex = "female",
    table = reference
)

# lookup bilirubin reference values for infants
lookup_limits(
    age = 0:8,
    sex = rep(c("female", "male"), 5:4),
    table = reference[reference$param == "bilirubin",]
)

Missing Reference Limits

Sometimes reference limits are not specified. That is often the case for biomarkers that are related to infection or cancer. Using zero as lower boundary results in skewed distributions [@hoffmann2017, fig. 7]. @haeckel2015 suggested to set the lower reference limit to 15 % of the upper one.

# use default fractions
set_missing_limits(reference)

# set fractions manually
set_missing_limits(reference, fraction = c(0.2, 5))

Impute Missing Laboratory Measurements

If laboratory measurements are missing they could be imputed using "normal" values from the reference table. Using the "logmean" (default) or "mean" reference value (default) will result in a $zlog$ or $z$-value of zero, respectively.

x <- data.frame(
    age = c(40, 50),
    sex = c("female", "male"),
    albumin = c(42, NA)
)
x
z_df(impute_df(x, reference, method = "mean"), reference)
zlog_df(impute_df(x, reference), reference)

PBC Example

For demonstration we choose the pbc dataset from the survival package and exclude all non-laboratory measurements except age and sex:

library("survival")
data("pbc")
labs <- c(
    "bili", "chol", "albumin", "copper", "alk.phos", "ast", "trig",
    "platelet", "protime"
)
pbc <- pbc[, c("age", "sex", labs)]
knitr::kable(head(pbc), digits = 1)

Next we estimate all reference limits from the data. We want to use sex-specific values for copper and aspartate aminotransferase ("ast").

## replicate copper and ast 2 times, use the others just once
param <- rep(labs, ifelse(labs %in% c("copper", "ast"), 2, 1))
sex <- rep_len("both", length(param))

## replace sex == both with female and male for copper and ast
sex[param %in% c("copper", "ast")] <- c("f", "m")

## create data.frame, we ignore age-specific values for now and set age to zero
## (means applicable for all ages)
reference <- data.frame(
    param = param, age = 0, sex = sex, lower = NA, upper = NA
)

## estimate reference limits from sample data
for (i in seq_len(nrow(reference))) {
    reference[i, c("lower", "upper")] <-
        if (reference$sex[i] == "both")
            reference_limits(pbc[reference$param[i]])
        else
            reference_limits(pbc[pbc$sex == reference$sex[i], reference$param[i]])
}
knitr::kable(reference)

The pbc dataset contains a few missing values. We impute the with the corresponding mean reference value (which is in this example just the sample mean but would be in real life the mean of a e.g. healthy subpopulation).

pbc[c(6, 14),]
pbc <- impute_df(pbc, reference)
pbc[c(6, 14),]

Subsequently we can convert the laboratory measurements into $z(log)$-values using the zlog_df function that applies the zlog for every numeric column in a data.frame (except the "age" column):

pbc <- zlog_df(pbc, reference)
pbctbl <- head(pbc, n = 25)
pbctbl[labs] <- lapply(labs, function(l) {
    kableExtra::cell_spec(
        sprintf("%.1f", unlist(pbctbl[l])),
        background = zcol(unlist(pbctbl[l])),
        align = "right"
    )
})

kableExtra::kable_classic(
    kableExtra::kbl(pbctbl, digits = 1, escape = FALSE),
    "basic"
)

Session information

sessionInfo()

References



Try the zlog package in your browser

Any scripts or data that you put into this service are public.

zlog documentation built on Jan. 6, 2023, 1:25 a.m.