amr-tidymodels: AMR Extensions for Tidymodels
In msberends/AMR: Antimicrobial Resistance Data Analysis

Description

These functions allow AMR-specific data types, such as <mic> and <sir>, to be used inside tidymodels pipelines.

Usage

all_mic()

all_mic_predictors()

all_sir()

all_sir_predictors()

step_mic_log2(recipe, ..., role = NA, trained = FALSE, columns = NULL,
  skip = FALSE, id = recipes::rand_id("mic_log2"))

step_sir_numeric(recipe, ..., role = NA, trained = FALSE, columns = NULL,
  skip = FALSE, id = recipes::rand_id("sir_numeric"))

Arguments

`recipe`	A recipe object. The step will be added to the sequence of operations for this recipe.
`...`	One or more selector functions to choose variables for this step. See `selections()` for more details.
`role`	Not used by this step since no new variables are created.
`trained`	A logical to indicate if the quantities for preprocessing have been estimated.
`skip`	A logical. Should the step be skipped when the recipe is baked by `bake()`? While all operations are baked when `prep()` is run, some operations may not be able to be conducted on new data (e.g. processing the outcome variable(s)). Care should be taken when using `skip = TRUE` as it may affect the computations for subsequent operations.
`id`	A character string that is unique to this step to identify it.

Details

You can read more in our online AMR with tidymodels introduction.

Tidyselect helpers include:

all_mic() and all_mic_predictors() to select <mic> columns
all_sir() and all_sir_predictors() to select <sir> columns
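
As a rough way to preview what these helpers would pick up, the sketch below lists the columns of class <mic> and <sir> in two example data sets shipped with the AMR package. It assumes the helpers match columns by class, as the descriptions above suggest; it uses `is.mic()` and `is.sir()` directly and is not the helpers' own implementation. The variable names `mic_columns` and `sir_columns` are only illustrative.

```r
library(AMR)

# Columns of class <mic> in the esbl_isolates example data set
mic_columns <- names(esbl_isolates)[vapply(esbl_isolates, is.mic, logical(1))]
mic_columns

# Columns of class <sir> in the example_isolates example data set
sir_columns <- names(example_isolates)[vapply(example_isolates, is.sir, logical(1))]
sir_columns
```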

Pre-processing pipeline steps include:

step_mic_log2() to convert MIC columns to numeric (via as.numeric()) and apply a log2 transform, to be used with all_mic_predictors()
step_sir_numeric() to convert SIR columns to numeric (via as.numeric()), to be used with all_sir_predictors(). The mapping is "S" = 1, "I"/"SDD" = 2, "R" = 3; all other values become NA. Keep this in mind for further processing, especially if the model does not accept NA values.
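
To make the two conversions concrete, the minimal sketch below applies the same vector-level operations outside of a recipe: `as.numeric()` followed by `log2()` for MIC values, and `as.numeric()` for SIR values. The input values are made up for illustration, and this only approximates what the steps do per column; it is not their actual implementation.

```r
library(AMR)

# Hypothetical MIC values (mg/L) on a two-fold dilution scale
mic_values <- as.mic(c("0.5", "2", "8"))

# Convert to numeric, then take the base-2 logarithm, so that each
# doubling of the MIC becomes a step of 1
mic_log2 <- log2(as.numeric(mic_values))
mic_log2

# Hypothetical SIR interpretations
sir_values <- as.sir(c("S", "I", "R"))

# Documented mapping: "S" = 1, "I"/"SDD" = 2, "R" = 3
sir_numeric <- as.numeric(sir_values)
sir_numeric
```

Because MIC values are usually measured in two-fold dilutions, the log2 scale spaces them evenly, which tends to suit linear models better than the raw concentrations.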

These steps integrate with recipes::recipe() and work like standard preprocessing steps. They are useful for preparing data for modelling, especially with classification models.
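
The Examples section only demonstrates step_mic_log2(), so here is a comparable sketch for step_sir_numeric(). The data frame `toy`, its column names, and its values are hypothetical and exist only for illustration; the sketch assumes that step_sir_numeric() behaves like the MIC step when combined with all_sir_predictors().

```r
library(AMR)
library(recipes)

# Hypothetical toy data: a binary outcome plus two SIR predictors
toy <- data.frame(
  outcome = factor(c("no", "yes", "no", "yes", "no", "yes")),
  AMX = as.sir(c("S", "R", "S", "R", "I", "R")),
  CIP = as.sir(c("S", "S", "I", "R", "S", "R"))
)

# Add the SIR-to-numeric step for all SIR predictors and estimate the recipe
sir_recipe <- recipe(outcome ~ ., data = toy) %>%
  step_sir_numeric(all_sir_predictors()) %>%
  prep()

# Retrieve the processed training data; the SIR columns should now be numeric
baked <- bake(sir_recipe, new_data = NULL)
str(baked)
```

If the data contain values other than S, I, SDD or R, the resulting NA values may need to be handled (for example with an imputation step) before fitting a model that does not accept them.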

Examples

library(AMR) # provides esbl_isolates and the steps documented here
library(tidymodels)

# The approach below formed the basis for this paper: DOI 10.3389/fmicb.2025.1582703
# Presence of ESBL genes was predicted based on raw MIC values.

# Example data set in the AMR package
esbl_isolates

# Prepare a binary outcome and convert it to an ordered factor
data <- esbl_isolates %>%
  mutate(esbl = factor(esbl, levels = c(FALSE, TRUE), ordered = TRUE))

# Split into training and testing sets
split <- initial_split(data)
training_data <- training(split)
testing_data <- testing(split)

# Create and prep a recipe with MIC log2 transformation
mic_recipe <- recipe(esbl ~ ., data = training_data) %>%
  # Optionally remove non-predictive variables
  remove_role(genus, old_role = "predictor") %>%
  # Apply the log2 transformation to all MIC predictors
  step_mic_log2(all_mic_predictors()) %>%
  prep()

# View prepped recipe
mic_recipe

# Apply the recipe to training and testing data
out_training <- bake(mic_recipe, new_data = NULL)
out_testing <- bake(mic_recipe, new_data = testing_data)

# Fit a logistic regression model
fitted <- logistic_reg(mode = "classification") %>%
  set_engine("glm") %>%
  fit(esbl ~ ., data = out_training)

# Generate predictions on the test set
predictions <- predict(fitted, out_testing) %>%
  bind_cols(out_testing)

# Evaluate predictions using standard classification metrics
our_metrics <- metric_set(accuracy, kap, ppv, npv)
metrics <- our_metrics(predictions, truth = esbl, estimate = .pred_class)

# Show performance:
# - negative predictive value (NPV) of ~98%
# - positive predictive value (PPV) of ~94%
metrics

msberends/AMR documentation built on June 14, 2025, 7:58 a.m.