amr-tidymodels | R Documentation |
This family of functions allows using AMR-specific data types such as <mic> and <sir> inside tidymodels pipelines.
all_mic()
all_mic_predictors()
all_sir()
all_sir_predictors()
step_mic_log2(recipe, ..., role = NA, trained = FALSE, columns = NULL,
  skip = FALSE, id = recipes::rand_id("mic_log2"))
step_sir_numeric(recipe, ..., role = NA, trained = FALSE, columns = NULL,
  skip = FALSE, id = recipes::rand_id("sir_numeric"))
recipe |
A recipe object. The step will be added to the sequence of operations for this recipe. |
... |
One or more selector functions to choose variables for this step.
See |
role |
Not used by this step since no new variables are created. |
trained |
A logical to indicate if the quantities for preprocessing have been estimated. |
skip |
A logical. Should the step be skipped when the recipe is baked by
|
id |
A character string that is unique to this step to identify it. |
You can read more in our online AMR with tidymodels introduction.
Tidyselect helpers include:
all_mic() and all_mic_predictors() to select <mic> columns
all_sir() and all_sir_predictors() to select <sir> columns
Pre-processing pipeline steps include:
step_mic_log2() to convert MIC columns to numeric (via as.numeric()) and apply a log2 transform, to be used with all_mic_predictors()
step_sir_numeric() to convert SIR columns to numeric (via as.numeric()), to be used with all_sir_predictors(): "S" = 1, "I"/"SDD" = 2, "R" = 3. All other values are rendered NA. Keep this in mind for further processing, especially if the model does not allow for NA values.
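The two conversions described above can be illustrated with a minimal base-R sketch. It is not the AMR package's implementation: the vectors mic_values and sir_values and the lookup table sir_codes are hypothetical names used only for illustration, and plain numbers and strings stand in for <mic> and <sir> columns.

```r
# Base-R illustration only; not how the AMR package implements these steps.

# MIC values as plain numbers. After a log2 transform, each two-fold
# dilution step corresponds to a difference of exactly 1.
mic_values <- c(0.5, 2, 8)
mic_log2 <- log2(mic_values)

# SIR coding as described above: S = 1, I/SDD = 2, R = 3.
# Any value without an entry in the lookup table becomes NA.
sir_codes <- c(S = 1, I = 2, SDD = 2, R = 3)
sir_values <- c("S", "I", "SDD", "R", "X") # "X" is a made-up invalid value
sir_numeric <- unname(sir_codes[sir_values])

print(mic_log2)
print(sir_numeric)

# Flag rows without NA; a model that cannot handle NA needs the
# remaining rows dropped or imputed first.
complete <- !is.na(sir_numeric)
print(complete)
```

The last value shows why the NA warning matters: anything outside the recognised SIR classes is lost in the numeric coding and has to be handled before model fitting.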
These steps integrate with recipes::recipe() and work like standard preprocessing steps. They are useful for preparing data for modelling, especially with classification models.
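As a complement to the MIC example under Examples, the following is a minimal sketch of step_sir_numeric() in a recipe. The data frame toy and its columns (outcome, AMX, CIP) are hypothetical and exist only for illustration; the sketch assumes an AMR version that provides these steps and that the recipes package is installed.

```r
library(AMR)
library(recipes)

# Hypothetical toy data: a binary outcome and two SIR columns
toy <- data.frame(
  outcome = factor(c("no", "yes", "yes", "no")),
  AMX = as.sir(c("S", "I", "R", "S")),
  CIP = as.sir(c("R", "S", "S", "I"))
)

# Convert all SIR predictors to their numeric coding, then estimate the recipe
sir_recipe <- recipe(outcome ~ ., data = toy) %>%
  step_sir_numeric(all_sir_predictors()) %>%
  prep()

# Retrieve the processed training data and inspect the column types
baked <- bake(sir_recipe, new_data = NULL)
str(baked)
```

Using all_sir_predictors() rather than naming columns keeps the recipe unchanged when antibiotic columns are added or removed.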
recipes::recipe(), as.mic(), as.sir()
library(AMR)
library(tidymodels)
# The below approach formed the basis for this paper: DOI 10.3389/fmicb.2025.1582703
# Presence of ESBL genes was predicted based on raw MIC values.
# example data set in the AMR package
esbl_isolates
# Prepare a binary outcome and convert to ordered factor
data <- esbl_isolates %>%
  mutate(esbl = factor(esbl, levels = c(FALSE, TRUE), ordered = TRUE))
# Split into training and testing sets
split <- initial_split(data)
training_data <- training(split)
testing_data <- testing(split)
# Create and prep a recipe with MIC log2 transformation
mic_recipe <- recipe(esbl ~ ., data = training_data) %>%
  # Optionally remove non-predictive variables
  remove_role(genus, old_role = "predictor") %>%
  # Apply the log2 transformation to all MIC predictors
  step_mic_log2(all_mic_predictors()) %>%
  prep()
# View prepped recipe
mic_recipe
# Apply the recipe to training and testing data
out_training <- bake(mic_recipe, new_data = NULL)
out_testing <- bake(mic_recipe, new_data = testing_data)
# Fit a logistic regression model
fitted <- logistic_reg(mode = "classification") %>%
  set_engine("glm") %>%
  fit(esbl ~ ., data = out_training)
# Generate predictions on the test set
predictions <- predict(fitted, out_testing) %>%
  bind_cols(out_testing)
# Evaluate predictions using standard classification metrics
our_metrics <- metric_set(accuracy, kap, ppv, npv)
metrics <- our_metrics(predictions, truth = esbl, estimate = .pred_class)
# Show performance:
# - negative predictive value (NPV) of ~98%
# - positive predictive value (PPV) of ~94%
metrics