confounder_sensitivity: Confounder sensitivity summaries

View source: R/diagnostics_extra.R

confounder_sensitivity    R Documentation

Confounder sensitivity summaries

Description

Computes performance metrics within the strata of each confounder, so that differences between strata can surface potential confounding. Requires sample metadata in 'coldata' that is aligned with the samples in the fit.
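To make the idea of a stratified metric concrete, here is a minimal base R sketch. It is an illustration under stated assumptions, not bioLeak's implementation: the data frame `preds`, its columns (`truth`, `score`, `batch`), and the helper `auc_rank()` are all hypothetical.

```r
# Hypothetical held-out predictions with one metadata column (batch).
preds <- data.frame(
  truth = c(0, 1, 0, 1, 0, 1, 0, 1),
  score = c(0.2, 0.9, 0.4, 0.6, 0.7, 0.3, 0.1, 0.8),
  batch = c("A", "A", "A", "A", "B", "B", "B", "B")
)

# Rank-based (Mann-Whitney) AUC: the probability that a randomly chosen
# positive scores higher than a randomly chosen negative.
auc_rank <- function(truth, score) {
  pos <- truth == 1
  n_pos <- sum(pos)
  n_neg <- sum(!pos)
  if (n_pos == 0 || n_neg == 0) return(NA_real_)  # undefined with one class
  r <- rank(score)
  (sum(r[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Compute the metric separately within each stratum of the confounder.
by_batch <- sapply(split(preds, preds$batch),
                   function(d) auc_rank(d$truth, d$score))
overall <- auc_rank(preds$truth, preds$score)

by_batch
overall
```

A noticeable gap between strata, or between the strata and the pooled value, is the kind of signal a summary like this is meant to expose. It does not prove confounding on its own.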

Usage

confounder_sensitivity(
  fit,
  confounders = NULL,
  metric = NULL,
  min_n = 10,
  coldata = NULL,
  numeric_bins = 4,
  learner = NULL
)

Arguments

fit

A LeakFit object from fit_resample().

confounders

Character vector of columns in 'coldata' to evaluate. Defaults to common batch/study identifiers when available.

metric

Metric name to compute within each stratum. Defaults to the first metric used in the fit (or task defaults if unavailable).

min_n

Minimum number of samples per stratum; strata with fewer samples return NA for the metric.
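The effect of a minimum-size threshold can be sketched in base R as follows. The vectors `correct` and `site` are hypothetical, accuracy stands in for whatever metric is in use, and the masking rule shown is an assumption about the general technique rather than the package's exact code.

```r
# Hypothetical per-sample correctness flags and a confounder column.
correct <- c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE)
site <- c("s1", "s1", "s1", "s1", "s2", "s2", "s3")
min_n <- 3

n_per_site <- table(site)                  # samples per stratum
acc <- tapply(correct, site, mean)         # accuracy per stratum
acc[n_per_site[names(acc)] < min_n] <- NA  # mask strata below the threshold

res <- data.frame(level = names(acc),
                  n = as.integer(n_per_site[names(acc)]),
                  accuracy = as.numeric(acc))
res
```

Keeping the count next to the masked metric lets a reader see why a stratum has no value instead of mistaking NA for a computation failure.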

coldata

Optional data.frame of sample metadata. Defaults to 'fit@splits@info$coldata' when available.
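When metadata is stored separately from the modelling data, its rows may need reordering before they line up. The sketch below shows one common base R way to do that with `match()`. It assumes alignment means matching row order and that both tables share a hypothetical `sample_id` column; neither assumption is stated in this page.

```r
# Hypothetical modelling data and metadata keyed by a sample ID column.
dat <- data.frame(sample_id = c("s1", "s2", "s3"), x = c(0.1, 0.5, 0.9))
meta <- data.frame(sample_id = c("s3", "s1", "s2"), batch = c("B", "A", "A"))

# Reorder metadata rows to follow the row order of the modelling data.
idx <- match(dat$sample_id, meta$sample_id)
stopifnot(!anyNA(idx))  # every sample must have a metadata row
meta_aligned <- meta[idx, , drop = FALSE]
rownames(meta_aligned) <- NULL

meta_aligned
```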

numeric_bins

Integer number of quantile bins for numeric confounders with many unique values.
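Quantile binning of a numeric variable can be sketched with `quantile()` and `cut()` from base R. The variable `age` is hypothetical, and the exact binning rule used by the package (quantile type, handling of ties, the threshold for "many unique values") is not documented here, so treat this only as an illustration of the general technique.

```r
set.seed(1)
age <- runif(40, min = 20, max = 80)  # hypothetical numeric confounder
numeric_bins <- 4

# Bin edges at equally spaced quantiles, so bins hold roughly equal counts.
edges <- quantile(age, probs = seq(0, 1, length.out = numeric_bins + 1))
age_bin <- cut(age, breaks = edges, include.lowest = TRUE)

table(age_bin)
```

With heavily tied values, some quantile edges can coincide, in which case `cut()` stops with an error about non-unique breaks. Deduplicating the edges with `unique()` is one way around that, at the cost of fewer bins.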

learner

Optional character scalar. When predictions include multiple learners, selects the learner to summarize.

Value

A data.frame with per-confounder, per-level metrics and counts.

Examples

# Simulated data: 15 subjects with two samples each and a three-level batch factor
set.seed(42)
df <- data.frame(
  subject = rep(1:15, each = 2),
  outcome = factor(rep(c(0, 1), 15)),
  batch = factor(rep(c("A", "B", "C"), 10)),
  x1 = rnorm(30),
  x2 = rnorm(30)
)
splits <- make_split_plan(df, outcome = "outcome",
                          mode = "subject_grouped", group = "subject",
                          v = 3, progress = FALSE)
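# Custom logistic regression learner: a fit function plus a predict function
# that returns predicted probabilities
custom <- list(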
  glm = list(
    fit = function(x, y, task, weights, ...) {
      stats::glm(y ~ ., data = as.data.frame(x),
                 family = stats::binomial(), weights = weights)
    },
    predict = function(object, newdata, task, ...) {
      as.numeric(stats::predict(object, newdata = as.data.frame(newdata),
                                type = "response"))
    }
  )
)
fit <- fit_resample(df, outcome = "outcome", splits = splits,
                    learner = "glm", custom_learners = custom,
                    metrics = "auc", refit = FALSE, seed = 1)
# Summarize the metric within each level of batch
confounder_sensitivity(fit, confounders = "batch", coldata = df)


bioLeak documentation built on March 6, 2026, 1:06 a.m.