| predict_guard | R Documentation |
Applies the preprocessing steps stored in a GuardFit object to new
data without refitting any statistics. Reusing the training-time parameters
prevents the validation leakage that would occur if imputation, scaling,
filtering, or feature selection were recomputed on evaluation data. The
function enforces the training schema by aligning columns and factor levels,
and it raises an error when a numeric-only training fit receives non-numeric
predictors. It does not detect label leakage, duplicate samples, or
train/test contamination.
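The idea of applying stored statistics rather than refitting them can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helpers `fit_stats()` and `apply_stats()` are hypothetical names introduced here, and the choice of median imputation followed by centering and scaling is only an example pipeline.

```r
# Hypothetical helpers for illustration only; not part of the package API.
fit_stats <- function(x) {
  # Learn per-column statistics from the training data only.
  med <- vapply(x, median, numeric(1), na.rm = TRUE)
  imputed <- Map(function(col, m) ifelse(is.na(col), m, col), x, med)
  list(
    median = med,
    center = vapply(imputed, mean, numeric(1)),
    scale  = vapply(imputed, sd, numeric(1))
  )
}

apply_stats <- function(stats, newdata) {
  # Reuse the stored training statistics; nothing is recomputed from newdata.
  out <- newdata
  for (nm in names(stats$median)) {
    col <- out[[nm]]
    col[is.na(col)] <- stats$median[[nm]]
    out[[nm]] <- (col - stats$center[[nm]]) / stats$scale[[nm]]
  }
  out
}

x_train <- data.frame(a = c(1, 2, NA, 4), b = c(10, 11, 12, 13))
x_new   <- data.frame(a = c(NA, 5), b = c(9, 14))

stats <- fit_stats(x_train)
apply_stats(stats, x_new)
```

In this sketch the missing value in `x_new$a` is filled with the training median of `a`, not with a median computed from `x_new`, so no statistic is derived from the evaluation rows.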
predict_guard(fit, newdata)
fit |
A |
newdata |
A matrix or data.frame of predictors with one row per sample.
This required argument (no default) is transformed using the training-time
parameters in |
A data.frame of transformed predictors with the same number of rows
as newdata. Column order and content match the training pipeline and
may include derived features (one-hot encodings, missingness indicators, or
PCA components). This output is not a prediction; it is intended as input to
a downstream model and assumes the training-time preprocessing is valid for
the new data.
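The schema enforcement and column ordering described above might look roughly like the following base R sketch. The helper `align_schema()` and its arguments are hypothetical and exist only for illustration; in particular, turning unseen factor levels into `NA` is an assumption of this sketch, not documented behavior of `predict_guard`.

```r
# Hypothetical helper for illustration only; not the package's implementation.
align_schema <- function(newdata, train_cols, train_levels = list(),
                         numeric_only = FALSE) {
  newdata <- as.data.frame(newdata)
  missing_cols <- setdiff(train_cols, names(newdata))
  if (length(missing_cols) > 0) {
    stop("Missing training columns: ", paste(missing_cols, collapse = ", "))
  }
  # Drop extra columns and restore the training column order.
  out <- newdata[, train_cols, drop = FALSE]
  # A numeric-only training fit rejects non-numeric predictors.
  if (numeric_only && !all(vapply(out, is.numeric, logical(1)))) {
    stop("Non-numeric predictors supplied to a numeric-only fit")
  }
  # Re-encode factors with the training levels (assumption: unseen -> NA).
  for (nm in names(train_levels)) {
    out[[nm]] <- factor(as.character(out[[nm]]), levels = train_levels[[nm]])
  }
  out
}

x_new <- data.frame(
  extra = 1:3,
  group = c("b", "a", "c"),
  a = c(0.5, 1.5, 2.5)
)
aligned <- align_schema(
  x_new,
  train_cols = c("a", "group"),
  train_levels = list(group = c("a", "b"))
)
str(aligned)
```

The result keeps one row per input sample, so the row count matches `newdata` while the columns follow the training order.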
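# Training predictors with one missing value in column `a`
x_train <- data.frame(a = c(1, 2, NA, 4), b = c(10, 11, 12, 13))

# Fit the preprocessing steps on the training data
fit <- .guard_fit(
  x_train,
  y = c(0.1, 0.2, 0.3, 0.4),
  steps = list(impute = list(method = "median")),
  task = "gaussian"
)

# Apply the stored training-time parameters to new data
x_new <- data.frame(a = c(NA, 5), b = c(9, 14))
out <- predict_guard(fit, x_new)
out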