ferment: Ferment a brew
In bcjaeger/ipa: Imputation for Predictive Analytics


Missing values can occur in training data and testing data.

Unfortunately, some imputation strategies are designed only to impute missing training data. For example, softImpute imputes a missing value based on its index in the training data. This does not generalize to testing data because testing data, by definition, have no indices in the training data.

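ferment generally adheres to the principle of using only training data to impute missing testing data. The one exception is flavor = 'softImpute', where this is not possible: in that case the soft models are re-fitted after stacking data_new with the training data.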
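The principle above can be illustrated with a minimal base R sketch. It uses simple mean imputation, which is an assumption chosen for brevity and is not how ipa imputes; the helper `impute_with` and the toy data frames are hypothetical.

```r
# Hypothetical illustration in base R; this is NOT the ipa implementation.
train <- data.frame(x1 = c(1, 2, NA, 4), x2 = c(10, NA, 30, 40))
test  <- data.frame(x1 = c(NA, 5),       x2 = c(20, NA))

# Learn the imputation values from the training data only.
train_means <- vapply(train, mean, numeric(1), na.rm = TRUE)

# Fill missing entries of `data` with the supplied per-column values.
impute_with <- function(data, values) {
  for (col in names(values)) {
    miss <- is.na(data[[col]])
    data[[col]][miss] <- values[[col]]
  }
  data
}

# Both data sets are filled with values learned from `train` alone,
# so no information from `test` leaks into the imputation.
train_imputed <- impute_with(train, train_means)
test_imputed  <- impute_with(test, train_means)
```

The testing data never contribute to `train_means`, which is the property that an index-based method such as softImpute cannot offer for new rows.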

ferment automatically copies the data-processing and imputation arguments used in previous brewing steps. Specifically, if brew was called with bind_miss = TRUE, then the missing value indicator matrix for data_new will be bound to data_new and used in the imputation procedure. Likewise, imputation parameters specified in the spice and mash steps are applied automatically in the ferment step.
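To make the idea of a bound missing value indicator matrix concrete, here is a small base R sketch. It only shows the general technique; the `_missing` column suffix and the 0/1 coding are assumptions made for illustration and may differ from what ipa produces internally.

```r
# Hypothetical illustration in base R; column naming is an assumption.
data_new <- data.frame(x1 = c(0.5, NA), x2 = c(NA, 2/3))

# Indicator matrix: 1 where a value is missing, 0 otherwise.
miss_ind <- is.na(data_new) * 1L
colnames(miss_ind) <- paste0(colnames(miss_ind), "_missing")

# Bind the indicators to the original columns so an imputation
# model can use the missingness pattern as extra predictors.
data_bound <- cbind(data_new, miss_ind)
```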

ferment(brew, data_new = NULL, timer = FALSE)

`brew`	an `ipa_brew` object.
`data_new`	a data frame with missing values to be imputed (default `NULL`).
`timer`	a logical value. If `TRUE`, then the amount of time it takes to fit the imputation models will be tracked and saved as an attribute of the resulting `ipa_brew` object.
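As a rough sketch of what tracking time and saving it as an attribute can look like in base R: the wrapper `fit_with_timer` and the attribute name `"elapsed"` below are hypothetical and are not taken from ipa, which may store the timing differently.

```r
# Hypothetical illustration in base R; not the ipa implementation.
fit_with_timer <- function(fit_fun, timer = FALSE) {
  start  <- Sys.time()
  result <- fit_fun()
  if (timer) {
    # Store the elapsed fitting time on the returned object.
    attr(result, "elapsed") <- difftime(Sys.time(), start, units = "secs")
  }
  result
}

# A placeholder "fit" stands in for the imputation models.
out <- fit_with_timer(function() list(model = "placeholder"), timer = TRUE)
attr(out, "elapsed")
```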

What is a wort? A component of a brew object that contains imputed datasets, models used to impute those datasets, and the corresponding hyper-parameters of those models.

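library(ipa)

# fix the seed so the simulated data are reproducible
set.seed(329)

# simulate three correlated predictors
x1 <- rnorm(100)
x2 <- rnorm(100) + x1
x3 <- rnorm(100) + x1 + x2

outcome <- 0.5 * (x1 - x2 + x3)

# set the first n_miss values of x1 to missing
n_miss <- 10
x1[1:n_miss] <- NA

data <- data.frame(x1 = x1, x2 = x2, x3 = x3, outcome = outcome)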

sft_brew <- brew_soft(data, outcome=outcome, bind_miss = FALSE)
sft_brew <- mash(sft_brew, with = masher_soft(bs = TRUE))
sft_brew <- stir(sft_brew, timer = TRUE)

ferment(sft_brew)

data_new = data.frame(
  x1      = c(1/2, NA_real_),
  x2      = c(NA_real_, 2/3),
  x3      = c(5/2, 2/3),
  outcome = c(1/3, 2/3)
)

# soft models are re-fitted after stacking data_new with data_ref

ferment(sft_brew, data_new = data_new)