NIMIWAE: Process results: return imputation metrics

View source: R/NIMIWAE.R

NIMIWAE    R Documentation

Process results: return imputation metrics

Description

Process results: return imputation metrics

Usage

NIMIWAE(
  data,
  dataset,
  data_types,
  Missing,
  g = NULL,
  rdeponz = F,
  learn_r = T,
  phi0 = NULL,
  phi = NULL,
  ignorable = F,
  covars_r = rep(1, ncol(data)),
  arch = "IWAE",
  draw_xmiss = T,
  hyperparameters = list(sigma = "elu", h = c(64L), n_hidden_layers = c(1L, 2L),
    n_hidden_layers_r0 = c(0L, 1L), bs = c(1000L), lr = c(0.001, 0.01), dim_z =
    as.integer(c(floor(ncol(data)/2), floor(ncol(data)/4))), niw = 5L, n_imputations =
    5L, n_epochs = 2002L),
  save_imps = F,
  dir_name = ".",
  normalize = T
)

Arguments

data

Data matrix (N x P)

data_types

Vector of data types ("real", "count", "pos", "cat")
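As a rough illustration of what a 'data_types' vector might look like, the sketch below guesses one label per column of a toy data frame. The helper `guess_data_type` is hypothetical (it is not part of NIMIWAE), and the reading of the labels ("real" for real-valued, "count" for non-negative integers, "pos" for strictly positive values, "cat" for categorical) is an assumption based only on the label names. The guesses should be reviewed by hand before use.

```r
# Hypothetical helper (not part of NIMIWAE): guess a type label for one column.
# The meaning assigned to each label is an assumption based on the label names.
guess_data_type <- function(x) {
  x <- x[!is.na(x)]
  if (is.factor(x) || is.character(x)) return("cat")
  if (all(x == round(x)) && all(x >= 0)) return("count")
  if (all(x > 0)) return("pos")
  "real"
}

# Toy data, for illustration only.
toy <- data.frame(
  height = c(-1.2, 0.5, 3.1),
  visits = c(0, 2, 5),
  dose   = c(0.5, 1.25, 2.0),
  group  = c("a", "b", "a"),
  stringsAsFactors = FALSE
)

# One label per column, in column order.
data_types <- vapply(toy, guess_data_type, character(1))
print(data_types)
```

A heuristic like this cannot tell a count from a categorical variable coded as integers, so the result is only a starting point.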

Missing

Missingness mask matrix (N x P)
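If the raw data encode missing entries as NA, a mask with the same N x P shape can be derived directly from them. The sketch below assumes that 1 marks an observed entry and 0 a missing one; this page does not state the convention, so the coding should be checked against the package source (and flipped if needed).

```r
# Toy data matrix with NA marking missing entries (illustrative values).
data <- matrix(c(1.5, NA, 3.0,
                 4.2, 5.1, NA), nrow = 3, ncol = 2)

# Assumed convention: 1 = observed, 0 = missing.
Missing <- matrix(as.integer(!is.na(data)),
                  nrow = nrow(data), ncol = ncol(data))

print(Missing)
```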

g

Training-validation-test split partitioning

rdeponz

TRUE/FALSE: Whether to allow missingness (r) to depend on the latent variable (z). Default is FALSE

learn_r

TRUE/FALSE: Whether to learn the missingness model via an appended NN (TRUE, default) or to fit a known logistic regression model (FALSE). If FALSE, 'phi0' and 'phi' must be specified.

phi0

(optional) Intercept of logistic regression model, if learn_r = FALSE.

phi

(optional) Vector of coefficients of the logistic regression model, one per input covariate selected by 'covars_r', if learn_r = FALSE. 'phi' must have the same length as the number of input covariates, i.e. 'sum(covars_r)'.
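To make the length requirement concrete, the sketch below evaluates a logistic regression of the form logit(p) = phi0 + x' phi on the covariates selected by 'covars_r'. All values are illustrative. Whether p is the probability of an entry being missing or observed, and how NIMIWAE parameterizes the model internally, are not stated on this page and are left open here.

```r
# Toy inputs (all values are illustrative assumptions).
set.seed(1)
data     <- matrix(rnorm(5 * 3), nrow = 5, ncol = 3)  # N = 5, P = 3
covars_r <- c(1, 0, 1)                                # select features 1 and 3
phi0     <- -0.5                                      # intercept
phi      <- c(2, -1)                                  # one coefficient per selected feature

# 'covars_r' has length P; 'phi' has length sum(covars_r).
stopifnot(length(covars_r) == ncol(data), length(phi) == sum(covars_r))

# Linear predictor, then the logistic link: plogis(x) = 1 / (1 + exp(-x)).
eta    <- phi0 + data[, covars_r == 1, drop = FALSE] %*% phi
prob_r <- as.vector(plogis(eta))

print(round(prob_r, 3))
```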

ignorable

TRUE/FALSE: Whether missingness is ignorable (MCAR/MAR) or nonignorable (MNAR, default). If missingness is known to be ignorable, setting 'ignorable = TRUE' omits the missingness model.

covars_r

Vector of 1's and 0's indicating whether each feature is included as a covariate in the missingness model. Must have length P (i.e. 'ncol(data)'). Defaults to using all features as covariates. Need not be specified if 'ignorable = TRUE'.

arch

Architecture of NIMIWAE. Can be "IWAE" or "VAE". "VAE" is the special case of "IWAE" in which only one sample is drawn from the joint posterior of (z, xm).
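The relationship between the two architectures can be illustrated with the generic importance-weighted bound, which averages K importance weights inside the logarithm. The sketch below is not NIMIWAE's implementation: the function name `iwae_bound` and the log-weight values are made up for illustration, and computing the actual weights from samples of (z, xm) is not shown.

```r
# Generic importance-weighted bound estimate from K log importance weights:
# log( (1/K) * sum_k exp(log_w[k]) ), using the log-sum-exp trick for stability.
iwae_bound <- function(log_w) {
  m <- max(log_w)
  m + log(mean(exp(log_w - m)))
}

log_w <- c(-3.2, -2.9, -3.5, -3.0, -3.1)  # illustrative values, K = 5

k5 <- iwae_bound(log_w)      # IWAE-style estimate with K = 5 samples
k1 <- iwae_bound(log_w[1])   # K = 1: reduces to the single log weight (VAE-style)

print(c(K5 = k5, K1 = k1))
```

With a single sample the average has one term, so the estimate equals that sample's log weight, which is the sense in which the VAE objective is a special case.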

hyperparameters

List defining the grid of hyperparameter values to search. Relevant hyperparameters: 'sigma': activation function ("relu" or "elu"); 'h': number of nodes per hidden layer; 'n_hidden_layers': number of hidden layers (in all networks except the missingness model Decoder_r); 'n_hidden_layers_r': number of hidden layers in the missingness model Decoder_r (if NULL, set to the same value as each 'n_hidden_layers' and not tuned; otherwise a separate grid of values can be tuned); 'bs': batch size; 'lr': learning rate; 'dim_z': dimensionality of the latent variable z; 'niw': number of importance weights (samples drawn from each latent space); 'n_imputations': number of imputations; 'n_epochs': maximum number of epochs.
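To see how large a given grid is, the list can be expanded into one row per combination with base R's expand.grid. The sketch below copies the default list from the Usage section (including the name 'n_hidden_layers_r0' as it appears there) and assumes P = 8 features in place of ncol(data). How NIMIWAE itself walks the grid is not described on this page; this only counts the combinations.

```r
P <- 8  # assumed number of features; stands in for ncol(data)

# Default grid as shown in the Usage section.
hyperparameters <- list(
  sigma = "elu", h = c(64L), n_hidden_layers = c(1L, 2L),
  n_hidden_layers_r0 = c(0L, 1L), bs = c(1000L), lr = c(0.001, 0.01),
  dim_z = as.integer(c(floor(P / 2), floor(P / 4))),
  niw = 5L, n_imputations = 5L, n_epochs = 2002L
)

# One row per combination of values.
grid <- expand.grid(hyperparameters, stringsAsFactors = FALSE)

nrow(grid)   # 2 * 2 * 2 * 2 = 16 combinations
head(grid, 3)
```

Each additional value in any entry multiplies the number of fits, so narrowing the grid is the simplest way to shorten a search.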

Value

res object: NIMIWAE fit containing ... on the test set

Author(s)

David K. Lim, deelim@live.unc.edu

References

https://github.com/DavidKLim/NIMIWAE

Examples

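fit_data = read_data("CONCRETE")
data = fit_data$data
g = fit_data$g
# fit_data = simulate_data(N=100000, D=1, P=2, sim_index=1)   # optionally: simulate data with 100K obs, 1 latent dim, 2 features
# choose the reference columns and the columns that will receive missingness
set.seed(111)
ref_cols = sample(1:ncol(data), ceiling(ncol(data)/2), replace=FALSE)
miss_cols = (1:ncol(data))[-ref_cols]
set.seed(222)
phis = rlnorm(ncol(data), log(5), 0.2)
fit_Missing = simulate_missing(data, miss_cols, ref_cols, pi=0.5, phis, NULL, "UV", "MNAR")
Missing = fit_Missing$Missing
# arguments are named so they are not matched positionally to 'dataset' and 'data_types';
# the 'dataset' and 'data_types' values below are assumptions and should be adjusted to your data
res = NIMIWAE(data=data, dataset="CONCRETE", data_types=rep("real", ncol(data)), Missing=Missing, g=g)    # using default hyperparameters grid
imp_metrics = processResults(data=data, Missing=Missing, g=g, res=res)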


DavidKLim/NIMIWAE documentation built on Jan. 19, 2024, 11:18 p.m.