knitr::opts_chunk$set(echo = TRUE)

\newcommand{\bX}{\mathbf{X}} \newcommand{\bZ}{\mathbf{Z}} \newcommand{\bx}{\mathbf{x}} \newcommand{\bz}{\mathbf{z}} \newcommand{\bigf}{\mathbf{f}} \newcommand{\bo}{\mathbf{0}}

Generate data

We generate the high rank continuous data matrix $\bX\in \mathbb{R}^{500\times200}$ from a low rank Gaussian copula as in the experiments of our paper and randomly mask $40\%$ observation as test set.

  library(gcimputeR)
  set.seed(410)
  var_types = list('cont'=1:200)
  f = function(x)x^3
  X = generate_LRGC(rank = 10, sigma = 0.1,n = 500,  var_types = var_types, cont_transform = f)
  # mask 40% of the original observation
  X_obs = mask_MCAR(X, mask_fraction = 0.4)

Fitting low rank Gaussian copula

Simply specify the rank to the function call.

est = impute_LRGC(X_obs, rank = 10) # around 8 secs
print("Normalized root mean squared error (NRMSE) is: ")
print(round(cal_rmse(xhat = est$Ximp, xobs = X_obs, xtrue = X, relative = TRUE), 4))

Construct confidence interval

ct = ct_impute(X_obs, est, 0.95)
loc = is.na(X_obs)
print("The empirical coverage is: ")
print(mean(X[loc] >= ct$lower[loc] & X[loc] <= ct$upper[loc]))
print("The mean confidence interval length is: ")
mean(ct$upper[loc] - ct$lower[loc])


udellgroup/mixedgcImp documentation built on Jan. 25, 2023, 7:55 p.m.