mcnnm_wc_cv: This function computes the best model fitted to the data....


View source: R/RcppExports.R

Description

This function computes the best model fitted to the data. The best values of lambda_L and lambda_H are chosen via cross-validation using all observed entries. The function creates a number of folds and, on each fold, splits the observed entries into a training set and a validation set, fits the model on the training set, and computes the root mean squared error (RMSE) on the validation set. Finally, it chooses the model that gives the smallest average RMSE.
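The final selection step can be sketched in base R. This is only an illustration: the array `rmse_by_fold` and its placeholder values are hypothetical and are not output of mcnnm_wc_cv, and the package's internal layout of these quantities is not documented here.

```r
# Hypothetical per-fold validation RMSEs, indexed [lambda_L, lambda_H, fold].
# The numbers are placeholders for illustration only.
rmse_by_fold <- array(
  c(0.9, 0.7, 0.8, 0.6,   # fold 1 (filled column by column)
    1.1, 0.5, 0.9, 0.8),  # fold 2
  dim = c(2, 2, 2)
)

# Average over folds: one RMSE per (lambda_L, lambda_H) pair.
avg_rmse <- apply(rmse_by_fold, c(1, 2), mean)

# Row/column position of the smallest average RMSE.
best <- arrayInd(which.min(avg_rmse), dim(avg_rmse))
best_lam_L_index <- best[1, 1]
best_lam_H_index <- best[1, 2]
```

The indices point into the lambda_L and lambda_H grids, so the chosen pair is the one whose average validation error is lowest.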

Usage

mcnnm_wc_cv(M, X, Z, mask, to_normalize = 1L, to_estimate_u = 1L,
  to_estimate_v = 1L, to_add_ID = 1L, num_lam_L = 30L, num_lam_H = 30L,
  niter = 100L, rel_tol = 1e-05, cv_ratio = 0.8, num_folds = 1L,
  is_quiet = 1L)

Arguments

M

Matrix of observed entries. The input should be N (number of units) by T (number of time periods).

X

Matrix containing unit-related covariates. The number of rows of X must equal the number of units (the number of rows of M). If there are no unit-related covariates, use X = matrix(0L,0,0).

Z

Matrix containing time-related covariates. The number of rows of Z must equal the number of time periods (the number of columns of M). If there are no time-related covariates, use Z = matrix(0L,0,0).

mask

Binary mask with the same shape as M indicating which entries are observed.
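The shape rules for M, X, Z and mask can be checked before calling the function. The helper `check_inputs` below is hypothetical (it is not part of MCPanel) and encodes only the constraints stated above, treating a zero-row matrix as "no covariates".

```r
# Hypothetical helper, not part of MCPanel: checks the documented shape rules.
check_inputs <- function(M, X, Z, mask) {
  stopifnot(
    is.matrix(M), is.matrix(X), is.matrix(Z), is.matrix(mask),
    all(dim(mask) == dim(M)),            # mask has the same shape as M
    all(mask %in% c(0, 1)),              # mask is binary
    nrow(X) == 0 || nrow(X) == nrow(M),  # one row of X per unit
    nrow(Z) == 0 || nrow(Z) == ncol(M)   # one row of Z per time period
  )
  invisible(TRUE)
}

set.seed(1)
N <- 6; T_periods <- 4
M    <- matrix(rnorm(N * T_periods), N, T_periods)
mask <- matrix(rbinom(N * T_periods, 1, 0.8), N, T_periods)
X    <- matrix(rnorm(N * 2), N, 2)  # two unit-related covariates
Z    <- matrix(0L, 0, 0)            # no time-related covariates

check_inputs(M, X, Z, mask)
# fit <- mcnnm_wc_cv(M, X, Z, mask)  # requires the MCPanel package
```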

to_normalize

Optional boolean parameter indicating whether to normalize the covariates (the columns of X and Z). The default value is 1. If this value is set to 0, the result is sensitive to the scale of the covariates.
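To see why scale matters, the sketch below rescales each covariate column to unit Euclidean norm. This particular rule is an assumption made for illustration; the normalization the package applies is not stated here. `normalize_cols` is a hypothetical helper.

```r
# Assumption: normalization is illustrated as scaling each column to unit
# Euclidean norm. The package may use a different rule.
normalize_cols <- function(A) {
  if (ncol(A) == 0) return(A)  # nothing to do for an empty covariate matrix
  norms <- sqrt(colSums(A^2))
  norms[norms == 0] <- 1       # leave all-zero columns unchanged
  sweep(A, 2, norms, "/")
}

# Two toy columns carrying the same information on very different scales.
X <- cbind(a = c(3, 4, 0) * 1000, b = c(0.3, 0.4, 0))
X_scaled <- normalize_cols(X)
```

After rescaling, both columns are identical, so a penalty on their coefficients treats them the same way regardless of the original units.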

to_estimate_u

Optional boolean input indicating whether to estimate fixed unit effects (row means of M). Default is 1.

to_estimate_v

Optional boolean input indicating whether to estimate fixed time effects (column means of M). Default is 1.

to_add_ID

Optional boolean parameter indicating whether identity matrices are concatenated with X and Z in the model X * H * Z'. The default value is 1 (identity matrices are concatenated), in which case the model becomes X*H_X + X*H_XZ*Z^T + H_Z*Z^T (the remaining block of H is forced to zero).
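The decomposition above can be verified numerically. The sketch below assumes the covariates are placed first and the identity second when concatenating; the package's actual block ordering is not stated here, and all matrices are random placeholders.

```r
set.seed(2)
N <- 4; T_periods <- 3; P <- 2; Q <- 1
X <- matrix(rnorm(N * P), N, P)                  # unit-related covariates
Z <- matrix(rnorm(T_periods * Q), T_periods, Q)  # time-related covariates

# Assumed ordering: covariates first, identity second.
X_tilde <- cbind(X, diag(N))          # N x (P + N)
Z_tilde <- cbind(Z, diag(T_periods))  # T x (Q + T)

# Blocks of H; the lower-right (identity-by-identity) block is forced to zero.
H_XZ <- matrix(rnorm(P * Q), P, Q)
H_X  <- matrix(rnorm(P * T_periods), P, T_periods)
H_Z  <- matrix(rnorm(N * Q), N, Q)
H <- rbind(
  cbind(H_XZ, H_X),
  cbind(H_Z, matrix(0, N, T_periods))
)

full  <- X_tilde %*% H %*% t(Z_tilde)
parts <- X %*% H_X + X %*% H_XZ %*% t(Z) + H_Z %*% t(Z)
max_abs_diff <- max(abs(full - parts))  # zero up to floating-point error
```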

num_lam_L

Optional parameter giving the number of lambda_L values to consider. The default is 30. The lambda_L values range from the smallest value that makes L zero down to 1e-3 times that value.

num_lam_H

Optional parameter giving the number of lambda_H values to consider. The default is 30. The lambda_H values range from the smallest value that makes H zero down to 1e-3 times that value.
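A grid of this kind could be built as follows. The text above gives only the endpoints, so the logarithmic spacing used here is an assumption; `lambda_grid` is a hypothetical helper, and `lam_max` stands for the smallest value that makes L (or H) zero.

```r
# Hypothetical helper: num_lam values from lam_max down to ratio * lam_max.
# Log spacing is an assumption; only the endpoints are documented.
lambda_grid <- function(lam_max, num_lam = 30L, ratio = 1e-3) {
  exp(seq(log(lam_max), log(ratio * lam_max), length.out = num_lam))
}

grid <- lambda_grid(lam_max = 2.5)  # 2.5 is a placeholder value
```

With both num_lam_L and num_lam_H at their defaults, cross-validation evaluates 30 x 30 pairs of penalty values per fold.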

niter

Optional parameter giving the maximum number of iterations the algorithm takes for each fixed value of lambda_L. The default value is 100, which is sufficiently large because the algorithm uses a warm-start strategy.

rel_tol

Optional parameter for the stopping rule. Once the relative improvement in the objective value drops below rel_tol, execution is halted. The default value is 1e-5.
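The interaction between niter and rel_tol can be illustrated with a generic iterative solver. The objective below is a stand-in (a one-dimensional quadratic minimized by gradient descent), not the MCPanel objective, and the exact formula the package uses for the relative improvement is an assumption.

```r
# Stand-in objective and gradient, not the MCPanel objective.
objective <- function(x) (x - 3)^2 + 1
gradient  <- function(x) 2 * (x - 3)

run_until_converged <- function(x0, niter = 100L, rel_tol = 1e-5, step = 0.25) {
  x <- x0
  obj_old <- objective(x)
  iter <- 0L
  for (iter in seq_len(niter)) {
    x <- x - step * gradient(x)
    obj_new <- objective(x)
    # Assumed definition: decrease in objective relative to its previous value.
    rel_improvement <- (obj_old - obj_new) / abs(obj_old)
    if (rel_improvement < rel_tol) break  # stopping rule
    obj_old <- obj_new
  }
  list(x = x, iterations = iter)
}

res <- run_until_converged(x0 = 0)
```

Here the loop stops on the tolerance well before the iteration cap; niter only matters when convergence is slow.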

cv_ratio

Optional parameter giving the fraction of observed entries used for training; the remaining 1 - cv_ratio is used for validation. For each fold these two sets are chosen randomly. The default value is 0.8, i.e. an 80/20 training/validation split.
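One fold of such a split might look like the sketch below. It assumes that a mask value of 1 marks an observed entry and that exactly round(cv_ratio * n) observed entries are drawn for training; the package's sampling scheme may differ. The `rmse` helper is hypothetical.

```r
set.seed(3)
mask <- matrix(rbinom(6 * 5, 1, 0.8), 6, 5)  # assumption: 1 = observed
cv_ratio <- 0.8

# Randomly assign a cv_ratio share of the observed entries to training.
observed  <- which(mask == 1)
n_train   <- round(cv_ratio * length(observed))
train_idx <- observed[sample.int(length(observed), n_train)]

train_mask <- matrix(0L, nrow(mask), ncol(mask))
train_mask[train_idx] <- 1L
valid_mask <- mask - train_mask  # observed entries not used for training

# Hypothetical helper: RMSE of a fitted matrix over the validation entries.
rmse <- function(M, M_hat, valid_mask) {
  sqrt(sum(valid_mask * (M - M_hat)^2) / sum(valid_mask))
}
```

The two masks partition the observed entries, so no validation entry is seen during training.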

num_folds

Optional parameter giving the number of cross-validation folds. The default value is 1. For larger problems we recommend using fewer folds for faster cross-validation.

is_quiet

Optional boolean input indicating whether to suppress printing of the learning status and convergence results of the Cyclic Coordinate Descent algorithm. The default value is 1 (no output is printed).

Value

The best model fitted using lambda_L and lambda_H chosen via cross-validation, estimated on all observed entries (not only the training set). The output also includes the matrix of average root mean squared error for the different values of lambda_L and lambda_H.

Examples

mcnnm_wc_cv(M = replicate(5,rnorm(5)), X = replicate(3, rnorm(5)), Z = matrix(0L, 0, 0), mask = matrix(rbinom(5*5,1,0.8),5,5))

See Also

mcnnm_cv


susanathey/MCPanel documentation built on May 29, 2019, 9:51 a.m.