init_lv: Initialize Latent Variable

View source: R/init_latent.R

init_lv    R Documentation

Initialize Latent Variable

Description

Initialize latent variable with GLRM or PCA.

Usage

init_lv(
  X,
  minprop = 0.2,
  terms,
  method = c("pca", "glrm"),
  k = 5,
  nRounds = 1,
  nMax = NULL,
  nRand = 1,
  h2o.init.args,
  h2o.glrm.args,
  seed = NULL,
  ...
)

Arguments

X

A matrix of observations that are 0, 1 or NA.

minprop

The minimum proportion voting in the minority, used as the cutoff. Votes in which fewer than n times minprop votes are cast on the minority side are removed from the vote matrix.
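
The minprop cutoff can be illustrated with a small base R sketch. This is not the package's own code: it assumes that n means the number of non-missing votes on a bill and that bills are the columns of X. The toy matrix and the names yea, nay, minority, n_voting and keep are hypothetical.

```r
# Toy vote matrix: 10 legislators (rows) by 4 bills (columns); values 0, 1 or NA.
X <- cbind(
  c(rep(1, 9), 0),               # 9-1 split: minority share 0.10
  c(rep(1, 6), rep(0, 4)),       # 6-4 split: minority share 0.40
  c(rep(1, 5), rep(0, 3), NA, NA), # 5-3 split with 2 missing: share 0.375
  rep(1, 10)                     # unanimous: minority share 0
)
minprop <- 0.2

# Count votes on each side, ignoring missing values.
yea <- colSums(X == 1, na.rm = TRUE)
nay <- colSums(X == 0, na.rm = TRUE)
minority <- pmin(yea, nay)

# Assumption: n is the number of non-missing votes on each bill.
n_voting <- yea + nay

# Keep bills whose minority side has at least n * minprop votes.
keep <- minority >= minprop * n_voting
X_reduced <- X[, keep, drop = FALSE]
```

Under these assumptions the lopsided first bill and the unanimous fourth bill are dropped, and X_reduced keeps the two middle columns.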

terms

A vector of values identifying the term in which the vote was taken.

method

A character string, either "pca" or "glrm", identifying the method for initializing the latent variable.

k

Scalar giving the number of dimensions to be estimated. Note, this does not necessarily have to be the same as the number of dimensions estimated in the final model.

nRounds

Number of rounds of GLRM. If this is 1, then all bills are fed into the GLRM at once. If this is greater than 1, then ncol(X)/nRounds bills are fed into the GLRM each time and the resulting initialized latent variables are averaged over the different runs. The right value will depend on the computational resources at hand, but we suggest the smallest value such that ncol(X)/nRounds < 5000 as a good place to start. If set to NULL, the algorithm uses ceiling(ncol(X)/5000).
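
The arithmetic behind the NULL default can be checked in a few lines of base R. The bill count below is hypothetical, and the random assignment of bills to rounds is only an assumption about how the columns might be partitioned, not a description of what init_lv does internally.

```r
# Hypothetical number of bills, i.e. ncol(X).
n_bills <- 12000

# Default described above when nRounds = NULL.
nRounds <- ceiling(n_bills / 5000)

# Approximate number of bills fed into the GLRM in each round.
bills_per_round <- n_bills / nRounds

# One possible way (an assumption) to assign bills to rounds at random
# so that the rounds are of equal size.
set.seed(1)
round_id <- sample(rep_len(seq_len(nRounds), n_bills))
table(round_id)
```

With 12000 bills this gives nRounds = 3 and 4000 bills per round, which is below the suggested 5000 threshold.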

nMax

Maximum number of bills to take per term if nRand is bigger than 1.

nRand

Number of random samples from the input matrix to use in producing the output.

h2o.init.args

A list of arguments to be passed to h2o.init.

h2o.glrm.args

A list of arguments to be passed to h2o.glrm.

seed

Random number generator seed.

...

Other arguments to be passed down; currently unimplemented.

Details

This function initializes a static latent variable for each observation on a given number of dimensions, using either a principal components analysis (PCA) model or a generalized low-rank model (GLRM). The latter requires the h2o package to be installed.
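
To give a sense of what a PCA-based initialization might look like, here is a minimal sketch using base R's prcomp. It is not the package's implementation: the simulated data, the column-mean imputation of missing votes and the object names are assumptions made for illustration only.

```r
set.seed(123)
n <- 30   # observations (e.g. legislators)
p <- 12   # votes
k <- 2    # dimensions to initialize

# Simulated 0/1 vote matrix with some missing entries.
X <- matrix(rbinom(n * p, 1, 0.5), nrow = n, ncol = p)
X[sample(length(X), 20)] <- NA

# Assumption: replace each missing entry with its column mean,
# because prcomp() does not accept missing values.
col_means <- colMeans(X, na.rm = TRUE)
X_imp <- X
idx <- which(is.na(X_imp), arr.ind = TRUE)
X_imp[idx] <- col_means[idx[, "col"]]

# Principal components on the centered matrix; the first k score
# columns serve as the n x k starting values.
pc <- prcomp(X_imp, center = TRUE, scale. = FALSE)
lv <- pc$x[, seq_len(k), drop = FALSE]
dim(lv)
```

With the package installed, the corresponding call would presumably be along the lines of init_lv(X, terms = terms, method = "pca", k = 2), where terms is a vector identifying the term of each vote as described under Arguments.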

Value

A list with the reduced set of votes, their corresponding terms and an n x k matrix of latent variable estimates.


davidaarmstrong/legR documentation built on Oct. 13, 2023, 1:08 p.m.