h2o.glrm: Generalized low rank decomposition of an H2O data frame
In h2o: R Interface for the 'H2O' Scalable Machine Learning Platform

Description

Builds a generalized low rank decomposition of an H2O data frame.

Usage

h2o.glrm(
  training_frame,
  cols = NULL,
  model_id = NULL,
  validation_frame = NULL,
  ignore_const_cols = TRUE,
  score_each_iteration = FALSE,
  representation_name = NULL,
  loading_name = NULL,
  transform = c("NONE", "STANDARDIZE", "NORMALIZE", "DEMEAN", "DESCALE"),
  k = 1,
  loss = c("Quadratic", "Absolute", "Huber", "Poisson", "Hinge", "Logistic", "Periodic"),
  loss_by_col = c("Quadratic", "Absolute", "Huber", "Poisson", "Hinge", "Logistic",
    "Periodic", "Categorical", "Ordinal"),
  loss_by_col_idx = NULL,
  multi_loss = c("Categorical", "Ordinal"),
  period = 1,
  regularization_x = c("None", "Quadratic", "L2", "L1", "NonNegative", "OneSparse",
    "UnitOneSparse", "Simplex"),
  regularization_y = c("None", "Quadratic", "L2", "L1", "NonNegative", "OneSparse",
    "UnitOneSparse", "Simplex"),
  gamma_x = 0,
  gamma_y = 0,
  max_iterations = 1000,
  max_updates = 2000,
  init_step_size = 1,
  min_step_size = 1e-04,
  seed = -1,
  init = c("Random", "SVD", "PlusPlus", "User"),
  svd_method = c("GramSVD", "Power", "Randomized"),
  user_y = NULL,
  user_x = NULL,
  expand_user_y = TRUE,
  impute_original = FALSE,
  recover_svd = FALSE,
  max_runtime_secs = 0,
  export_checkpoints_dir = NULL
)

Arguments

`training_frame`	Id of the training data frame.
`cols`	(Optional) A vector containing the data columns on which the GLRM operates.
`model_id`	Destination id for this model; auto-generated if not specified.
`validation_frame`	Id of the validation data frame.
`ignore_const_cols`	`Logical`. Ignore constant columns. Defaults to TRUE.
`score_each_iteration`	`Logical`. Whether to score during each iteration of model training. Defaults to FALSE.
`representation_name`	Frame key to save resulting X.
`loading_name`	[Deprecated] Use representation_name instead. Frame key to save resulting X.
`transform`	Transformation of training data. Must be one of: "NONE", "STANDARDIZE", "NORMALIZE", "DEMEAN", "DESCALE". Defaults to NONE.
`k`	Rank of matrix approximation. Defaults to 1.
`loss`	Numeric loss function. Must be one of: "Quadratic", "Absolute", "Huber", "Poisson", "Hinge", "Logistic", "Periodic". Defaults to Quadratic.
`loss_by_col`	Loss function by column (override). Must be one of: "Quadratic", "Absolute", "Huber", "Poisson", "Hinge", "Logistic", "Periodic", "Categorical", "Ordinal".
`loss_by_col_idx`	Loss function by column index (override).
`multi_loss`	Categorical loss function. Must be one of: "Categorical", "Ordinal". Defaults to Categorical.
`period`	Length of period (only used with periodic loss function). Defaults to 1.
`regularization_x`	Regularization function for X matrix. Must be one of: "None", "Quadratic", "L2", "L1", "NonNegative", "OneSparse", "UnitOneSparse", "Simplex". Defaults to None.
`regularization_y`	Regularization function for Y matrix. Must be one of: "None", "Quadratic", "L2", "L1", "NonNegative", "OneSparse", "UnitOneSparse", "Simplex". Defaults to None.
`gamma_x`	Regularization weight on X matrix. Defaults to 0.
`gamma_y`	Regularization weight on Y matrix. Defaults to 0.
`max_iterations`	Maximum number of iterations. Defaults to 1000.
`max_updates`	Maximum number of updates. Defaults to 2000 (2*max_iterations).
`init_step_size`	Initial step size. Defaults to 1.
`min_step_size`	Minimum step size. Defaults to 0.0001.
`seed`	Seed for random numbers (affects the stochastic parts of the algorithm, which might or might not be enabled by default). Defaults to -1 (time-based random number).
`init`	Initialization mode. Must be one of: "Random", "SVD", "PlusPlus", "User". Defaults to PlusPlus.
`svd_method`	Method for computing SVD during initialization (Caution: Randomized is currently experimental and unstable). Must be one of: "GramSVD", "Power", "Randomized". Defaults to Randomized.
`user_y`	User-specified initial Y.
`user_x`	User-specified initial X.
`expand_user_y`	`Logical`. Expand categorical columns in user-specified initial Y. Defaults to TRUE.
`impute_original`	`Logical`. Reconstruct original training data by reversing transform. Defaults to FALSE.
`recover_svd`	`Logical`. Recover singular values and eigenvectors of XY. Defaults to FALSE.
`max_runtime_secs`	Maximum allowed runtime in seconds for model training. Use 0 to disable. Defaults to 0.
`export_checkpoints_dir`	Automatically export generated models to this directory.
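
How `loss`, `regularization_x`, `regularization_y`, `gamma_x` and `gamma_y` fit together may be easier to see as a formula. The block below is a sketch of the objective in the notation of Udell et al. (cited under References), not a statement of H2O's exact internals. The symbols are assumptions introduced here: A is the m-by-n training frame, X is m-by-k with rows x_i, Y is k-by-n with columns y_j, Omega is the set of non-missing entries, L_j is the loss chosen for column j, and r_x and r_y are the chosen regularization functions.

```latex
\min_{X,\,Y} \;
  \sum_{(i,j) \in \Omega} L_j\!\left(x_i y_j,\; A_{ij}\right)
  \;+\; \gamma_x \sum_{i=1}^{m} r_x(x_i)
  \;+\; \gamma_y \sum_{j=1}^{n} r_y(y_j)
```

With `gamma_x = 0` and `gamma_y = 0` (the defaults) the two regularization terms vanish, so only the loss term is minimized.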

Value

An object of class H2ODimReductionModel.

References

M. Udell, C. Horn, R. Zadeh, S. Boyd (2014). Generalized Low Rank Models[https://arxiv.org/abs/1410.0342]. Unpublished manuscript, Stanford Electrical Engineering Department.

N. Halko, P.G. Martinsson, J.A. Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions[https://arxiv.org/abs/0909.4061]. SIAM Rev., Survey and Review section, Vol. 53, num. 2, pp. 217-288, June 2011.

Examples

## Not run: 
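library(h2o)
# Start (or connect to) a local H2O cluster
h2o.init()
# Load the example data set shipped with the h2o package
australia_path <- system.file("extdata", "australia.csv", package = "h2o")
australia <- h2o.uploadFile(path = australia_path)
# Fit a rank-5 model with quadratic loss and L1 regularization on X
h2o.glrm(training_frame = australia, k = 5, loss = "Quadratic", regularization_x = "L1",
         gamma_x = 0.5, gamma_y = 0, max_iterations = 1000)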

## End(Not run)
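
To make the idea of a rank-k decomposition concrete without an H2O cluster, here is a minimal base-R sketch. It covers only the special case of quadratic loss with no regularization, where a truncated SVD gives the best rank-k approximation; it is not how `h2o.glrm` computes its result. The matrix `A` and the names `X_true`, `Y_true`, `X`, `Y` and `quadratic_loss` are made up for illustration.

```r
# Toy numeric matrix: 6 rows, 4 columns, constructed to have rank 2.
set.seed(42)
X_true <- matrix(rnorm(6 * 2), nrow = 6)
Y_true <- matrix(rnorm(2 * 4), nrow = 2)
A <- X_true %*% Y_true

# Rank of the approximation (plays the role of the `k` argument).
k <- 2

# Truncated SVD: keep only the first k singular vectors.
s <- svd(A, nu = k, nv = k)

# Split the result into two factors so that A is approximated by X %*% Y.
# X is 6 x k, Y is k x 4. `nrow = k` keeps diag() correct when k is 1.
X <- s$u %*% diag(s$d[1:k], nrow = k)
Y <- t(s$v)

# Quadratic loss of the reconstruction: sum of squared entrywise errors.
quadratic_loss <- sum((A - X %*% Y)^2)
print(dim(X))
print(dim(Y))
print(quadratic_loss < 1e-10)
```

Because `A` was built from rank-2 factors, a rank-2 decomposition reproduces it up to floating-point error. Lowering `k` to 1 would leave a visibly larger `quadratic_loss`.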

h2o documentation built on May 29, 2024, 4:26 a.m.