factors_single: Calculate latent factors for a new user

View source: R/factors_single.R

factors_single R Documentation

Calculate latent factors for a new user

Description

Determine latent factors for a new user, given either 'X' data (a.k.a. "warm-start"), or 'U' data (a.k.a. "cold-start"), or both.

For example usage, see the main section fit_models.

Usage

factors_single(model, ...)

## S3 method for class 'CMF'
factors_single(
  model,
  X = NULL,
  X_col = NULL,
  X_val = NULL,
  U = NULL,
  U_col = NULL,
  U_val = NULL,
  U_bin = NULL,
  weight = NULL,
  output_bias = FALSE,
  ...
)

## S3 method for class 'CMF_implicit'
factors_single(
  model,
  X = NULL,
  X_col = NULL,
  X_val = NULL,
  U = NULL,
  U_col = NULL,
  U_val = NULL,
  ...
)

## S3 method for class 'ContentBased'
factors_single(model, U = NULL, U_col = NULL, U_val = NULL, ...)

## S3 method for class 'OMF_explicit'
factors_single(
  model,
  X = NULL,
  X_col = NULL,
  X_val = NULL,
  U = NULL,
  U_col = NULL,
  U_val = NULL,
  weight = NULL,
  output_bias = FALSE,
  output_A = FALSE,
  exact = FALSE,
  ...
)

## S3 method for class 'OMF_implicit'
factors_single(
  model,
  X = NULL,
  X_col = NULL,
  X_val = NULL,
  U = NULL,
  U_col = NULL,
  U_val = NULL,
  output_A = FALSE,
  ...
)

Arguments

model

A collective matrix factorization model from this package - see fit_models for details.

...

Not used.

X

New 'X' data, either as a numeric vector (class 'numeric'), or as a sparse vector from package 'Matrix' (class 'dsparseVector'). If the 'X' to which the model was fit was a 'data.frame', the column/item indices will have been reindexed internally, and the numeration can be found under 'model$info$item_mapping'. Alternatively, can instead pass the column indices and values and let the model reindex them (see 'X_col' and 'X_val'). Should pass at most one of 'X' or 'X_col'+'X_val'. Dense 'X' data is not supported for 'CMF_implicit' or 'OMF_implicit'.

X_col

New 'X' data in sparse vector format, with 'X_col' denoting the items/columns which are not missing. If the 'X' to which the model was fit was a 'data.frame', here should pass IDs matching to the second column of that 'X', which will be reindexed internally. Otherwise, should have column indices with numeration starting at 1 (passed as an integer vector). Should pass at most one of 'X' or 'X_col'+'X_val'.

X_val

New 'X' data in sparse vector format, with 'X_val' denoting the associated values to each entry in 'X_col' (should be a numeric vector of the same length as 'X_col'). Should pass at most one of 'X' or 'X_col'+'X_val'.

U

New 'U' data, either as a numeric vector (class 'numeric'), or as a sparse vector from package 'Matrix' (class 'dsparseVector'). Alternatively, if 'U' is sparse, can instead pass the indices of the non-missing columns and their values separately (see 'U_col'). Should pass at most one of 'U' or 'U_col'+'U_val'.

U_col

New 'U' data in sparse vector format, with 'U_col' denoting the attributes/columns which are not missing. Should have numeration starting at 1 (should be an integer vector). Should pass at most one of 'U' or 'U_col'+'U_val'.

U_val

New 'U' data in sparse vector format, with 'U_val' denoting the associated values to each entry in 'U_col' (should be a numeric vector of the same length as 'U_col'). Should pass at most one of 'U' or 'U_col'+'U_val'.

U_bin

Binary columns of 'U' on which a sigmoid transformation will be applied. Should be passed as a numeric vector. Note that 'U' and 'U_bin' are not mutually exclusive.

weight

(Only for the explicit-feedback models) Associated weight to each non-missing observation in 'X'. Must have the same number of entries as 'X' - that is, if passing a dense vector of length 'n', 'weight' should be a numeric vector of length 'n' too; if passing a sparse vector, it should have a length corresponding to the number of non-missing elements. Alternatively, may be a sparse matrix/vector with the same non-missing indices as 'X' (but this will not be checked).

output_bias

Whether to also return the user bias determined by the model given the data in 'X'.

output_A

Whether to return the raw 'A' factors (the free offset).

exact

(In the 'OMF_explicit' model) Whether to calculate 'A' and 'Am' with the regularization applied to 'A' instead of to 'Am' (if using the L-BFGS method, this is how the model was fit). This is usually a slower procedure. Only relevant when passing 'X' data.
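
The relationship between the dense and sparse input forms described above can be sketched in base R. This is only an illustration under assumptions: missing entries in a dense vector are taken to be NA, the vectors 'x_dense', 'w_dense' and 'item_mapping' are made up for the example (the last one stands in for 'model$info$item_mapping'), and the commented-out calls presume a fitted model object named 'model' that is not created here.

```r
# Dense form: one slot per item, NA marks items with no observation
# (assumption: NA is the missing-value marker for dense input).
x_dense <- c(NA, 4, NA, 2.5, NA, 5)

# Equivalent sparse form: 1-based indices of observed items plus their values.
X_col <- which(!is.na(x_dense))
X_val <- x_dense[X_col]

# 'weight' follows the shape of the data it accompanies:
# length n for dense input, one entry per observed value for sparse input.
w_dense  <- c(1, 2, 1, 0.5, 1, 1)   # hypothetical per-item weights
w_sparse <- w_dense[X_col]

# If the model was fit to a data.frame, positions refer to the internal
# item order. 'item_mapping' is a made-up stand-in for model$info$item_mapping.
item_mapping <- c("a", "b", "c", "d", "e", "f")
item_ids     <- item_mapping[X_col]   # IDs one could pass as 'X_col' instead

# With a model fit to a matrix, these two calls should describe the same user:
# factors_single(model, X = x_dense, weight = w_dense)
# factors_single(model, X_col = X_col, X_val = X_val, weight = w_sparse)
```

For a model fit to a 'data.frame', the second call would take 'item_ids' as 'X_col' rather than the integer positions.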

Details

Note that, regardless of whether the model was fit with the L-BFGS method or with the ALS method (using either the CG or the Cholesky solver), the new factors will be determined through the Cholesky method or through the precomputed matrices (e.g. a simple matrix-vector multiply for the 'ContentBased' model). The exception is when passing 'U_bin', in which case they will be determined through the same L-BFGS method with which the model was fit.
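
As a rough picture of what a Cholesky-based solve for a new user's factors looks like, the sketch below solves a plain L2-regularized least-squares problem against fixed item factors. It is a simplified illustration, not the package's code: it ignores biases, centering, weights and side information, and the names 'B', 'lambda' and 'solve_factors_chol' are hypothetical.

```r
# Hypothetical helper: factors 'a' minimizing
#   sum((X_val - B_obs %*% a)^2) + lambda * sum(a^2)
# via a Cholesky factorization of the normal equations.
solve_factors_chol <- function(B, X_col, X_val, lambda = 1) {
  B_obs <- B[X_col, , drop = FALSE]             # item factors of observed items
  lhs <- crossprod(B_obs) + lambda * diag(ncol(B))
  rhs <- crossprod(B_obs, X_val)
  R <- chol(lhs)                                # lhs = t(R) %*% R, R upper triangular
  y <- forwardsolve(t(R), rhs)                  # solve t(R) %*% y = rhs
  drop(backsolve(R, y))                         # solve R %*% a = y
}

set.seed(1)
B <- matrix(rnorm(6 * 3), nrow = 6, ncol = 3)   # made-up item factors (6 items, k = 3)
a <- solve_factors_chol(B, X_col = c(2L, 4L, 6L),
                        X_val = c(4, 2.5, 5), lambda = 0.5)
```

Because the regularization term makes the left-hand side positive definite, the factorization exists even when the user has fewer observed items than there are latent dimensions.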

Value

If passing 'output_bias=FALSE' and 'output_A=FALSE' (or if the method has neither argument, as for 'CMF_implicit' and 'ContentBased'), will return a vector with the obtained latent factors. If passing 'TRUE' for any of these options, will return a list with the following entries:

  • 'factors', which will contain the obtained factors for this new user.

  • 'bias' (if passing 'output_bias=TRUE'), which will contain the obtained bias for this new user (a single number).

  • 'A' (if passing 'output_A=TRUE'), which will contain the raw 'A' vector (which is added to the factors determined from user attributes in order to obtain the factorization parameters).
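
A small usage sketch of the two return shapes. It assumes the 'cmfrec' package is installed and that a 'CMF' model can be fit to a dense matrix with NA for missing entries; the data are random, and the parameter values ('k', 'lambda') are arbitrary choices for illustration rather than recommendations.

```r
if (requireNamespace("cmfrec", quietly = TRUE)) {
  library(cmfrec)
  set.seed(123)
  # Made-up ratings: 20 users x 10 items, with 60 of the 200 entries set to missing
  X <- matrix(as.numeric(sample(1:5, 200, replace = TRUE)), nrow = 20, ncol = 10)
  X[sample(length(X), 60)] <- NA
  model <- CMF(X, k = 3, lambda = 1, verbose = FALSE, nthreads = 1)

  # New user who rated items 2, 4 and 6
  x_new <- rep(NA_real_, ncol(X))
  x_new[c(2, 4, 6)] <- c(4, 2, 5)

  f_vec  <- factors_single(model, X = x_new)                      # plain vector
  f_list <- factors_single(model, X = x_new, output_bias = TRUE)  # list
  str(f_list)   # entries described above: 'factors' and 'bias'
}
```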

See Also

factors, topN_new


cmfrec documentation built on April 11, 2023, 6 p.m.