factors: Determine latent factors for new rows/users

factors R Documentation

Determine latent factors for new rows/users

Description

Determines the latent factors for new users (rows) given their counts for existing items (columns).

This function uses the same method and hyperparameters with which the model was fit. If using this for recommender systems, it is recommended to use the function 'factors.single' instead, as it is likely to be more precise.

Note that, when using 'method="pg"' (not recommended), results from this function and from 'get.factor.matrices' on the same data might differ substantially.

Usage

factors(model, X, add_names = TRUE)

Arguments

model

A Poisson factorization model as returned by 'poismf'.

X

New data for which to determine latent factors, one set of factors per row. Can be passed as a 'data.frame' or as a sparse or dense matrix (see documentation of 'poismf' for details on the data type). While other functions only accept sparse matrices in COO (triplets) format, this function will also take CSR matrices from the 'SparseM' and 'Matrix' packages (classes 'dgRMatrix'/'RsparseMatrix' for 'Matrix'). Inputs will be converted to CSR regardless of their original format.

Note that converting a matrix to 'dgRMatrix' format might require using 'as(m, "RsparseMatrix")' instead of using 'dgRMatrix' directly.
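As an illustration of the conversion mentioned above, the following is a minimal sketch using the 'Matrix' package. The toy matrix 'm' and the name 'X_csr' are made up for this example, and going through 'Matrix(m, sparse = TRUE)' first is just one possible route:

```r
library(Matrix)

# Toy count data: 3 new users (rows) by 4 existing items (columns)
m <- matrix(c(0, 2, 0, 1,
              3, 0, 0, 0,
              0, 0, 5, 1), nrow = 3, byrow = TRUE)

# Build a sparse matrix first, then coerce through the virtual class
# 'RsparseMatrix' rather than naming 'dgRMatrix' directly
X_csr <- as(Matrix(m, sparse = TRUE), "RsparseMatrix")

# Inspect the resulting concrete class and check nothing was lost
class(X_csr)
all(as.matrix(X_csr) == m)
```

An object like 'X_csr' could then be passed as 'X', provided its columns line up with the items the model was fit on.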

If passing a 'data.frame', the first column should contain row indices or IDs, and these will be internally remapped - the mapping will be available as the row names for the matrix if passing 'add_names=TRUE', or as part of the outputs if passing 'add_names=FALSE'. The IDs passed in the first column will not be matched to the existing IDs of 'X' passed to 'poismf'.

If 'X' passed to 'poismf' was a 'data.frame', 'X' here must also be passed as a 'data.frame'. If 'X' passed to 'poismf' was a matrix and 'X' here is a 'data.frame', the second column of 'X' here should contain column numbers (with numeration starting at 1).

add_names

Whether to add row names to the output matrix if the indices were internally remapped, which happens only if the 'X' here is a 'data.frame'. Note that if the indices passed in 'X' here (first and second columns) are integers, then once row names are added, subsetting the output by an integer will still give the row at that position rather than the row with that ID. That is, to obtain the row corresponding to ID=2 of 'X' from the output 'A_out', use 'A_out["2", ]', not 'A_out[2, ]'.
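The difference between subsetting by position and by ID can be seen with plain base R. In this sketch, 'A_out' is a hand-made stand-in for the function's output, and the IDs and their order are invented for illustration (the actual order after remapping may differ):

```r
# Hypothetical output: 3 rows, k = 2 factors
A_out <- matrix(c(0.1, 0.2,
                  0.3, 0.4,
                  0.5, 0.6), nrow = 3, byrow = TRUE)

# Assume the remapped row IDs came out in this order
rownames(A_out) <- c("7", "10", "2")

# Integer index: the row in position 2, which belongs to ID 10
by_position <- A_out[2, ]

# Character index: the row whose name is "2", i.e. ID 2
by_id <- A_out["2", ]

by_position
by_id
```

The two results differ whenever the ID does not happen to coincide with the row's position.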

Details

The factors are initialized to the mean of each column in the fitted model.

Value

  • If 'X' was passed as a matrix, will output a matrix of dimensions (n, k) with the obtained factors. If passing 'add_names=TRUE' and 'X' passed to 'poismf' was a 'data.frame', this matrix will have row names. Be careful when subsetting it with integers (see the documentation for 'add_names').

  • If 'X' was passed as a 'data.frame' and passing 'add_names=FALSE' here, will output a list with an entry 'factors' containing the latent factors as described above, and an entry 'mapping' indicating which row ID each row of the output corresponds to.
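The 'data.frame' case with 'add_names=FALSE' might look like the sketch below. The toy data, the column names 'user', 'item' and 'count', and the choice 'k = 2' are assumptions made for this example, and the block only runs if the 'poismf' package is installed:

```r
if (requireNamespace("poismf", quietly = TRUE)) {
    library(poismf)
    set.seed(1)

    # Toy training data in triplet form: row ID, column ID, count
    train <- data.frame(
        user  = c(1, 1, 2, 2, 3, 3, 4),
        item  = c(1, 2, 2, 3, 1, 3, 2),
        count = c(2, 1, 3, 1, 1, 4, 2)
    )
    model <- poismf(train, k = 2)

    # New users with counts for items the model already knows;
    # passed as a data.frame because the model was fit on one
    new_rows <- data.frame(
        user  = c(101, 101, 205),
        item  = c(1, 3, 2),
        count = c(1, 2, 5)
    )

    # With add_names = FALSE the result is a list, not a named matrix
    res <- factors(model, new_rows, add_names = FALSE)
    str(res$factors)   # latent factors for the new rows
    res$mapping        # row ID for each row of the factors
}
```

Keeping the mapping as a separate entry avoids the integer-subsetting pitfall described under 'add_names', at the cost of looking up row positions manually.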

See Also

factors.single, topN.new


poismf documentation built on March 18, 2022, 6:19 p.m.
