View source: R/fit_recommender_model.R
| fit_recommender_model | R Documentation |
This function fits a penalized latent factor model for recommendation systems using an alternating least squares (ALS) algorithm. It estimates user effects, item effects, and latent factors simultaneously, with regularization to prevent overfitting. The implementation supports filtering out items with too few ratings.
fit_recommender_model(
rating,
user_id,
item_id,
K = 8,
lambda_1 = 5e-05,
lambda_2 = 1e-04,
min_ratings = 20,
maxit = 500,
reltol = 1e-08,
damping = 0.75,
verbose = FALSE
)
rating |
A numeric vector of observed ratings. |
user_id |
A character vector identifying the user for each rating. Must be the same length as 'rating'. |
item_id |
A character vector identifying the item for each rating. Must be the same length as 'rating'. |
K |
Integer. The number of latent factors to estimate. |
lambda_1 |
Numeric. Regularization parameter for user and item effects. |
lambda_2 |
Numeric. Regularization parameter for latent factors. |
min_ratings |
Integer. Minimum number of ratings required for an item to be included in the estimation of latent factors. |
maxit |
Integer. Maximum number of iterations. |
reltol |
Numeric. Relative tolerance for convergence, based on the change in the objective function between iterations. |
damping |
Numeric between 0 and 1. Damping factor used to blend updates with the previous iteration for convergence stability. |
verbose |
Logical. If 'TRUE', prints progress messages during optimization. |
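The roles of 'maxit', 'reltol', and 'damping' can be illustrated on a toy fixed-point iteration. This is only a minimal sketch, not the package's implementation: the helper `damped_iterate` is hypothetical, the blending direction (weight 'damping' on the new update, the remainder on the previous value) is an assumption, and the stopping rule is borrowed from the convention documented for `stats::optim` rather than taken from this function's source.

```r
# Hypothetical helper: damped fixed-point iteration with a relative-tolerance stop.
# `update` maps the current value to a proposed new value; `objective` scores it.
damped_iterate <- function(x0, update, objective,
                           maxit = 500, reltol = 1e-08, damping = 0.75) {
  x <- x0
  obj_old <- objective(x)
  for (it in seq_len(maxit)) {
    proposal <- update(x)
    # Assumed blend: weight `damping` on the proposal, the rest on the old value
    x <- damping * proposal + (1 - damping) * x
    obj_new <- objective(x)
    # Assumed stopping rule: change in the objective is small relative to its size
    if (abs(obj_old - obj_new) <= reltol * (abs(obj_old) + reltol)) {
      return(list(x = x, iterations = it, converged = TRUE))
    }
    obj_old <- obj_new
  }
  list(x = x, iterations = maxit, converged = FALSE)
}

# Toy problem: minimise (x - 2)^2, where the undamped update jumps straight to 2
res <- damped_iterate(x0 = 10,
                      update = function(x) 2,
                      objective = function(x) (x - 2)^2)
```

Under these assumptions, a 'damping' value closer to 1 accepts more of each new update, while smaller values take more cautious steps and need more iterations to satisfy 'reltol'.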
The model being fit is:
Y_{ij} = \mu + \alpha_i + \beta_j + \sum_{k=1}^K p_{ik} q_{jk} + \varepsilon_{ij}
where \mu is the global mean, \alpha_i are user effects,
\beta_j are item effects, and the p_{ik}, q_{jk} are latent
factors for users and items respectively. The estimation minimizes a penalized
least squares criterion with separate penalties for user/item effects and
latent factors.
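One plausible form of such a criterion is written out here as a sketch. The ridge (squared) form of the penalties, the absence of scaling constants such as 1/2 or 1/N, and the treatment of \mu as a free parameter are assumptions; \Omega, the set of observed user-item pairs, is notation introduced only for this illustration.

```latex
\min_{\mu, \alpha, \beta, p, q} \;
\sum_{(i,j) \in \Omega}
  \Big( Y_{ij} - \mu - \alpha_i - \beta_j - \sum_{k=1}^K p_{ik} q_{jk} \Big)^2
+ \lambda_1 \Big( \sum_i \alpha_i^2 + \sum_j \beta_j^2 \Big)
+ \lambda_2 \Big( \sum_{i,k} p_{ik}^2 + \sum_{j,k} q_{jk}^2 \Big)
```

Here \lambda_1 corresponds to 'lambda_1' (user and item effects) and \lambda_2 to 'lambda_2' (latent factors).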
Items with fewer than 'min_ratings' observations are excluded from the estimation of 'p' and 'q'.
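The filtering rule can be sketched with base R on a toy vector of item identifiers. This shows only one way to count ratings per item and apply the threshold; the variable names `keep_items` and `in_factor_fit` are hypothetical and are not part of the function's interface.

```r
# Toy item identifiers: item "A" has 3 ratings, "B" has 1, "C" has 2
item_id <- c("A", "A", "A", "B", "C", "C")
min_ratings <- 2

# Count ratings per item and store them as a named integer vector
counts <- table(item_id)
n_item <- setNames(as.integer(counts), names(counts))

# Items with at least `min_ratings` ratings
keep_items <- names(n_item)[n_item >= min_ratings]

# Flag the observations that belong to a retained item
in_factor_fit <- item_id %in% keep_items
```

In this toy case item "B" falls below the threshold, so its single rating would not contribute to the latent factors.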
A list with the following components:
mu |
Global mean rating. |
a |
Named numeric vector of user-specific effects. |
b |
Named numeric vector of item-specific effects. |
p |
Matrix of user latent factors, with one row per user. The rownames of this matrix match the names of 'a'. |
q |
Matrix of item latent factors, with one row per item. The rownames of this matrix match the names of 'b'. |
min_ratings |
The threshold value used to filter items: only items with at least this many ratings are included in the estimation of latent factors. |
n_item |
Named integer vector of number of ratings per item. |
n_user |
Named integer vector of number of retained ratings per user. |
fitted |
Fitted values. |
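The returned components can be combined according to the model equation to score a user-item pair. The sketch below uses a mock list with made-up values in place of a real fit, and `predict_pairs` is a hypothetical helper, not a function exported by the package. It assumes every requested user and item appears in the names of 'a', 'b', 'p', and 'q'; how the package represents factors for items below 'min_ratings' is not stated here, so that case is not handled.

```r
# Mock object with the same component names as the documented return value
# (values are made up for illustration)
fit <- list(
  mu = 3.5,
  a = c(u1 = 0.2, u2 = -0.1),
  b = c(i1 = 0.5, i2 = -0.3),
  p = matrix(c(1, 0,
               0, 1), nrow = 2, byrow = TRUE,
             dimnames = list(c("u1", "u2"), NULL)),
  q = matrix(c(0.4, 0.0,
               0.0, -0.2), nrow = 2, byrow = TRUE,
             dimnames = list(c("i1", "i2"), NULL))
)

# Hypothetical helper: mu + a_i + b_j + sum_k p_ik * q_jk for each (user, item) pair
predict_pairs <- function(fit, user_id, item_id) {
  latent <- rowSums(fit$p[user_id, , drop = FALSE] *
                      fit$q[item_id, , drop = FALSE])
  unname(fit$mu + fit$a[user_id] + fit$b[item_id] + latent)
}

# Score the pairs (u1, i1) and (u2, i2)
pred <- predict_pairs(fit, user_id = c("u1", "u2"), item_id = c("i1", "i2"))
```

Indexing by character names relies on the rownames of 'p' and 'q' matching the names of 'a' and 'b', as described above.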
set.seed(2010)
## Simulation settings
n_users <- 200 # number of users
n_items <- 300 # number of items
K_true <- 4 # true number of latent factors
sparsity <- 0.25 # ~25% of user-item pairs are observed
## True parameters
mu <- 3.5
a_true <- rnorm(n_users, 0, 0.3) # user effects
b_true <- rnorm(n_items, 0, 0.4) # item effects
p_true <- matrix(rnorm(n_users * K_true, 0, 0.5), n_users, K_true)
q_true <- matrix(rnorm(n_items * K_true, 0, 0.5), n_items, K_true)
names(a_true) <- 1:n_users
names(b_true) <- 1:n_items
rownames(p_true) <- 1:n_users
rownames(q_true) <- 1:n_items
## Generate observed ratings matrix with sparsity
user_id <- rep(as.character(1:n_users), each = n_items)
item_id <- rep(as.character(1:n_items), times = n_users)
## Which entries are observed?
obs <- runif(length(user_id)) < sparsity
## Ratings with noise
rating_full <- mu + a_true[user_id] + b_true[item_id] +
rowSums(p_true[user_id, ] * q_true[item_id, ]) +
rnorm(length(user_id), 0, 0.25)
rating <- rating_full[obs]
user_id <- user_id[obs]
item_id <- item_id[obs]
## Call your recommender function
fit <- fit_recommender_model(rating, user_id, item_id, K = 4, reltol = 1e-5,
min_ratings = 5, verbose = TRUE)
plot(fit$fitted, rating)