fit_recommender_model: Fit a Latent Factor Recommender Model via Alternating Least...
In dslabs: Data Science Labs

fit_recommender_model

R Documentation

Fit a Latent Factor Recommender Model via Alternating Least Squares

Description

This function fits a penalized latent factor model for recommendation systems using an alternating least squares (ALS) algorithm. It estimates user effects, item effects, and latent factors simultaneously, with regularization to prevent overfitting. The implementation supports filtering out items with too few ratings.

Usage

fit_recommender_model(
  rating,
  user_id,
  item_id,
  K = 8,
  lambda_1 = 5e-05,
  lambda_2 = 1e-04,
  min_ratings = 20,
  maxit = 500,
  reltol = 1e-08,
  damping = 0.75,
  verbose = FALSE
)

Arguments

`rating`	A numeric vector of observed ratings.
`user_id`	A character vector identifying the user for each rating. Must be the same length as 'rating'.
`item_id`	A character vector identifying the item for each rating. Must be the same length as 'rating'.
`K`	Integer. The number of latent factors to estimate.
`lambda_1`	Numeric. Regularization parameter for user and item effects.
`lambda_2`	Numeric. Regularization parameter for latent factors.
`min_ratings`	Integer. Minimum number of ratings required for an item to be included in the estimation of latent factors.
`maxit`	Integer. Maximum number of iterations.
`reltol`	Numeric. Relative tolerance for convergence, based on the change in the objective function between iterations.
`damping`	Numeric between 0 and 1. Damping factor used to blend updates with the previous iteration for convergence stability.
`verbose`	Logical. If 'TRUE', prints progress messages during optimization.
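
To illustrate how a relative tolerance such as 'reltol' is typically applied, the sketch below compares the objective value from two consecutive iterations. The helper name `has_converged` and the exact form of the test are assumptions made for illustration; this page does not state the precise rule used internally.

```r
# Hypothetical helper (not part of dslabs): stop when the relative change
# in the objective between two iterations falls below reltol.
# Assumes obj_prev is nonzero.
has_converged <- function(obj_prev, obj_curr, reltol = 1e-08) {
  abs(obj_prev - obj_curr) / abs(obj_prev) < reltol
}

big_change  <- has_converged(1, 0.999)      # relative change of about 1e-3
tiny_change <- has_converged(1, 1 - 1e-10)  # relative change of about 1e-10
```

With the default 'reltol', the first call reports that iteration should continue and the second that it can stop. A looser value, such as the 'reltol = 1e-5' used in the Examples, stops earlier.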

Details

The model being fit is:

Y_{ij} = \mu + \alpha_i + \beta_j + \sum_{k=1}^K p_{ik} q_{jk} + \varepsilon_{ij}

where \mu is the global mean, \alpha_i are user effects, \beta_j are item effects, and the p_{ik}, q_{jk} are latent factors for users and items respectively. The estimation minimizes a penalized least squares criterion with separate penalties for user/item effects and latent factors.
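
For orientation, a penalized least squares criterion of the kind described above is often written as follows. This is one common form, not necessarily the one implemented: the 1/N scaling, the set \mathcal{O} of observed (user, item) pairs, its size N, and the choice to treat \mu as fixed are all assumptions that this page does not confirm.

```latex
\min_{\alpha,\,\beta,\,p,\,q}\;
\frac{1}{N} \sum_{(i,j) \in \mathcal{O}}
  \Bigl( Y_{ij} - \mu - \alpha_i - \beta_j - \sum_{k=1}^K p_{ik} q_{jk} \Bigr)^2
+ \lambda_1 \Bigl( \sum_i \alpha_i^2 + \sum_j \beta_j^2 \Bigr)
+ \lambda_2 \Bigl( \sum_i \sum_{k=1}^K p_{ik}^2 + \sum_j \sum_{k=1}^K q_{jk}^2 \Bigr)
```

Here \lambda_1 and \lambda_2 correspond to the arguments 'lambda_1' and 'lambda_2'.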

Items with fewer than 'min_ratings' observations are excluded from the estimation of 'p' and 'q'.
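
The filtering rule can be sketched in a few lines of base R. The toy data and the variable names `keep_items` and `use_for_factors` are hypothetical; the sketch only shows which ratings would meet the threshold, not how the function handles the excluded items internally.

```r
# Toy data (hypothetical): item "A" has 3 ratings, "B" has 1, "C" has 2
item_id <- c("A", "A", "B", "A", "C", "C")
min_ratings <- 2

# Count ratings per item; table() returns counts named by item
n_item <- table(item_id)

# Items that meet the threshold
keep_items <- names(n_item)[n_item >= min_ratings]

# Logical index of the ratings that belong to retained items
use_for_factors <- item_id %in% keep_items
```

In this toy case item "B" falls below the threshold, so its single rating is flagged `FALSE` in `use_for_factors`.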

Value

A list with the following components:

`mu`	Global mean rating.
`a`	Named numeric vector of user-specific effects.
`b`	Named numeric vector of item-specific effects.
`p`	Matrix of user latent factors, with one row per user. The rownames of this matrix match the names of 'a'.
`q`	Matrix of item latent factors, with one row per item. The rownames of this matrix match the names of 'b'.
`min_ratings`	The threshold value used to filter items: only items with at least this many ratings are included in the estimation of latent factors.
`n_item`	Named integer vector of number of ratings per item.
`n_user`	Named integer vector of number of retained ratings per user.
`fitted`	Fitted values.
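
The returned components can be combined according to the model equation in Details to score a given user-item pair. The sketch below uses a small hand-made list in place of a real fit, and the helper `predict_rating` is hypothetical, not a dslabs function. It assumes every requested user and item has a row in 'p' and 'q'; how items below 'min_ratings' are represented in 'q' is not stated on this page.

```r
# Hypothetical stand-in for the list returned by fit_recommender_model()
fit <- list(
  mu = 3.5,
  a = c(u1 = 0.2, u2 = -0.1),
  b = c(i1 = 0.3, i2 = -0.4),
  p = matrix(c(1, 0,
               0, 1), nrow = 2, byrow = TRUE,
             dimnames = list(c("u1", "u2"), NULL)),
  q = matrix(c(0.5,  0,
               0,   -0.5), nrow = 2, byrow = TRUE,
             dimnames = list(c("i1", "i2"), NULL))
)

# mu + a_i + b_j + sum_k p_ik * q_jk, evaluated for paired user/item IDs
predict_rating <- function(fit, user_id, item_id) {
  latent <- rowSums(fit$p[user_id, , drop = FALSE] *
                    fit$q[item_id, , drop = FALSE])
  unname(fit$mu + fit$a[user_id] + fit$b[item_id] + latent)
}

pred <- predict_rating(fit, c("u1", "u2"), c("i1", "i2"))
```

Indexing by name relies on the rownames of 'p' and 'q' matching the names of 'a' and 'b', as described above.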

Examples

set.seed(2010)
## Simulation settings
n_users <- 200       # number of users
n_items <- 300       # number of items
K_true  <- 4         # true number of latent factors
sparsity <- 0.25     # ~25% of user-item pairs are observed

## True parameters
mu <- 3.5
a_true <- rnorm(n_users, 0, 0.3)     # user effects
b_true <- rnorm(n_items, 0, 0.4)     # item effects
p_true <- matrix(rnorm(n_users * K_true, 0, 0.5), n_users, K_true)
q_true <- matrix(rnorm(n_items * K_true, 0, 0.5), n_items, K_true)

names(a_true) <- 1:n_users
names(b_true) <- 1:n_items
rownames(p_true) <- 1:n_users
rownames(q_true) <- 1:n_items
## Generate observed ratings matrix with sparsity
user_id <- rep(as.character(1:n_users), each = n_items)
item_id <- rep(as.character(1:n_items), times = n_users)


## Which entries are observed?
obs <- runif(length(user_id)) < sparsity

## Ratings with noise
rating_full <- mu + a_true[user_id] + b_true[item_id] +
  rowSums(p_true[user_id, ] * q_true[item_id, ]) +
  rnorm(length(user_id), 0, 0.25)

rating <- rating_full[obs]
user_id <- user_id[obs]
item_id <- item_id[obs]

## Fit the recommender model
fit <- fit_recommender_model(rating, user_id, item_id, K = 4, reltol = 1e-5,
                             min_ratings = 5, verbose = TRUE)
plot(fit$fitted, rating)

dslabs documentation built on Nov. 5, 2025, 6:07 p.m.