inst/RecSysLinModels.md
In regtools: Regression and Classification Tools

Linear Models in Recommender Systems

N. Matloff, UC Davis

In the collaborative filtering approach to recommender systems modeling, a very simple but common model for the rating user i gives to item j is

Yij = μ + ui + vj + εij

where

μ is the overall mean rating over all users and items
ui is the propensity of user i to rate items liberally or harshly
vj is the propensity of item j to be rated liberally or harshly
εij is an error term, incorporating all other factors
Taken as random variables as i and j vary over all users and items, ui, vj, and εij are independent with mean 0.
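
As a rough illustration of the model above, the following sketch simulates ratings from it in base R. Every name and number here (nUsers, nItems, the standard deviations, the 30% rating rate) is a made-up assumption for the example, not something taken from the MovieLens data.

```r
# Simulate ratings from Yij = mu + ui + vj + eps_ij (all settings are illustrative).
set.seed(1)
nUsers <- 50
nItems <- 40
mu <- 3.5                                  # overall mean rating
u <- rnorm(nUsers, mean = 0, sd = 0.5)     # user effects, mean 0
v <- rnorm(nItems, mean = 0, sd = 0.7)     # item effects, mean 0

# Each user rates only some of the items, as in real ratings data.
allPairs <- expand.grid(userId = 1:nUsers, movieId = 1:nItems)
rated <- allPairs[runif(nrow(allPairs)) < 0.3, ]

# Error term with mean 0, drawn independently of the user and item effects.
eps <- rnorm(nrow(rated), mean = 0, sd = 1)
rated$rating <- mu + u[rated$userId] + v[rated$movieId] + eps
head(rated)
```

The simulated ratings are not clipped to a 1-to-5 scale; the sketch is only meant to show how the three mean-0 terms combine around μ.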

The form of the above model suggests using linear model software, e.g.

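library(dslabs)
data(movielens)
ml <- movielens
# keep the user ID, movie ID and rating columns
ml <- ml[,c(5,1,6)]
# treat the IDs as categorical, so that lm() generates dummy variables for them
ml$userId <- as.factor(ml$userId)
ml$movieId <- as.factor(ml$movieId)
# caution: very long-running on this data
lm(rating ~ .,data=ml)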

At first glance, this seems like a questionable idea. In this version of the MovieLens data, there are 671 users and 9066 movies, thus nearly 10,000 dummy variables generated by lm(). With only about 100,000 data points, which moreover are not independent, we run a real risk of overfitting. Worse, the code is quite long-running (over 2 hours in the run I tried on an ordinary PC).
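
To see where the dummy-variable count comes from, here is a small sketch on a toy data set (4 hypothetical users and 5 hypothetical items, not the MovieLens data). It uses model.matrix() to build the same kind of design matrix that lm() would build from two factors.

```r
# Toy ratings table: every one of 4 users rates every one of 5 items.
toy <- expand.grid(userId = factor(1:4), movieId = factor(1:5))

# Design matrix for rating ~ userId + movieId under R's default treatment contrasts.
X <- model.matrix(~ userId + movieId, data = toy)

# One intercept column, plus one dummy per non-baseline level of each factor.
nDummies <- (nlevels(toy$userId) - 1) + (nlevels(toy$movieId) - 1)
ncol(X) == 1 + nDummies
```

By the same count, 671 users and 9066 movies give 670 + 9065 = 9735 dummy columns.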

But it turns out there is a simple, fast, closed-form solution, both for this model and for some more advanced versions featuring interaction terms.

Estimating μ is easy. From its definition, we take our estimate to be

Y.. = Σi Σj Yij / n

where n is the total number of data points.

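Write the above model in population form,

Y = μ + U + I + e

where U, I and e are the user effect, item effect and error term for a randomly chosen rating. Now consider user i. Conditioning on the rating coming from user i (written U = i below), the item effect and error term still have mean 0 by the independence assumption, so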

E(Y | U = i) = μ + ui

The natural estimate of the LHS is

Yi. = Σj Yij / Ni

where Ni is the number of items rated by user i, and the sum runs over those items.

Our estimate for ui is then

Yi. - Y..

A similar derivation yields our estimate for vj,

Y.j - Y..
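
The closed-form estimates above reduce to a few grouped means. The sketch below computes them in base R on a small made-up ratings table; the data frame `ratings`, its values, and the helper `predictRating()` are hypothetical and used only for illustration.

```r
# Small made-up ratings table; not every user rates every item.
ratings <- data.frame(
  userId  = factor(c("a", "a", "a", "b", "b", "c", "c", "c")),
  movieId = factor(c("m1", "m2", "m3", "m1", "m3", "m1", "m2", "m3")),
  rating  = c(4, 5, 3, 2, 1, 5, 4, 4)
)

# Estimate of mu: the overall mean Y..
muHat <- mean(ratings$rating)

# Estimates of ui and vj: per-user and per-item means minus the overall mean.
uHat <- tapply(ratings$rating, ratings$userId, mean) - muHat
vHat <- tapply(ratings$rating, ratings$movieId, mean) - muHat

# Hypothetical helper: predicted rating muHat + uHat[i] + vHat[j].
predictRating <- function(user, movie) {
  unname(muHat + uHat[user] + vHat[movie])
}

predictRating("b", "m2")
```

Here user "b" never rated item "m2", and the predicted rating is 3.5 + (1.5 - 3.5) + (4.5 - 3.5) = 2.5. The helper does not handle users or items that are absent from the data. Assuming the MovieLens data frame ml from earlier, the same three lines for muHat, uHat and vHat should apply with ml in place of ratings.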

(under construction)

regtools documentation built on March 31, 2022, 1:06 a.m.
