movielens: MovieLens data.
In m-clark/noiris: Data sets

A sample of the recent MovieLens data.

movielens

A data frame with 1 million rows and 27 columns, covering 184,164 users and 21,745 distinct movies.

user_id: Numeric user id
movie_id: Numeric movie id
rating: Numeric favorability rating from 0.5 to 5 in increments of 0.5
timestamp: Time the rating was given
title: A clean version of the title. See details.
year: The year of the title
genres: A list column whose elements are character vectors of genres applicable to the title
action: Binary indicator for the Action genre
adventure: Binary indicator for the Adventure genre
drama: Binary indicator for the Drama genre
scifi: Binary indicator for the Sci-Fi genre
thriller: Binary indicator for the Thriller genre
crime: Binary indicator for the Crime genre
romance: Binary indicator for the Romance genre
animation: Binary indicator for the Animation genre
children: Binary indicator for the Children genre
comedy: Binary indicator for the Comedy genre
fantasy: Binary indicator for the Fantasy genre
imax: Binary indicator for the IMAX genre
horror: Binary indicator for the Horror genre
western: Binary indicator for the Western genre
war: Binary indicator for the War genre
mystery: Binary indicator for the Mystery genre
musical: Binary indicator for the Musical genre
film_noir: Binary indicator for the Film-Noir genre
documentary: Binary indicator for the Documentary genre
no_genre: Binary indicator for titles with no genre listed
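
The relationship between the genres list column and the indicator columns can be sketched on a small made-up data frame. This is an illustration only: the rows below are invented, the 0/1 integer coding is an assumption (the actual columns may be stored differently), and the code is not the package's own preparation script.

```r
# Toy stand-in for a few rows of movielens; all values are made up.
toy <- data.frame(
  user_id  = c(1, 1, 2),
  movie_id = c(10, 20, 10),
  rating   = c(4.5, 3.0, 5.0)
)

# A list column: each element is a character vector of genres for that title.
toy$genres <- list(
  c("Action", "Sci-Fi"),
  "Drama",
  c("Action", "Sci-Fi")
)

# Derive binary indicators from the list column (0/1 coding assumed here).
has_genre <- function(genres, g) {
  as.integer(vapply(genres, function(x) g %in% x, logical(1)))
}
toy$action <- has_genre(toy$genres, "Action")
toy$drama  <- has_genre(toy$genres, "Drama")

# The indicators make simple subsetting possible, e.g. the mean rating
# of titles flagged as action.
action_mean <- mean(toy$rating[toy$action == 1])
```

The indicator columns are convenient for filtering and modeling, while the list column keeps the full set of genres for each title in one place.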

This is a random sample of 1 million observations from the MovieLens data, which is often used as a machine learning benchmark. It is not the MovieLens 1M data set listed on the GroupLens site. Differences from the raw data include the separation of title and year and the addition of the genres as indicator variables. See the associated ReadMe file for more details.
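
As a rough illustration of the title and year separation, raw MovieLens titles carry the year in trailing parentheses, such as "Toy Story (1995)". The sketch below shows one way such a split could be done in base R. The example titles and variable names are hypothetical, and this is not necessarily how the package cleaned the data; the ReadMe file is the reference for that.

```r
# Hypothetical raw titles in the "Title (Year)" form.
raw_title <- c("Toy Story (1995)", "Heat (1995)", "Blade Runner (1982)")

# Year: the four digits inside the trailing parentheses.
year <- as.integer(sub("^.*\\(([0-9]{4})\\) *$", "\\1", raw_title))

# Title: everything before the trailing "(YYYY)".
title <- sub(" *\\([0-9]{4}\\) *$", "", raw_title)

data.frame(title = title, year = year)
```

Titles that lack a trailing four-digit year would pass through this pattern unchanged and produce a missing year, so real data would need additional handling.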

https://grouplens.org/datasets/movielens/