noiris: Data sets

The goal of this package is primarily to provide data that is more representative of what people typically come across in the wild, or is simply more interesting (at least to me). Far too often, examples use iris, mtcars, etc. for convenience, but these are actually inconvenient for demonstrating real data and modeling problems, or are too small to be realistic examples of everyday data. This package will provide larger and, at some point, messier data that is better named, better documented, and useful across a variety of modeling contexts.

Right now it has:

gapminder_2019: a 2019 pull from http://www.gapminder.org/data/
starwars: a variety of cleaned data sets produced using the rwars package.
instructor_evaluations: a nice-sized data set for mixed/multi-level modeling taken from the 'lme4' package.
pisa: OECD's Programme for International Student Assessment with international scores for math, science, and reading, covering years 2000-2015.
world_happiness: Multiyear data set with country-level scores of 'happiness'. From the 2019 World Happiness Report; includes data from 2005-2018.
wine_reviews: Two data sets regarding wine reviews that can be used for a wide range of standard statistical and machine learning tasks.
google_apps: Ratings and other information for Google Play Store apps.
fashion_train: The 'Fashion MNIST' image data for clothing items. Also fashion_test.
gender_gap: Country-level data regarding the World Bank Gender Gap Index.
kiva: Lending information from the kiva.org online crowdfunding platform.
water_risk: Country- and province-level data regarding water risk.
big_five: Big Five personality traits.
heart_disease: The UCI heart disease data.
retirement: Retirement participation rates.
movielens: 1 million movie ratings.