wine: Portuguese wine quality
In ekstroem/mommix: Moment-Based Estimation of Regression Mixtures


The dataset contains red and white variants of the Portuguese "Vinho Verde" wine. It can be viewed as a classification or a regression task. The classes are ordered and not balanced (e.g. there are many more normal wines than excellent or poor ones). Outlier detection algorithms could be used to detect the few excellent or poor wines. Also, we are not sure if all input variables are relevant.
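
As a rough illustration of the two views of quality mentioned above, the sketch below tabulates the quality scores to show class imbalance, then treats quality once as a numeric response and once as an ordered factor. The data frame `toy` and all its values are invented for illustration. With the package installed, `data("wine", package = "mommix")` would presumably load the real data instead.

```r
# Toy stand-in for the real data: every value below is invented.
toy <- data.frame(
  alcohol = c(9.4, 9.8, 10.5, 11.2, 12.8, 9.0, 10.1, 11.0),
  quality = c(5, 5, 6, 6, 8, 3, 6, 5)
)

# Class imbalance: count the wines at each quality score.
quality_counts <- table(toy$quality)
print(quality_counts)

# Regression view: quality as a numeric response.
fit <- lm(quality ~ alcohol, data = toy)
print(coef(fit))

# Classification view: quality as an ordered factor.
toy$quality_class <- factor(toy$quality, ordered = TRUE)
print(levels(toy$quality_class))
```

The ordered factor keeps the ranking of the classes, which an unordered factor would discard.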

A data frame with 6497 observations on the following 13 variables.

fixed.acidity: a numeric vector
volatile.acidity: a numeric vector
citric.acid: a numeric vector
residual.sugar: a numeric vector
chlorides: a numeric vector
free.sulfur.dioxide: a numeric vector
total.sulfur.dioxide: a numeric vector
density: a numeric vector
pH: a numeric vector
sulphates: a numeric vector
alcohol: a numeric vector
quality: a numeric vector
colour: a character vector
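
A minimal sketch of checking a data frame against the format listed above: twelve numeric columns plus a character `colour` column. The helper `matches_wine_format` and the one-row `toy` data frame are hypothetical, and the value `"red"` for `colour` is an assumption about how the variants are coded.

```r
# The twelve numeric variables, named as in the list above.
expected_numeric <- c(
  "fixed.acidity", "volatile.acidity", "citric.acid", "residual.sugar",
  "chlorides", "free.sulfur.dioxide", "total.sulfur.dioxide", "density",
  "pH", "sulphates", "alcohol", "quality"
)

# Hypothetical helper: TRUE if df has all 13 documented columns
# with the documented types.
matches_wine_format <- function(df) {
  if (!all(c(expected_numeric, "colour") %in% names(df))) {
    return(FALSE)
  }
  all(vapply(df[expected_numeric], is.numeric, logical(1))) &&
    is.character(df$colour)
}

# One-row toy data frame with placeholder values (not real measurements).
toy <- as.data.frame(as.list(setNames(rep(1, 12), expected_numeric)))
toy$colour <- "red"  # assumed coding of the colour variable

print(matches_wine_format(toy))
print(matches_wine_format(toy[expected_numeric]))  # colour column dropped
```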

Due to privacy and logistical issues, only physicochemical (input) and sensory (output) variables are available (e.g. there is no data about grape types, wine brand, wine selling price, etc.).

Data was obtained from the UCI Machine Learning Repository, https://archive.ics.uci.edu/ml/datasets/wine+quality

P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis. Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.