whisky_collection_corrupted: Dataset with a corrupted version of the whisky_Collection...

whisky_collection_corrupted  R Documentation

Dataset with a corrupted version of the whisky_collection dataset for educational purposes

Description

This dataset is a corrupted version of the whisky_collection dataset of this package, intended for educational purposes. It contains the most common data quality problems: missing data and errors, columns and structures that are not unified, outliers and other values that will break your prediction model, and further pitfalls.
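A first look at problems of this kind could be sketched as follows. The helper quality_overview() is hypothetical and not part of this package; it only reports the class and the number of missing values per column, and it assumes the package is installed under the name dstools.

```r
# Hypothetical helper (not part of dstools): per-column class and NA count.
quality_overview <- function(df) {
  data.frame(
    column    = names(df),
    # first class entry of each column, e.g. "numeric" or "character"
    class     = unname(vapply(df, function(x) class(x)[1], character(1))),
    # number of missing values in each column
    n_missing = unname(vapply(df, function(x) sum(is.na(x)), integer(1))),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Assumption: the dataset is available from an installed dstools package.
if (requireNamespace("dstools", quietly = TRUE)) {
  data(whisky_collection_corrupted, package = "dstools")
  print(quality_overview(whisky_collection_corrupted))
}
```

Columns whose class differs from what the Format section suggests (for example a rating stored as character) are a typical starting point for cleaning.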

Usage

data(whisky_collection_corrupted)

Format

A data.frame with 42 rows and 15 variables:

NAME

Name of the whisky

distillery

Distillery that produces the whisky

LOCATION

Production location of the whisky (mostly countries or regions)

TYPE

Whisky type, e.g. single malt or blended

REGION

Region of the whisky production (mostly relevant for Scotch whiskies)

FOUNDATION

Year of the first whisky production

COORDINATES

Latitude and longitude values of the distillery

WIKIPEDIA

Link to the related article of the English Wikipedia

RATING

My personal rating of this whisky. I am open to discussing it, just write me an email if you see it differently ;-)

REVIEWS

The average rating of this whisky based on consumer reviews from many whisky online shops in 2023

CRITIQUES

The average rating of this whisky based on reviews from professional critics until 2023

SMOKENESS

My measure of how smoky vs. delicate it tastes; negative values indicate delicate

RICHNESS

My measure of how rich vs. light it tastes; negative values indicate light

PRICE

The average price level in euros of the youngest 10/12-year or consumer version, as listed on the whiskyexchange in 2023

origin

Alternative feature to LOCATION; contains the location of the whisky (mostly countries or regions)
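Because LOCATION and origin describe the same information, comparing them is one way to spot inconsistencies. The following is a minimal sketch under stated assumptions: column_mismatches() is a hypothetical helper that is not part of this package, it treats both columns as text, and it assumes the package is installed under the name dstools. What the comparison actually finds in the corrupted data is not stated here.

```r
# Hypothetical helper (not part of dstools): rows where two text columns
# disagree after trimming whitespace and ignoring case.
# A missing value on either side is counted as a mismatch.
column_mismatches <- function(df, col_a, col_b) {
  a <- as.character(df[[col_a]])
  b <- as.character(df[[col_b]])
  missing <- is.na(a) | is.na(b)
  differs <- missing | tolower(trimws(a)) != tolower(trimws(b))
  df[differs, c(col_a, col_b), drop = FALSE]
}

# Assumption: the dataset is available from an installed dstools package.
if (requireNamespace("dstools", quietly = TRUE)) {
  data(whisky_collection_corrupted, package = "dstools")
  print(column_mismatches(whisky_collection_corrupted, "LOCATION", "origin"))
}
```

Trimming and lower-casing before the comparison keeps purely cosmetic differences (such as "Scotland" vs. " scotland") out of the result, so the remaining rows are more likely to be real conflicts.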

Examples

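# Note: this example loads and plots the clean whisky_collection dataset
data(whisky_collection)

library("ggplot2")
# Horizontal bar chart of the personal rating per whisky
ggplot(whisky_collection, aes(x = NAME, y = RATING, fill = RATING)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  xlab("")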


dominikjung42/dstools documentation built on June 16, 2024, 2:40 a.m.