toyData: Small example data to show the features of dataReporter

toyData    R Documentation

Small example data to show the features of dataReporter

Description

An artificial dataset, intended for presenting the key features of dataReporter, which is a toolset for identifying potential errors in a dataset.

Usage

toyData

Format

A data.frame with 15 rows and 6 variables.

pill

A factor variable with two levels ("red" and "blue") and a few (correctly coded) missing observations. This represents the colour of a pill.

events

A numeric variable with one obvious outlier value (82), two miscoded missing values (999 and NaN) and a few correctly coded missing values. Represents the number of previous events.
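As an illustration of what "miscoded missing values" means here, the following base R sketch recodes 999 and NaN to NA. The vector is a hypothetical stand-in built from the values named above, not the actual contents of toyData$events.

```r
# Hypothetical stand-in values (assumption), not the real toyData$events
events <- c(1, 2, 999, NaN, NA, 82, 3)

# Recode the two miscoded missing values (999 and NaN) as proper NA
events_clean <- events
events_clean[is.nan(events_clean) | events_clean %in% 999] <- NA

# Count the missing values after recoding
sum(is.na(events_clean))
```

The outlier (82) is deliberately left untouched: whether it is an error is a judgment call that recoding missing values should not make.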

region

A factor variable where two of the levels ("other" and "OTHER") are the same word with different case settings. Moreover, the variable includes a Stata-style miscoded missing value ("."). Used to represent geographical regions or treatment centers.
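A case clash of this kind can be found in base R by comparing lower-cased levels. This is a minimal sketch on a hypothetical factor; the levels "a" and "b" are invented for illustration, and it is not how dataReporter itself performs the check.

```r
# Hypothetical factor (assumption): "a" and "b" are made-up region codes
region <- factor(c("a", "b", "other", "OTHER", ".", "a"))

# Levels that coincide once case is ignored
lev <- levels(region)
lower <- tolower(lev)
case_clash <- lev[lower %in% lower[duplicated(lower)]]

# Treat the Stata-style "." as missing and drop the now-unused level
region[region == "."] <- NA
region <- droplevels(region)
```

After this, case_clash holds the two conflicting levels, and "." no longer appears among the levels of region.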

change

A numeric variable (random draws from a standard normal distribution). Represents a change in a measured variable.

id

A factor variable with unique codes for each observation (a character string with a number between 1 and 15), i.e. a key variable.

spotifysong

A factor variable that has the same level ("Irrelevant") for all observations, i.e. an empty variable. Represents the latest song played on Spotify.
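The id and spotifysong variables sit at opposite extremes: one has a distinct value for every row, the other a single value for all rows. The base R sketch below checks both properties on vectors reconstructed from the descriptions above (an assumption; the real columns may differ in detail).

```r
# Reconstructed from the descriptions above (assumption), 15 rows each
id <- factor(as.character(1:15))
spotifysong <- factor(rep("Irrelevant", 15))

# A key variable has no duplicated values
is_key <- anyDuplicated(id) == 0

# An empty variable takes only one distinct value
is_constant <- length(unique(spotifysong)) == 1
```

Both flags come out TRUE for these vectors.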

Source

Artificial data

References

Petersen AH, Ekstrøm CT (2019). “dataMaid: Your Assistant for Documenting Supervised Data Quality Screening in R.” _Journal of Statistical Software_, *90*(6), 1-38. doi: 10.18637/jss.v090.i06.

Examples

data(toyData)
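A slightly longer sketch for a first look at the data, using only base R and utils functions. It assumes the dataReporter package is installed and does nothing otherwise.

```r
# Runs only if dataReporter is installed (assumption)
if (requireNamespace("dataReporter", quietly = TRUE)) {
  data(toyData, package = "dataReporter")

  # Structure of the data: 15 rows and 6 variables are expected
  str(toyData)

  # The events variable contains the outlier and the miscoded missing values
  summary(toyData$events)
}
```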


dataReporter documentation built on April 14, 2025, 1:09 a.m.