An artificial dataset, intended for presenting the key features of dataReporter, which is a toolset for identifying potential errors in a dataset.
A data.frame with 15 rows and 6 variables.
A factor variable with two levels ("red" and "blue") and a few (correctly coded) missing observations. This represents the colour of a pill.
A numeric variable with one obvious outlier value (82), two miscoded missing values (999 and NaN) and a few correctly coded missing values. This represents the number of previous events.
A factor variable where two of the levels ("other" and "OTHER") are the same word with different case settings. Moreover, the variable includes a Stata-style miscoded missing value ("."). This represents geographical regions or treatment centers.
A numeric variable (random draws from a standard normal distribution). This represents a change in a measured variable.
A factor variable with unique codes for each observation (a character string with a number between 1 and 15), i.e. a key variable.
A factor variable that has the same level ("Irrelevant") for all observations, i.e. an empty variable. This represents the latest song played on Spotify.
Artificial data
Petersen AH, Ekstrøm CT (2019). “dataMaid: Your Assistant for Documenting Supervised Data Quality Screening in R.” _Journal of Statistical Software_, *90*(6), 1-38. doi: 10.18637/jss.v090.i06.