| check | R Documentation |
Run a set of validation checks on a variable vector or a full dataset to identify potential errors. Which checks are performed depends on the class of the variable and on user inputs.
check(v, nMax = 10, checks = setChecks(), ...)
v |
the vector or the dataset ( |
nMax |
If a check is supposed to identify problematic values,
this argument controls if all of these should be pasted onto the outputted
message, or if only the first |
checks |
A list of checks to use on each supported variable type. We recommend
using |
... |
Other arguments that are passed on to the checking functions.
These include general parameters controlling how the check results are
formatted (e.g. |
The default options for each variable type are returned by calling
e.g. defaultCharacterChecks(), defaultFactorChecks(),
defaultNumericChecks(), etc. A complete overview of all default
options can be obtained by calling setChecks().
Moreover, all available checkFunctions (including both locally defined
functions and functions imported from dataMaid or other packages) can
be viewed by calling allCheckFunctions().
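The helper functions named above can be explored interactively. The following is a minimal sketch, assuming the dataMaid package is installed; the variable names numChecks and allDefaults are illustrative only, and the exact contents printed depend on the installed version.

```r
library(dataMaid)

# Names of the check functions applied to numeric variables by default
numChecks <- defaultNumericChecks()
print(numChecks)

# Default checks for every supported variable class, collected in one list
allDefaults <- setChecks()
print(names(allDefaults))

# Overview of every check function currently available in the session
allCheckFunctions()
```

Inspecting these defaults first makes it easier to decide which entries to override in the checks argument.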
If v is a variable, a list of objects of class
checkResult, each of which summarizes the result of a
checkFunction call performed on v.
See checkResult for more details. If v is a
data.frame, a list of lists of the form above
is returned instead, with one entry for each variable in v.
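The return structure described above can be inspected directly. This is a sketch under the assumption that each checkResult is a list with the entries problem, message and problemValues, as described on the checkResult help page; the object names res, first and dfRes are illustrative.

```r
library(dataMaid)

set.seed(1)
y <- c(rnorm(100), rep(99, 10))

# Checking a single variable gives a list with one element per check performed
res <- check(y)
print(names(res))

# Each element is a checkResult object
first <- res[[1]]
print(class(first))
print(first$problem)        # logical: was a problem found?
print(first$message)        # text describing the outcome
print(first$problemValues)  # the values flagged by the check, if any

# Checking a data.frame gives one such list per variable
dfRes <- check(cars)
print(names(dfRes))
```

Accessing the entries programmatically is useful when the results feed into further processing rather than being read from the printed output.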
Petersen AH, Ekstrøm CT (2019). “dataMaid: Your Assistant for Documenting Supervised Data Quality Screening in R.” _Journal of Statistical Software_, *90*(6), 1-38. doi: 10.18637/jss.v090.i06.
setChecks,
allCheckFunctions, checkResult,
checkFunction, defaultCharacterChecks,
defaultFactorChecks, defaultLabelledChecks,
defaultHavenlabelledChecks,
defaultNumericChecks, defaultIntegerChecks,
defaultLogicalChecks, defaultDateChecks
x <- 1:5
check(x)
#Annoyingly coded missing as 99
y <- c(rnorm(100), rep(99, 10))
check(y)
#Check y for outliers and print 4 decimals for problematic variables
check(y, checks = setChecks(numeric = "identifyOutliers"), maxDecimals = 4)
#Change what checks are performed on a variable, now only identifyMissing is called
# for numeric variables
check(y, checks = setChecks(numeric = "identifyMissing"))
#Check a full data.frame at once
data(cars)
check(cars)
#Check a full data.frame at once, while changing the standard settings for
#several data classes at once. Here, we omit the check of miscoded missing values for factors
#and we only do this check for numeric variables:
check(cars, checks = setChecks(factor = defaultFactorChecks(remove = "identifyMissing"),
numeric = "identifyMissing"))
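The examples above do not exercise the nMax argument. The sketch below assumes, based on the argument description, that nMax limits how many problematic values are pasted into the message of a check; the vector z and the object names short and long are made up for illustration, and the actual messages depend on what identifyOutliers flags.

```r
library(dataMaid)

set.seed(1)
# Standard normal draws plus a run of much larger values
z <- c(rnorm(100), 50:60)

# Ask for at most 3 problematic values in the message
short <- check(z, checks = setChecks(numeric = "identifyOutliers"), nMax = 3)

# Allow up to 20 problematic values in the message
long <- check(z, checks = setChecks(numeric = "identifyOutliers"), nMax = 20)

# Compare the two messages
print(short[[1]]$message)
print(long[[1]]$message)
```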