data_quality: Performs a quality audit of a table


View source: R/data_quality.R

Description

This function performs a quality check on a table. The result contains the number of missing values per variable, the quantiles of the numeric variables, and a frequency table for each categorical variable.

Usage

data_quality(
  data,
  numeric_cutoff = -1,
  na_type = NULL,
  max_length = Inf,
  global_only = FALSE
)

Arguments

data

a data.frame.

numeric_cutoff

the minimum number of distinct values required for a numeric vector not to be coerced to a factor. Defaults to -1, meaning no minimum is required.

na_type

character vector of values that should be treated as NA. Defaults to NULL, meaning no values other than regular NA are treated as NA.

max_length

the maximum number of rows in the frequency tables

global_only

logical, whether to return only the global summary
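
One way to picture the effect of 'na_type' is as a recoding step applied before missing values are counted. The sketch below uses base R only; 'recode_na' and the 'survey' table are hypothetical names made up for illustration and are not part of the package, whose internal handling may differ.

```r
# Hypothetical helper, not part of auditdata: turns extra NA markers into NA.
recode_na <- function(data, na_type = NULL) {
  # NULL means only regular NA values count as missing
  if (is.null(na_type)) {
    return(data)
  }
  data[] <- lapply(data, function(x) {
    # Compare as character so the markers can match any column type
    x[as.character(x) %in% na_type] <- NA
    x
  })
  data
}

# Illustrative data: "" and "unknown" stand for missing answers
survey <- data.frame(
  age = c("34", "unknown", "51", ""),
  city = c("Paris", "", "Lyon", NA),
  stringsAsFactors = FALSE
)
cleaned <- recode_na(survey, na_type = c("", "unknown"))

# Missing values per variable after recoding
colSums(is.na(cleaned))
```

With the default 'na_type = NULL' the table is returned unchanged, so only the regular NA in 'city' would be counted.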

Details

Variable types are determined from the column types in the input table and from the values of the other arguments. 'numeric_cutoff' causes a numeric variable to be classified as categorical if it has fewer unique values than 'numeric_cutoff'. Date, POSIXct and POSIXlt are the only classes treated as dates.
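
The classification rule described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's actual implementation: 'classify_column' and the example table 'df' are hypothetical, and whether NA counts as a distinct value is a guess (it is ignored here).

```r
# Hypothetical helper, not part of auditdata: guesses the audit type of a column.
classify_column <- function(x, numeric_cutoff = -1) {
  # Date, POSIXct and POSIXlt are the only classes treated as dates
  if (inherits(x, c("Date", "POSIXct", "POSIXlt"))) {
    return("date")
  }
  if (is.numeric(x)) {
    # Assumption: NA is not counted as a distinct value
    n_distinct <- length(unique(x[!is.na(x)]))
    # Fewer distinct values than the cutoff: treat as categorical
    if (n_distinct < numeric_cutoff) {
      return("categorical")
    }
    return("numeric")
  }
  # Everything else (character, factor, logical) is treated as categorical
  "categorical"
}

# Illustrative data
df <- data.frame(
  rating = c(1, 2, 3, 2, 1),
  price = c(9.5, 12.0, 7.25, 3.1, 8.8),
  day = as.Date("2020-03-01") + 0:4,
  label = c("a", "b", "a", "c", "b")
)

types <- sapply(df, classify_column, numeric_cutoff = 4)
types
```

Here 'rating' has three distinct values, fewer than the cutoff of 4, so it is classed as categorical, while 'price' stays numeric. With the default cutoff of -1 the comparison can never be true, so every numeric column stays numeric.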

Value

a list with a global summary and, if available, information on numeric, categorical and date variables

Examples

data(iris)
res <- data_quality(iris)
# global quality
res$global
# numerical data summary
res$numeric_output
# categorical data summary
res$categorical_output

MathieuMarauri/auditdata documentation built on March 6, 2020, 7:09 p.m.