dq_report2 | R Documentation |
Generate a full DQ report, v2
dq_report2(
study_data,
meta_data = "item_level",
label_col = LABEL,
meta_data_segment = "segment_level",
meta_data_dataframe = "dataframe_level",
meta_data_cross_item = "cross-item_level",
meta_data_v2,
...,
dimensions = c("Completeness", "Consistency"),
cores = list(mode = "socket", logging = FALSE, cpus = util_detect_cores(),
load.balancing = TRUE),
specific_args = list(),
advanced_options = list(),
author = prep_get_user_name(),
title = "Data quality report",
subtitle = as.character(Sys.Date()),
user_info = NULL,
debug_parallel = FALSE,
resp_vars = character(0),
filter_indicator_functions = character(0),
filter_result_slots = c("^Summary", "^Segment", "^DataTypePlotList",
"^ReportSummaryTable", "^Dataframe", "^Result", "^VariableGroup"),
mode = c("default", "futures", "queue", "parallel"),
mode_args = list(),
notes_from_wrapper = list()
)
study_data |
data.frame the data frame that contains the measurements |
meta_data |
data.frame the data frame that contains metadata attributes of study data |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
meta_data_segment |
data.frame – optional: Segment level metadata |
meta_data_dataframe |
data.frame – optional: Data frame level metadata |
meta_data_cross_item |
data.frame – optional: Cross-item level metadata |
meta_data_v2 |
character path to workbook like metadata file, see
|
... |
arguments to be passed to all called indicator functions if applicable. |
dimensions |
dimensions Vector of dimensions to address in the report. Allowed values in the vector are Completeness, Consistency, and Accuracy. The generated report will only cover the listed data quality dimensions. Accuracy is computationally expensive, so this dimension is not enabled by default. Completeness should be included if Consistency is included, and Consistency should be included if Accuracy is included, to avoid misleading detections of, e.g., missing codes as outliers. Please refer to the data quality concept for more details. Integrity is always included. |
cores |
integer number of CPU cores to use, or a named list with arguments for parallelMap::parallelStart, or NULL if parallel execution has already been started by the caller. Can also be a cluster. |
specific_args |
list named list of arguments specifically for one of the called functions. The names of the list elements correspond to the indicator functions whose calls should be modified. The elements are lists of arguments. |
advanced_options |
list options to set during report computation,
see |
author |
character author for the report documents. |
title |
character optional argument to specify the title for the data quality report |
subtitle |
character optional argument to specify a subtitle for the data quality report |
user_info |
list additional info stored with the report, e.g., comments, title, ... |
debug_parallel |
logical print blocks currently evaluated in parallel |
resp_vars |
variable list the name of the measurement variables for the report. If missing, all variables will be used. Only item level indicator functions are filtered, so far. |
filter_indicator_functions |
character regular expressions. An indicator function is used for the report only if its name matches one of these. If of length zero, no filtering is performed. |
filter_result_slots |
character regular expressions. A result of an indicator function is used for the report only if the result's name matches one of these. If of length zero, no filtering is performed. |
mode |
character work mode for parallel execution. default is
"default", the values mean:
- default: use |
mode_args |
list of arguments for the selected |
notes_from_wrapper |
list a list containing notes about changed labels
by |
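The recommendation given for dimensions (Completeness alongside Consistency, Consistency alongside Accuracy) can be checked before starting a long-running report. The following is a minimal sketch in base R only; check_dimensions is a hypothetical helper written for illustration and is not part of dataquieR, and dq_report2 may perform its own validation differently.

```r
# Hypothetical helper (NOT part of dataquieR): report requested dimensions
# that lack the dimension they are recommended to be combined with.
check_dimensions <- function(dimensions) {
  allowed <- c("Completeness", "Consistency", "Accuracy")
  unknown <- setdiff(dimensions, allowed)
  if (length(unknown) > 0) {
    stop("Unknown dimension(s): ", paste(unknown, collapse = ", "))
  }
  # Each name is a dimension, each value the dimension it should come with.
  needs <- c(Consistency = "Completeness", Accuracy = "Consistency")
  requested <- intersect(names(needs), dimensions)
  missing_deps <- requested[!(needs[requested] %in% dimensions)]
  unname(vapply(
    missing_deps,
    function(d) sprintf("%s requested without %s", d, needs[[d]]),
    character(1)
  ))
}

check_dimensions(c("Completeness", "Consistency"))  # no findings
check_dimensions("Accuracy")                         # one finding
```

An empty result means the requested combination follows the recommendation; any returned strings name the dimension that should be added.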
See dq_report_by for a way to generate stratified or split reports easily.
a dataquieR_resultset2 that can be printed, creating an HTML report.
as.data.frame.dataquieR_resultset
as.list.dataquieR_resultset
print.dataquieR_resultset
summary.dataquieR_resultset
dq_report_by
## Not run:
prep_load_workbook_like_file("inst/extdata/meta_data_v2.xlsx")
meta_data <- prep_get_data_frame("item_level")
meta_data_cross <- prep_get_data_frame("cross-item_level")
x <- dq_report2("study_data", dimensions = NULL, label_col = "LABEL")
xx <- pbapply::pblapply(x, util_eval_to_dataquieR_result, env = environment())
xx <- pbapply::pblapply(tail(x), util_eval_to_dataquieR_result, env = environment())
xx <- parallel
cat(vapply(x, deparse1, FUN.VALUE = character(1)), sep = "\n", file = "all_calls.txt")
rstudioapi::navigateToFile("all_calls.txt")
eval(x$`acc_multivariate_outlier.Blood pressure checks`)
## End(Not run)
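The arguments filter_indicator_functions and filter_result_slots are described as vectors of regular expressions, where a name is kept if it matches at least one of them and a zero-length vector disables filtering. The sketch below reproduces that rule in base R to help with choosing patterns. keep_matching is a hypothetical helper, the example names are illustrative, and the package's internal matching may differ in detail.

```r
# Hypothetical helper (NOT part of dataquieR): keep the names that match
# at least one pattern; zero patterns means no filtering.
keep_matching <- function(candidates, patterns) {
  if (length(patterns) == 0) {
    return(candidates)
  }
  hits <- vapply(
    candidates,
    function(nm) any(vapply(patterns, grepl, logical(1), x = nm)),
    logical(1)
  )
  candidates[hits]
}

# Illustrative result slot names (assumed for this sketch).
slots <- c("SummaryTable", "SummaryPlot", "ReportSummaryTable",
           "ModifiedStudyData")
keep_matching(slots, c("^Summary", "^ReportSummaryTable"))
keep_matching(slots, character(0))  # no filtering, all names are kept

# Illustrative indicator function names (assumed for this sketch).
fns <- c("com_item_missingness", "con_limit_deviations", "acc_loess")
keep_matching(fns, "^com_")
```

Anchoring patterns with ^ as in the defaults avoids accidental matches in the middle of a name.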