View source: R/etl_qa_run_pipeline.R
etl_qa_initial_results | R Documentation |
This function performs the core analysis for the ETL QA pipeline, processing
data based on the provided configuration. It is the second step run by etl_qa_run_pipeline.
etl_qa_initial_results(config)
config |
An S3 object of class "qa_data_config" containing configuration settings. |
This is an internal function accessible only by use of :::, for example, apde:::etl_qa_initial_results(...).
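As a minimal illustration of the difference between :: and :::, the sketch below uses a base R function rather than apde, so it runs without the package installed. It assumes only that t.test.default is an unexported object in the stats namespace; the same access pattern is what apde:::etl_qa_initial_results(...) relies on.

```r
# `::` only reaches exported objects, so looking up an internal object fails.
# The error is captured here as a message instead of stopping the script.
exported_lookup <- tryCatch(
  stats::t.test.default,
  error = function(e) conditionMessage(e)
)

# `:::` reaches any object in the namespace, exported or not.
internal_fun <- stats:::t.test.default

is.function(internal_fun)
print(exported_lookup)
```

Internal functions carry no stability guarantee, so code that depends on ::: may need updating when the package changes.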
A list of raw analytic output. The table structure may differ slightly depending on the original data_source. The list items include:
missing_data |
The proportion of missing data for each variable and time point |
vals_continuous |
The minimum, median, mean, and maximum for all numeric variables with > 6 distinct values |
vals_date |
The minimum, median, and maximum for all date / datetime variables with > 6 distinct values |
vals_categorical |
A frequency table of the 8 most frequent values of each categorical variable (and of numerics or dates with <= 6 distinct values) PLUS a rows for |
chi_standards |
Comparison of CHI (Community Health Indicator) variables values with those expected based on |
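The summaries described above can be sketched in base R on a toy data set. This is an illustration under stated assumptions, not the package's implementation: the data values, the output column names (varname, proportion, and so on), the helper is_continuous, and the choice to ignore NA when counting distinct values are all hypothetical.

```r
# Toy extract standing in for real data (hypothetical values)
toy <- data.frame(
  chi_year = rep(c(2021, 2022), each = 5),
  birth_weight_grams = c(3200, NA, 2950, 3410, 3105,
                         3400, 3100, NA, 2800, 3650),
  race4 = c("White", "Black", NA, "Asian", "White",
            "White", "White", "Asian", "Black", "White"),
  stringsAsFactors = FALSE
)
time_var <- "chi_year"
vars <- setdiff(names(toy), time_var)

# missing_data: proportion of NA per variable and time point
missing_data <- do.call(rbind, lapply(vars, function(v) {
  prop <- tapply(is.na(toy[[v]]), toy[[time_var]], mean)
  data.frame(varname = v,
             chi_year = as.numeric(names(prop)),
             proportion = as.numeric(prop))
}))

# Assumed rule: numeric with > 6 distinct non-missing values is continuous
is_continuous <- function(x) {
  is.numeric(x) && length(unique(x[!is.na(x)])) > 6
}
continuous_vars <- vars[vapply(toy[vars], is_continuous, logical(1))]

# vals_continuous: min, median, mean, max per time point
vals_continuous <- do.call(rbind, lapply(split(toy, toy[[time_var]]), function(d) {
  x <- d$birth_weight_grams
  data.frame(chi_year = d$chi_year[1],
             min = min(x, na.rm = TRUE),
             median = median(x, na.rm = TRUE),
             mean = mean(x, na.rm = TRUE),
             max = max(x, na.rm = TRUE))
}))

# vals_categorical: the (up to) 8 most frequent values of a categorical variable
top_values <- head(sort(table(toy$race4, useNA = "ifany"), decreasing = TRUE), 8)

print(missing_data)
print(vals_continuous)
print(top_values)
```

In the toy data only birth_weight_grams passes the distinct-value threshold, so race4 is summarised as a frequency table instead.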
## Not run:
# Step 1: generate a config object
myconfig <- etl_qa_setup_config(
  data_source_type = 'rads',
  data_params = list(
    function_name = 'get_data_birth',
    time_var = 'chi_year',
    time_range = c(2021, 2022),
    cols = c('chi_age', 'race4', 'birth_weight_grams', 'birthplace_city',
             'num_prev_cesarean', 'mother_date_of_birth'),
    version = 'final',
    kingco = FALSE,
    check_chi = FALSE
  ),
  output_directory = 'C:/temp/'
)
# Step 2: perform the calculations
# (internal function, so it must be called with :::)
initial_results <- apde:::etl_qa_initial_results(myconfig)
# Peek at the tables
head(initial_results$missing_data)
head(initial_results$vals_categorical)
head(initial_results$vals_continuous)
head(initial_results$vals_date)
## End(Not run)