etl_qa_initial_results: Initial QA results for ETL QA pipeline

View source: R/etl_qa_run_pipeline.R

etl_qa_initial_results    R Documentation

Initial QA results for ETL QA pipeline

Description

This function performs the core analysis for the ETL QA pipeline, processing data based on the provided configuration. It is the second step run by etl_qa_run_pipeline.

Usage

etl_qa_initial_results(config)

Arguments

config

An S3 object of class "qa_data_config" containing configuration settings.

Details

This is an internal (unexported) function. It can only be called with the ::: operator, for example, apde:::etl_qa_initial_results(...).

Value

A list of raw analytic output. The table structure may differ slightly depending on the original data_source. The list items include:

missing_data

The proportion of missing data for each variable and time point

vals_continuous

The minimum, median, mean, and maximum for all numeric variables with > 6 distinct values

vals_date

The minimum, median, and maximum for all date / datetime variables with > 6 distinct values

vals_categorical

A frequency table of the 8 most frequent values of each categorical variable (and of each numeric or date variable with <= 6 distinct values), plus a row for NA and a row for all 'Other values'

chi_standards

Comparison of CHI (Community Health Indicator) variable values with those expected based on rads.data::misc_chi_byvars
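
The rules above for routing a variable to a table, and for building the categorical frequency table, can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: summary_kind() and freq_top() are hypothetical helpers that do not exist in apde, and counting distinct values after dropping NA is an assumption.

```r
# Hypothetical helpers for illustration only; not part of apde.

# Decide which output table a variable would feed, based on its type and
# its number of distinct non-missing values (assumption: NA is not counted).
summary_kind <- function(x, threshold = 6) {
  n_distinct <- length(unique(x[!is.na(x)]))
  is_date <- inherits(x, c("Date", "POSIXct"))
  if (is.numeric(x) && n_distinct > threshold) {
    "vals_continuous"
  } else if (is_date && n_distinct > threshold) {
    "vals_date"
  } else {
    "vals_categorical"
  }
}

# Frequency table for one variable: the `top_n` most frequent values,
# one row for NA, and one row pooling every remaining value.
freq_top <- function(x, top_n = 8) {
  counts <- sort(table(x, useNA = "no"), decreasing = TRUE)
  top <- counts[seq_len(min(top_n, length(counts)))]
  other <- sum(counts) - sum(top)
  data.frame(
    value = c(names(top), NA, "Other values"),
    count = c(as.integer(top), sum(is.na(x)), other),
    stringsAsFactors = FALSE
  )
}

# Toy data: three distinct values and two missing values
x <- c(rep("a", 5), rep("b", 3), "c", NA, NA)
ft <- freq_top(x, top_n = 2)
print(ft)

kinds <- c(
  many_numbers = summary_kind(1:10),
  few_numbers  = summary_kind(c(1, 2, 3)),
  many_dates   = summary_kind(as.Date("2021-01-01") + 0:9),
  text         = summary_kind(letters)
)
print(kinds)
```

With top_n = 2, the toy vector yields rows for "a", "b", NA, and 'Other values'. The numeric vector with only three distinct values is routed to the categorical table, which mirrors the <= 6 rule described above.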

Examples

## Not run: 

# Step 1: generate a config object 
myconfig <- etl_qa_setup_config(
  data_source_type = 'rads',
  data_params = list(
    function_name = 'get_data_birth',
    time_var = 'chi_year',
    time_range = c(2021, 2022),
    cols = c('chi_age', 'race4', 'birth_weight_grams', 'birthplace_city', 
             'num_prev_cesarean', 'mother_date_of_birth'),
    version = 'final', 
    kingco = FALSE, 
    check_chi = FALSE
  ), 
  output_directory = 'C:/temp/'
)


# Step 2: perform the calculations
# (outside the package namespace, call apde:::etl_qa_initial_results(myconfig))
initial_results <- etl_qa_initial_results(myconfig)

# Peek at the tables
head(initial_results$missing_data)
head(initial_results$vals_categorical)
head(initial_results$vals_continuous)
head(initial_results$vals_date)


## End(Not run)


PHSKC-APDE/apde documentation built on April 14, 2025, 10:46 a.m.