check_data_tables | R Documentation |
Read a set of files containing data tables and check them against a data model.
read_data_tables(files, table_names = names(files), quiet = TRUE)
check_table_names(tables, model)
check_column_names(tables, model)
check_column_types(tables, model)
check_column_min_max(tables, model)
check_missing_values(tables, model)
check_unique(tables, model)
check_bucket_paths(tables, model)
check_valid_entity_id(tables, model, report_missing_id = FALSE)
check_primary_keys(tables, model)
check_foreign_keys(tables, model)
parse_column_name_check(chk)
parse_column_type_check(chk)
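files: Vector of file paths, one per data table.
table_names: Vector of table names associated with files.
quiet: Logical to control printing results of column parsing.
tables: Named list of data tables.
model: Data model to check the tables against, such as the object returned by json_to_dm in the example.
report_missing_id: A logical indicating whether the absence of an entity id is regarded as an error.
chk: Output of check_column_names (for parse_column_name_check) or check_column_types (for parse_column_type_check).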
read_data_tables returns a named list of data frames.
check_table_names returns NULL if tables matches model, or a list:
missing_tables: Vector of tables in model but not in tables
extra_tables: Vector of tables in tables but not in model
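The comparison described above can be illustrated in base R. This is a minimal sketch, not the package's implementation: compare_table_names is a hypothetical helper, and it works on plain character vectors of names rather than on a data model object.

```r
# Hypothetical helper (not part of AnvilDataModels): compare the table
# names found in the data with the table names defined by a model.
compare_table_names <- function(table_names, model_names) {
  missing_tables <- setdiff(model_names, table_names)
  extra_tables <- setdiff(table_names, model_names)
  # Mirror the documented convention: NULL when everything matches
  if (length(missing_tables) == 0 && length(extra_tables) == 0) {
    return(NULL)
  }
  list(missing_tables = missing_tables, extra_tables = extra_tables)
}

res <- compare_table_names(
  table_names = c("subject", "sample", "notes"),
  model_names = c("subject", "sample", "file")
)
res$missing_tables  # "file"
res$extra_tables    # "notes"
```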
check_column_names returns a list of all tables in common between tables and model. Each table element is NULL if columns in tables match model, or a list:
missing_required_columns: Vector of required columns in model but not in tables
missing_optional_columns: Vector of optional columns in model but not in tables
extra_columns: Vector of columns in tables but not in model
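As a rough illustration of the three result elements, the sketch below compares the columns of one data frame with sets of required and optional column names. compare_column_names is a hypothetical helper, and the column names (consent_code, age, nickname) are invented for the example; the package reads this information from the model instead.

```r
# Hypothetical helper (not part of AnvilDataModels): compare the columns of
# one table with the required and optional columns a model might define.
compare_column_names <- function(table, required, optional = character()) {
  cols <- names(table)
  result <- list(
    missing_required_columns = setdiff(required, cols),
    missing_optional_columns = setdiff(optional, cols),
    extra_columns = setdiff(cols, c(required, optional))
  )
  # NULL when no element reports a difference
  if (all(lengths(result) == 0)) NULL else result
}

subject <- data.frame(subject_id = c("s1", "s2"), nickname = c("a", "b"))
col_check <- compare_column_names(
  subject,
  required = c("subject_id", "consent_code"),
  optional = "age"
)
col_check$missing_required_columns  # "consent_code"
col_check$missing_optional_columns  # "age"
col_check$extra_columns             # "nickname"
```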
check_column_types returns a list of all tables in common between tables and model. Each table element is a list of all columns in common between table and model. Each column element is NULL if values in column are a compatible type with the data model, or a string describing the mismatch.
check_column_min_max returns a list of all tables in common between tables and model. Each table element is a list of all columns in common between table and model that have min and/or max values. Each column element is NULL if values in column are between min and max, or a string describing the mismatch.
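A range check of this kind might look like the following sketch. check_range and its arguments are hypothetical, and the wording of the returned string is an assumption rather than the text the package produces.

```r
# Hypothetical helper (not part of AnvilDataModels): check that the
# non-missing values of a numeric column fall within optional bounds.
check_range <- function(values, min_value = NULL, max_value = NULL) {
  values <- values[!is.na(values)]
  msgs <- character()
  if (!is.null(min_value) && any(values < min_value)) {
    msgs <- c(msgs, paste("values below minimum", min_value))
  }
  if (!is.null(max_value) && any(values > max_value)) {
    msgs <- c(msgs, paste("values above maximum", max_value))
  }
  # NULL when all values are within bounds
  if (length(msgs) == 0) NULL else paste(msgs, collapse = "; ")
}

check_range(c(5, 42, 99), min_value = 0, max_value = 120)   # NULL
check_range(c(-1, 42, 130), min_value = 0, max_value = 120)
# "values below minimum 0; values above maximum 120"
```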
check_missing_values returns a list of all tables in common between tables and model. Each table element is a list of all required columns in common between table and model. Each column element is NULL if the column has no missing values, or the number of missing values in the column. If a condition is set on a column, missing values are only checked for rows where the condition is met.
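The effect of a condition can be sketched in base R as below. count_missing is a hypothetical helper, and representing the condition as a logical vector is an assumption made for illustration; how conditions are actually encoded is determined by the data model.

```r
# Hypothetical helper (not part of AnvilDataModels): count missing values,
# optionally restricted to rows where a condition holds.
count_missing <- function(values, condition = rep(TRUE, length(values))) {
  n_missing <- sum(is.na(values) & condition)
  # NULL when nothing is missing in the rows that are checked
  if (n_missing == 0) NULL else n_missing
}

age_at_visit <- c(34, NA, 51, NA)
had_visit <- c(TRUE, TRUE, FALSE, FALSE)

count_missing(age_at_visit)             # 2: every row is checked
count_missing(age_at_visit, had_visit)  # 1: only rows where had_visit is TRUE
```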
check_unique returns a list of all tables in common between tables and model. Each table element is a list of all columns in common between table and model also defined as unique by the model. Each column element is NULL if the column is unique, or a string listing duplicated elements.
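For a single column, a uniqueness check along these lines can be written with base R's duplicated(). find_duplicates is a hypothetical helper, and the format of the returned string is an assumption.

```r
# Hypothetical helper (not part of AnvilDataModels): report values that
# occur more than once in a column.
find_duplicates <- function(values) {
  dups <- unique(values[duplicated(values)])
  if (length(dups) == 0) {
    return(NULL)
  }
  paste("Duplicated values:", paste(dups, collapse = ", "))
}

find_duplicates(c("s1", "s2", "s3"))              # NULL
find_duplicates(c("s1", "s2", "s2", "s3", "s1"))  # "Duplicated values: s2, s1"
```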
check_bucket_paths returns a list of all tables in common between tables and model. Each table element is a list of all columns in common between table and model also defined as containing bucket paths by the model. Each column element is NULL if all paths exist, or a string listing paths that do not exist.
check_valid_entity_id returns a list of all tables in common between tables and model. Each table element is NULL if the table has a valid AnVIL entity_id, or a string describing the error.
check_primary_keys returns a list with two elements:
found_keys: results of dm_examine_constraints after applying primary keys from model to tables
missing_keys: list of missing primary keys in each table
check_foreign_keys returns a list with two elements:
found_keys: results of dm_examine_constraints after applying foreign keys from model to tables
missing_keys: list of missing child or parent keys in each table
parse_column_name_check and parse_column_type_check each return a tibble with check results suitable for printing.
# read data model
json <- system.file("extdata", "data_model.json", package="AnvilDataModels")
model <- json_to_dm(json)
# read tables to check
table_names <- c("subject", "phenotype", "sample", "sample_set", "file")
files <- system.file("extdata", paste0(table_names, ".tsv"), package="AnvilDataModels")
names(files) <- table_names
tables <- read_data_tables(files)
check_table_names(tables, model)
check_column_names(tables, model)
check_column_types(tables, model)
check_primary_keys(tables, model)
check_foreign_keys(tables, model)