banff_dataset_evaluate: Evaluates the format and content of the input dataset

View source: R/04_assessment.R

banff_dataset_evaluate    R Documentation

Evaluates the format and content of the input dataset

Description

This function evaluates the format and content of a dataset against the accepted format specified in the data dictionary. It applies a series of checks to confirm that the dataset is ready to be processed by add_diagnoses(), which assigns diagnoses to each observation of the dataset. The function checks whether:

  • The input file is a dataset

  • All mandatory variables are present in the dataset

  • Missing values (NA) are present in variables where they are not allowed

  • Data types are correct

  • The combination of ID, center, and biopsy date is unique

  • There are duplicated variables in the dataset

  • Dates are valid

  • Content values follow the category values as specified in the data dictionary

  • Constraints specified in the data dictionary are respected
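A few of the checks listed above can be sketched in base R to show the kind of condition each one tests. This is only an illustration, not the package's implementation: the toy data and the column names (patient_id, center, biopsy_date) are hypothetical and are not taken from the banffIT data dictionary.

```r
# Hypothetical toy data; the column names are illustrative assumptions,
# not the variable names defined in the banffIT data dictionary.
toy <- data.frame(
  patient_id  = c("P1", "P1", "P2", "P2"),
  center      = c("A", "A", "A", "A"),
  biopsy_date = as.Date(c("2020-01-05", "2020-03-10",
                          "2020-01-05", "2020-01-05")),
  stringsAsFactors = FALSE
)

mandatory <- c("patient_id", "center", "biopsy_date")

# Mandatory variables that are absent from the dataset
missing_vars <- setdiff(mandatory, names(toy))

# Number of missing values (NA) in each mandatory variable
na_counts <- colSums(is.na(toy[mandatory]))

# Rows whose ID/center/biopsy date combination is not unique
# (every copy of a repeated combination is flagged)
key <- toy[mandatory]
dup_rows <- duplicated(key) | duplicated(key, fromLast = TRUE)

# Variable names that occur more than once
dup_vars <- names(toy)[duplicated(names(toy))]
```

Here the last two rows share the same ID, center, and biopsy date, so they are the ones a uniqueness check would report.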

Usage

banff_dataset_evaluate(banff_dataset, version = NULL)

Arguments

banff_dataset

A tibble object.

version

A character string referring to the version of the Banff classification. The most recent classification is the default. Options are "2022" (default) and "2017".

Value

A list of tibble objects giving information on the assessment of the dataset.

Examples

{

banff_dataset <- get_banff_template()
banff_dataset_evaluate(banff_dataset)

}
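As a further sketch, the version argument can be set explicitly and the returned list inspected. The names of the list elements are not documented on this page, so the example only prints them rather than assuming any. The requireNamespace() guard is an addition for illustration, so the snippet is skipped when banffIT is not installed.

```r
# Run only if banffIT is installed (guard added for illustration)
if (requireNamespace("banffIT", quietly = TRUE)) {
  library(banffIT)

  banff_dataset <- get_banff_template()

  # Evaluate against the 2017 classification instead of the default
  assessment <- banff_dataset_evaluate(banff_dataset, version = "2017")

  # The result is documented as a list of tibbles; list its elements
  print(names(assessment))
}
```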


banffIT documentation built on Aug. 8, 2025, 7:32 p.m.