View source: R/check_raising_NEW.r

check_raising    R Documentation

Comprehensive consistency checks between MEDITS TB and TC files

Description

check_raising() performs a suite of integrity checks on a pair of MEDITS survey tables – TB (catch-at-haul) and TC (biological subsamples) – for one survey year. All inconsistencies are written to a plain-text logfile; the routine never stops at the first error, so you always get the full list of issues detected.

Usage

check_raising(ResultDataTB, ResultDataTC, year, wd, suffix = NULL)

Arguments

ResultDataTB

A data.frame containing the MEDITS TB table for one or more years.

ResultDataTC

A data.frame containing the MEDITS TC table for one or more years.

year

Single integer. The survey year to be checked.

wd

Character string. A writable directory where sub-folders ‘Logfiles/’ (and ‘Graphs/’, currently unused) will be created.

suffix

Optional character string appended to the logfile name. When NULL (default) a timestamp-based suffix is generated automatically.

Details

The function executes five independent validations:

  1. Weight consistency – When more than one subsample exists for a given haul/species in TC, the sum of WEIGHT_OF_THE_FRACTION must equal TOTAL_WEIGHT_IN_THE_HAUL recorded in TB.

  2. Raising factor – For each subsample the ratio molt = WEIGHT_OF_THE_FRACTION / WEIGHT_OF_THE_SAMPLE_MEASURED must be ≥ 1.

  3. Sex-specific numbers – For every combination of haul × species × sex, the raised total of individuals in TC must match the corresponding TB column (NB_OF_FEMALES, NB_OF_MALES or NB_OF_UNDETERMINED).

  4. Total individuals (TC -> TB) – For every haul/species the sum of raised numbers across all sexes in TC must equal TOTAL_NUMBER_IN_THE_HAUL in TB. This catches cases where an entire sex is missing from TC.

  5. Internal TB consistency – Within TB the sum of the three sex-specific columns must equal TOTAL_NUMBER_IN_THE_HAUL for each haul/species.
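The arithmetic behind checks 2 to 5 can be illustrated on toy data. This is a minimal base-R sketch, not the package's implementation (which uses dplyr pipelines): the data values are invented, the tables carry only the columns needed here, and it assumes that a raised number is the count in the length class multiplied by molt, which the variable names suggest but this page does not state explicitly.

```r
# Toy TC-like table: one haul, one species, three length-class rows
# (hypothetical values; real TC tables have many more columns).
tc <- data.frame(
  HAUL_NUMBER = c(1, 1, 1),
  SPECIES = c("MER", "MER", "MER"),
  SEX = c("F", "F", "M"),
  WEIGHT_OF_THE_FRACTION = c(1000, 1000, 1000),
  WEIGHT_OF_THE_SAMPLE_MEASURED = c(500, 500, 500),
  NUMBER_OF_INDIVIDUALS_IN_THE_LENGTH_CLASS_AND_MATURITY_STAGE = c(3, 2, 4),
  stringsAsFactors = FALSE
)

# Matching toy TB-like row for the same haul/species.
tb <- data.frame(
  HAUL_NUMBER = 1, SPECIES = "MER",
  NB_OF_FEMALES = 10, NB_OF_MALES = 8, NB_OF_UNDETERMINED = 0,
  TOTAL_NUMBER_IN_THE_HAUL = 18,
  stringsAsFactors = FALSE
)

# Check 2: the raising factor must be >= 1 for every subsample.
tc$molt <- tc$WEIGHT_OF_THE_FRACTION / tc$WEIGHT_OF_THE_SAMPLE_MEASURED
raising_ok <- all(tc$molt >= 1)

# Assumption: raised number = measured count * raising factor.
tc$raised <- tc$NUMBER_OF_INDIVIDUALS_IN_THE_LENGTH_CLASS_AND_MATURITY_STAGE * tc$molt

# Check 3: raised totals per haul/species/sex against the TB sex columns.
raised_sex <- aggregate(raised ~ HAUL_NUMBER + SPECIES + SEX, data = tc, FUN = sum)
expected <- c(F = tb$NB_OF_FEMALES, M = tb$NB_OF_MALES, N = tb$NB_OF_UNDETERMINED)
sex_ok <- isTRUE(all.equal(
  raised_sex$raised,
  unname(expected[as.character(raised_sex$SEX)])
))

# Check 4: raised total over all sexes against TOTAL_NUMBER_IN_THE_HAUL.
total_ok <- isTRUE(all.equal(sum(tc$raised), tb$TOTAL_NUMBER_IN_THE_HAUL))

# Check 5: the three TB sex columns must add up to the TB total.
tb_internal_ok <- with(
  tb,
  NB_OF_FEMALES + NB_OF_MALES + NB_OF_UNDETERMINED == TOTAL_NUMBER_IN_THE_HAUL
)
```

In this toy case all four flags are TRUE. Lowering NB_OF_MALES in tb would make both sex_ok and tb_internal_ok fail, which shows why checks 3 and 5 are reported separately.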

If any of these checks fails, an explanatory line is appended to the logfile ‘Logfiles/Logfile_.dat’. The function finally returns a single logical value: TRUE when no errors are detected, FALSE otherwise.

Tidy evaluation is used inside the dplyr pipelines; the following symbols are declared via utils::globalVariables() to avoid "no visible binding" notes during R CMD check: COUNTRY, YEAR, HAUL_NUMBER, GENUS, SPECIES, WEIGHT_OF_THE_FRACTION, WEIGHT_OF_THE_SAMPLE_MEASURED, NUMBER_OF_INDIVIDUALS_IN_THE_LENGTH_CLASS_AND_MATURITY_STAGE, SEX, NB_OF_FEMALES, NB_OF_MALES, NB_OF_UNDETERMINED, TOTAL_WEIGHT_IN_THE_HAUL, TOTAL_NUMBER_IN_THE_HAUL, codedsex, N, molt, raised, RaisedSex, RaisedTotal, n_subsamples, total_fraction, SumSexTB.
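For readers unfamiliar with this mechanism, a declaration of that kind usually looks like the sketch below. It lists only a few of the symbols named above, and where exactly the package places the call is not stated here, so treat the layout as an assumption.

```r
# Typically placed at top level in one of the package's R/ source files.
# utils::globalVariables() tells R CMD check that these bare names are
# column names used inside dplyr verbs, not undefined global variables.
if (getRversion() >= "2.15.1") {
  utils::globalVariables(c(
    "COUNTRY", "YEAR", "HAUL_NUMBER", "GENUS", "SPECIES",
    "molt", "raised", "RaisedSex", "RaisedTotal"
  ))
}
```

The declaration only silences the static check; it does not create any objects, so a misspelled column name inside a pipeline would still fail at run time.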

Value

Logical scalar: TRUE if the dataset passes all checks, FALSE otherwise.

Author(s)

W. Zupa, I. Bitetto, M. T. Spedicato

See Also

dplyr for the data-manipulation verbs used under the hood.

Examples



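# The following datasets come from the 'RoME' package demo data
DataTB <- RoME::TB
DataTC <- RoME::TC

# Run all checks for one survey year; the logfile is written under
# file.path(tempdir(), "Logfiles")
res <- check_raising(DataTB, DataTC, year = 2015, wd = tempdir())

# res is a single logical: TRUE when no inconsistencies were detected
if (res) {
  message("All checks passed!")
} else {
  message("Inconsistencies found; see ", file.path(tempdir(), "Logfiles"))
}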


RoME documentation built on April 24, 2026, 1:07 a.m.