validate_data: Validate Input Data for SQI Analysis

validate_data    R Documentation

Validate Input Data for SQI Analysis

Description

Checks that a data frame meets requirements for Soil Quality Index (SQI) computation: correct column types, sufficient sample sizes, absence of infinite values, and appropriate variable configuration.
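The checks listed above can be approximated in base R. The following is only a minimal sketch under stated assumptions, not the package's implementation: the data frame df, its column names, and every helper object are made up for illustration.

```r
# Hypothetical stand-in data; column names and values are illustrative only.
df <- data.frame(
  LandUse = c("Forest", "Forest", "Forest", "Crop", "Crop"),
  pH      = c(6.1, 6.4, 6.0, Inf, 5.8),
  SOC     = c(2.3, 2.1, 2.2, 1.2, 1.0),
  stringsAsFactors = FALSE
)
group_cols <- "LandUse"
var_cols   <- setdiff(names(df), group_cols)
min_n      <- 3

# Column types: grouping columns are character or factor, the rest numeric.
groups_ok <- vapply(
  df[group_cols],
  function(x) is.character(x) || is.factor(x),
  logical(1)
)
vars_numeric <- vapply(df[var_cols], is.numeric, logical(1))

# Count infinite values in each soil variable.
n_infinite <- vapply(df[var_cols], function(x) sum(is.infinite(x)), integer(1))

# Group sizes, and the groups that fall below min_n.
n_per_group <- aggregate(
  data.frame(n = rep(1L, nrow(df))),
  by = df[group_cols],
  FUN = sum
)
small_groups <- n_per_group[n_per_group$n < min_n, , drop = FALSE]
```

In this toy data the pH column carries one infinite value and the Crop group has fewer than min_n rows, so a validator following the description would be expected to flag both.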

Usage

validate_data(
  data,
  group_cols = NULL,
  config = NULL,
  min_n = 3,
  verbose = TRUE
)

Arguments

data

A data frame. The first column(s) should be grouping factors (character or factor); remaining columns should be numeric soil variables.

group_cols

Character vector. Names of grouping columns (e.g., c("LandUse", "Depth")). If NULL (the default), the first column of data is used.

config

A data frame produced by make_config or manually created, with columns variable, type, opt_low, opt_high, min_val, max_val. If NULL, only basic data checks are performed.

min_n

Integer. Minimum number of observations per group. Default is 3.

verbose

Logical. If TRUE (default), prints a validation summary to the console.
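For a manually created config, the only structure documented here is the set of column names. The sketch below builds a data frame with those six columns and confirms none is missing. The variable names and numeric bounds are invented for illustration, and the type column is left as NA because the labels it accepts are not documented on this page (see make_config).

```r
# Column names taken from the config argument description.
required_cols <- c("variable", "type", "opt_low", "opt_high", "min_val", "max_val")

# Hypothetical hand-built config; every value below is a placeholder.
config <- data.frame(
  variable = c("pH", "SOC"),
  type     = NA_character_,  # fill with the type labels make_config uses
  opt_low  = c(6.0, NA),
  opt_high = c(7.0, NA),
  min_val  = c(3.5, 0.0),
  max_val  = c(9.5, 5.0),
  stringsAsFactors = FALSE
)

# Report any documented column that is absent from the config.
missing_cols <- setdiff(required_cols, names(config))
if (length(missing_cols) > 0) {
  stop("config is missing columns: ", paste(missing_cols, collapse = ", "))
}
```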

Value

Invisibly returns a list with components:

valid

Logical. TRUE if all checks pass.

messages

Character vector of warning/info messages.

n_per_group

Data frame of group sizes.
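Because the list is returned invisibly, it has to be assigned before it can be inspected. A minimal sketch of consuming it follows, using a hand-built list that mirrors the three documented components. The message text and the column layout of n_per_group are assumptions, not actual package output.

```r
# Mock of the documented return structure; contents are illustrative only.
result <- list(
  valid       = FALSE,
  messages    = c("Group 'Crop' has fewer than 3 observations."),
  n_per_group = data.frame(LandUse = c("Crop", "Forest"), n = c(2L, 3L))
)

# Build a short report: either a pass note or the collected messages.
report <- if (isTRUE(result$valid)) {
  "All checks passed."
} else {
  paste(result$messages, collapse = "\n")
}
message(report)
```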

References

Andrews, S.S., Karlen, D.L., & Cambardella, C.A. (2004). The soil management assessment framework: A quantitative soil quality evaluation method. Soil Science Society of America Journal, 68(6), 1945–1962. doi:10.2136/sssaj2004.1945

Examples

data(soil_data)
result <- validate_data(soil_data, group_cols = c("LandUse", "Depth"))
result$valid
result$n_per_group


SQIpro documentation built on April 20, 2026, 5:06 p.m.