r4subtrace is the traceability engine in the R4SUB ecosystem. It quantifies and explains end-to-end traceability between clinical submission artifacts -- primarily ADaM outputs <-> derivations <-> SDTM sources <-> specs <-> code -- and converts trace evidence into standardized R4SUB Evidence Table rows (from r4subcore).
It focuses on answering one question:
Can we prove where each analysis variable/value came from, and can a reviewer follow it?
In real submissions, issues are rarely "a single failed rule." Many are trace failures:
- Missing or ambiguous derivation documentation
- ADaM variable not linkable to SDTM sources
- Mismatch between spec and what code produces
- Inconsistent naming across specs, define.xml, and datasets
- Reviewer cannot reproduce or validate lineage
r4subtrace formalizes traceability as evidence + measurable indicators.
pak::pak(c("R4SUB/r4subcore", "R4SUB/r4subtrace"))
library(r4subcore)
library(r4subtrace)
ctx <- r4sub_run_context(study_id = "ABC123", environment = "DEV")
adam_meta <- read.csv("adam_metadata.csv") # columns: dataset, variable, label, type
sdtm_meta <- read.csv("sdtm_metadata.csv") # same structure
map <- read.csv("trace_map.csv")
# recommended columns:
# adam_dataset, adam_var, sdtm_domain, sdtm_var, derivation_text(optional), confidence(optional)
tm <- build_trace_model(
  adam_meta = adam_meta,
  sdtm_meta = sdtm_meta,
  mapping = map
)
ev <- trace_model_to_evidence(tm, ctx = ctx, source_name = "r4subtrace", source_version = "0.1.0")
validate_evidence(ev)
evidence_summary(ev)
ind <- trace_indicator_scores(ev)
ind
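To make the orphan and ambiguous-mapping ideas concrete, here is a minimal base-R sketch that counts them by hand from a mapping table using the recommended column names. It is an illustration only, not r4subtrace's implementation: the sample data is invented, and the `DATASET.VARIABLE` key and the rule "more than one distinct SDTM source means ambiguous" are assumptions made for this example.

```r
# Hypothetical in-memory stand-ins for adam_metadata.csv and trace_map.csv.
adam_meta <- data.frame(
  dataset  = c("ADSL", "ADSL", "ADSL", "ADAE"),
  variable = c("AGE", "TRT01P", "SAFFL", "AESEV"),
  stringsAsFactors = FALSE
)

map <- data.frame(
  adam_dataset = c("ADSL", "ADSL", "ADSL", "ADAE"),
  adam_var     = c("AGE", "TRT01P", "TRT01P", "AESEV"),
  sdtm_domain  = c("DM", "DM", "EX", "AE"),
  sdtm_var     = c("AGE", "ARM", "EXTRT", "AESEV"),
  stringsAsFactors = FALSE
)

# Key each variable as DATASET.VARIABLE so names stay unique across datasets
# (an assumption of this sketch, not a documented r4subtrace convention).
adam_keys <- paste(adam_meta$dataset, adam_meta$variable, sep = ".")
map_keys  <- paste(map$adam_dataset, map$adam_var, sep = ".")

# Orphans: ADaM variables that never appear in the mapping.
orphans <- setdiff(adam_keys, map_keys)

# Ambiguous: ADaM variables mapped to more than one distinct SDTM source.
sdtm_src  <- paste(map$sdtm_domain, map$sdtm_var, sep = ".")
n_sources <- tapply(sdtm_src, map_keys, function(x) length(unique(x)))
ambiguous <- names(n_sources)[n_sources > 1]

orphans    # "ADSL.SAFFL"
ambiguous  # "ADSL.TRT01P"
```

A hand check like this can be useful for sanity-testing a mapping file before passing it to build_trace_model(); the package's own diagnostics and indicator scores remain the authoritative output.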
build_trace_model() returns a list with:
- nodes: tidy table of assets (dataset/variable/spec/program)
- edges: tidy table of relationships + confidence
- diagnostics: issues found (orphans, ambiguities, conflicts)

Evidence rows are emitted for:
- TRACE_VAR_COVERAGE_L2PLUS: proportion of ADaM variables with L2+ trace
- TRACE_VAR_COVERAGE_L3PLUS: proportion with L3+ trace
- TRACE_ORPHAN_VAR_COUNT: orphan ADaM vars with no SDTM mapping
- TRACE_AMBIGUOUS_MAPPING_COUNT: vars mapped to multiple SDTM sources
- TRACE_MEAN_TRACE_LEVEL: mean trace level across all ADaM variables

License: MIT