    knitr::opts_chunk$set(collapse = TRUE, comment = "#>", eval = FALSE)

rtransparency is a pattern-based detector. It is designed for high precision on
the statements it targets, and its predictions come with the exact text that
triggered them so they can be audited. This vignette describes what each
indicator does and does not capture, so results are interpreted correctly.

| Indicator | Detects | Does not mean |
|---|---|---|
| Conflicts of interest | A COI disclosure is present (including "the authors declare no competing interests") | That a conflict exists |
| Funding | A statement that funding was received | Presence of a funding section (a "no funding" section is read as absence) |
| Registration | A protocol/trial registration identifier or statement | Ethics/IRB approval numbers |
| Novelty | The article claims its own work is novel or first | That the work is objectively novel |
| Replication | A replication or external/independent validation was performed | An internal train/test split, or future/recommended validation |
| Data sharing | The authors' own data are made available (repository, accession, or in-article) | Data merely reused, cited, or available "upon request" |
| Code sharing | The authors' own analysis code is shared | Use of third-party software/tools |
| AI disclosure | A statement discloses generative-AI use in manuscript preparation (including "no AI was used") | Use of AI as a research method |

Conflicts of interest and AI disclosure are disclosure-based: a statement addressing the topic counts as present, whether the disclosure is positive or negative. This mirrors how these are reported and counted in the literature.
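
To make the disclosure-based reading concrete, here is a toy sketch of a pattern-based check that returns both a prediction and the text that triggered it. The function name, the `coi_text` column, and the regular expression are hypothetical and chosen only for illustration; they are not the patterns rtransparency uses.

```r
# Toy illustration only: NOT the patterns or column names used by rtransparency.
detect_coi_statement <- function(text) {
  # Split into sentences on terminal punctuation followed by whitespace.
  sentences <- unlist(strsplit(text, "(?<=[.!?])\\s+", perl = TRUE))
  # A statement addressing the topic counts, whether positive or negative.
  pattern <- "conflicts? of interest|competing interests?"
  hits <- sentences[grepl(pattern, sentences, ignore.case = TRUE)]
  data.frame(
    is_coi_pred = length(hits) > 0,
    coi_text = if (length(hits) > 0) paste(hits, collapse = " ; ") else NA_character_,
    stringsAsFactors = FALSE
  )
}

# A negative disclosure still counts as a disclosure being present.
negative <- detect_coi_statement(
  "We thank the reviewers. The authors declare no competing interests."
)
# No statement on the topic at all.
none <- detect_coi_statement("We measured blood pressure in 40 adults.")
```

In this sketch `negative$is_coi_pred` is `TRUE` even though no conflict is declared, which is the distinction the table draws between "detects" and "does not mean".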

The .rt_ai() function under inst/benchmark/ is a special case: with no
publication date and no section structure available, it applies no 2023 year
gate (it never returns NA) and scans the whole document. The caller must
therefore restrict it to 2023-or-later articles and tolerate a higher
false-positive rate on AI-method papers than with rt_ai_pmc().

rt_summary() can correct apparent prevalence using bundled
sensitivity/specificity estimates (rt_accuracy). These derive from the
validation benchmarks; supply your own via rt_summary(accuracy = ) when you
have study-specific estimates. AI disclosure is reported uncorrected (its
prevalence is too low in unselected literature for a stable estimate).

Every per-article detector returns the prediction columns is_coi_pred,
is_fund_pred, is_register_pred, is_novelty_pred, is_replication_pred,
is_open_data, is_open_code, and the year-gated is_ai_pred (NA before
2023), each paired with the extracted text. rt_all_pmc() returns all eight for
one file; rt_all_pmc_dir() runs a whole directory.

    library(rtransparency)

    res <- rt_all_pmc("article.xml", remove_ns = TRUE)
    res[, c("is_coi_pred", "is_fund_pred", "is_open_data", "is_open_code")]

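
For intuition about correcting apparent prevalence with sensitivity and specificity, the sketch below applies the Rogan-Gladen estimator to a vector of predictions. This is an assumption made for illustration: the text does not state which formula rt_summary() uses, and the function name and the sensitivity/specificity values here are hypothetical.

```r
# Rogan-Gladen correction, a standard estimator assumed here for illustration;
# it is not confirmed to be what rt_summary() implements.
correct_prevalence <- function(apparent, sensitivity, specificity) {
  denom <- sensitivity + specificity - 1
  if (denom <= 0) {
    stop("sensitivity + specificity must exceed 1 for the correction to apply")
  }
  corrected <- (apparent + specificity - 1) / denom
  # Sampling noise can push the estimate outside [0, 1]; clamp it.
  min(max(corrected, 0), 1)
}

# Hypothetical predictions for ten articles (NA marks a gated article).
pred <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, TRUE, NA, FALSE, TRUE)
apparent <- mean(pred, na.rm = TRUE)

# Hypothetical accuracy values, not the bundled rt_accuracy estimates.
corrected <- correct_prevalence(apparent, sensitivity = 0.90, specificity = 0.95)
```

When the apparent prevalence is close to the false-positive rate (1 minus specificity), the corrected estimate becomes unstable, which is consistent with leaving a very low-prevalence indicator uncorrected.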
The data- and code-availability links the detector extracts
(open_data_links, open_code_links) can be passed to FAIR-assessment tooling
such as rfair, a native R implementation of
FAIR data and software assessment, to score the findability and accessibility of
the shared resources.

    res <- rt_all_pmc("article.xml", remove_ns = TRUE)
    links <- strsplit(res$open_data_links, " ; ")[[1]]
    # rfair::assess_fair(links)

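
When results cover many articles, some rows may have no links. The sketch below splits a link column row by row and drops missing or empty entries before anything is passed on. It assumes the " ; " separator shown above; the helper name `split_links()` and the example URLs are hypothetical placeholders.

```r
# Assumption: links within one article are joined with " ; " as shown above.
split_links <- function(x, sep = " ; ") {
  parts <- strsplit(x, sep, fixed = TRUE)
  lapply(parts, function(p) {
    p <- trimws(p)
    # Drop NA (no links extracted) and empty strings.
    p[!is.na(p) & nzchar(p)]
  })
}

# Placeholder values standing in for an open_data_links column.
links_by_article <- split_links(c(
  "https://example.org/dataset-a ; https://example.org/dataset-b",
  NA,
  ""
))

# Number of usable links per article.
n_links <- lengths(links_by_article)
```

Each list element can then be handed to a downstream assessment tool one article at a time, skipping articles where `n_links` is zero.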