dbFindIdsUniqueTrials: Get identifiers of deduplicated trial records


dbFindIdsUniqueTrials    R Documentation

Get identifiers of deduplicated trial records

Description

Records for a clinical trial can be loaded from more than one register into a collection. This function returns deduplicated identifiers for all trials in the collection, respecting the register(s) preferred by the user. Each register also records the identifiers that a trial has in other registers; this function uses these cross-references to build a vector of identifiers of deduplicated trials.

Usage

dbFindIdsUniqueTrials(
  preferregister = c("CTGOV2", "EUCTR", "CTGOV", "ISRCTN", "CTIS"),
  prefermemberstate = "BE",
  include3rdcountrytrials = TRUE,
  con,
  verbose = FALSE
)

Arguments

preferregister

A vector of registers, in order of preference, from which to generate unique _id's; default c("CTGOV2", "EUCTR", "CTGOV", "ISRCTN", "CTIS").

prefermemberstate

Code of a single EU Member State for which records should be returned. If no record is available for this Member State, a record for BE or, lacking this, a random Member State's record for the trial will be returned. For a list of codes of EU Member States, please see vector countriesEUCTR. Specifying "3RD" will return the Third Country record of trials, where available.

include3rdcountrytrials

A logical value indicating whether to retain trials that are conducted exclusively in third countries, that is, outside the European Union. Ignored if prefermemberstate is set to "3RD".

con

A database connection object, created with nodbi. See section '1 - Database connection' in ctrdata.

verbose

If TRUE, prints the fields of registers used to find corresponding trial records.
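
A minimal sketch of how the arguments above might be combined. It assumes the demo database shipped with ctrdata (as used in the Examples) and that "DE" is one of the codes in countriesEUCTR; the names dbc and ids_de are illustrative only.

```r
library(ctrdata)

# Read-only connection to the demo database shipped with ctrdata
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

# Prefer EUCTR records over those of other registers, prefer the "DE"
# record where several Member State records exist (assumed valid code),
# drop trials conducted only outside the EU, and print the fields
# used for finding corresponding records
ids_de <- dbFindIdsUniqueTrials(
  preferregister = c("EUCTR", "CTGOV2", "CTGOV", "ISRCTN", "CTIS"),
  prefermemberstate = "DE",
  include3rdcountrytrials = FALSE,
  con = dbc,
  verbose = TRUE)

# Number of retained records per register
table(names(ids_de))
```

Which records are retained depends on the contents of the collection, so no output is shown here.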

Details

Note that the content of records may differ between registers (and, for "EUCTR", between records for different Member States). Such differences are not considered by this function.

Note that the column ".isUniqueTrial", which is based on this function, can be calculated when creating a data frame with dbGetFieldsIntoDf (using calculate = "f.isUniqueTrial"); this is often the preferred approach.

Value

A named vector with strings of keys (field "_id") of records in the collection that represent unique trials, where names correspond to the register of the record.
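
A minimal sketch of working with a return value of this shape, using base R only. The identifiers below are invented placeholders, not records from any register, and the names ids, euctr_ids and keys are illustrative.

```r
# Placeholder vector shaped like the return value: values are "_id" keys,
# names are the registers of the records (identifiers invented for illustration)
ids <- c(
  CTGOV2 = "NCT00000001",
  CTGOV2 = "NCT00000002",
  EUCTR  = "2000-000001-11-BE",
  ISRCTN = "00000001"
)

# Number of unique trials represented by a record from each register
table(names(ids))

# Keys of the records that come from one register
euctr_ids <- ids[names(ids) == "EUCTR"]

# Plain character vector of keys, without register names
keys <- unname(ids)
```

Since the names carry the register, subsetting by names(ids) selects records from one register without parsing the keys themselves.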

Examples


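library(ctrdata)

# connect read-only to the demo database shipped with ctrdata
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

# identifiers of the first ten deduplicated trials
dbFindIdsUniqueTrials(con = dbc)[1:10]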

# alternative as of ctrdata version 1.21.0,
# using defaults of dbFindIdsUniqueTrials()
df <- dbGetFieldsIntoDf(
  fields = "keyword",
  calculate = "f.isUniqueTrial",
  con = dbc)

# using base R
df[df[[".isUniqueTrial"]], ]

## Not run: 
library(dplyr)
df %>% filter(.isUniqueTrial)

## End(Not run)


ctrdata documentation built on April 3, 2025, 8:12 p.m.