REDCapDM - Queries

rm(list = ls())
library(REDCapDM) 
library(kableExtra)
library(knitr)
library(dplyr)
library(magrittr)
library(purrr)

covican_transformed <- rd_transform(covican)


This vignette summarises the simplest and most common use of REDCapDM: identifying discrepancies in REDCap data imported into R.


Queries

Queries are crucial for the accuracy and reliability of a REDCap dataset. They help identify missing values, inconsistencies, and potential errors in the collected data. The rd_query() function allows you to generate queries using a specific expression.

To identify missing values in certain variables, simply provide the relevant information to the variables and expression arguments. In this scenario, the expression would be 'is.na(x)', where 'x' represents the variable itself:

example <- rd_query(covican_transformed,
                    variables = "copd",
                    expression = "is.na(x)")

Note: For variables with branching logic, the function will automatically apply the associated branching logic or at least report it.
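
To make the role of 'x' concrete, the following base-R sketch evaluates an expression string against one column of a toy data frame. It only illustrates the idea of an expression in which 'x' stands for the variable; it is not how rd_query() is implemented. The data, the record_id column and the helper flag_rows() are all invented for this illustration.

```r
# Toy data; record IDs and values are invented for illustration
toy <- data.frame(
  record_id = c("1", "2", "3", "4"),
  copd      = c("Yes", NA, "No", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of REDCapDM): evaluate an expression
# string with `x` bound to the chosen column and keep the matching rows
flag_rows <- function(data, variable, expression) {
  hits <- eval(parse(text = expression), envir = list(x = data[[variable]]))
  data[which(hits), c("record_id", variable), drop = FALSE]
}

missing_copd <- flag_rows(toy, "copd", "is.na(x)")
print(missing_copd)
```

Here records "2" and "4" are flagged, because they are the ones with a missing copd value.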


Alternatively, you can identify outliers or observations that meet a certain condition (for example, values within a given range):

example <- rd_query(covican_transformed,
                    variables = c("age", "potassium"),
                    expression = c("x > 80", "x > 4.2 & x < 4.3"),
                    event = "baseline_visit_arm_1")
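
One detail worth keeping in mind with range conditions is how R treats missing values: a comparison such as NA > 80 yields NA rather than TRUE, so a range condition on its own does not flag missing values. The base-R sketch below shows this on invented data. It assumes that the i-th expression is checked against the i-th variable, and it is an illustration only, not rd_query()'s actual code.

```r
# Toy data; values are invented for illustration
toy <- data.frame(
  record_id = c("1", "2", "3", "4"),
  age       = c(85, 42, NA, 81),
  potassium = c(4.25, 4.00, 4.21, NA)
)

variables   <- c("age", "potassium")
expressions <- c("x > 80", "x > 4.2 & x < 4.3")

# Assumption: the i-th expression is applied to the i-th variable
flagged <- Map(function(variable, expression) {
  hits <- eval(parse(text = expression), envir = list(x = toy[[variable]]))
  toy$record_id[which(hits)]  # which() ignores comparisons that returned NA
}, variables, expressions)

print(flagged)
```

Record "3" (missing age) and record "4" (missing potassium) are not flagged by these conditions; finding them requires a separate 'is.na(x)' check.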


In both cases, the function returns a list containing a data frame designed to help you locate each query in the REDCap project:

example$queries
kable(head(example$queries, 2)) %>% 
  kableExtra::row_spec(0, bold = TRUE) %>% 
  kableExtra::kable_styling()

And a summary of the generated queries per specified variable for each applied expression:

example$results


For longitudinal projects, the rd_event() function allows you to check whether a particular event is missing from a record in the exported data. This happens when no data have been collected for a record in that event, because REDCap does not export the corresponding row. To identify these cases, you can use the following code:

example <- rd_event(covican_transformed,
                    event = "follow_up_visit_da_arm_1")
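
The idea behind this check can be sketched in base R: list every record, list the records that have at least one row for the event, and take the difference. The toy data below are invented. The column name redcap_event_name follows REDCap's usual longitudinal export, while record_id is assumed here and may differ in your project. This is a conceptual illustration, not the implementation of rd_event().

```r
# Toy longitudinal export; record "2" has no follow-up row at all
toy <- data.frame(
  record_id = c("1", "1", "2", "3", "3"),
  redcap_event_name = c("baseline_visit_arm_1", "follow_up_visit_da_arm_1",
                        "baseline_visit_arm_1",
                        "baseline_visit_arm_1", "follow_up_visit_da_arm_1"),
  stringsAsFactors = FALSE
)

event <- "follow_up_visit_da_arm_1"

# Records that have at least one exported row for the event
records_with_event <- unique(toy$record_id[toy$redcap_event_name == event])

# Records for which the event never appears in the export
missing_event <- setdiff(unique(toy$record_id), records_with_event)
print(missing_event)
```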



Control

After identifying queries, it is common practice to correct the original dataset in REDCap and re-run the query process to obtain a new query dataset.

The check_queries() function allows you to compare the previous query dataset with the new one:

example <- rd_query(covican_transformed,
                    variables = c("copd", "age"),
                    expression = c("is.na(x)", "is.na(x)"),
                    event = "baseline_visit_arm_1")
# Simulate a second query report based on the first one
new_example <- example
new_example$queries <- as.data.frame(new_example$queries)
# Keep only some of the previously created queries
new_example$queries <- new_example$queries[c(1:5, 10:11), ]
# Add two new queries by hand
new_example$queries[nrow(new_example$queries) + 1, ] <- c("100-79", "Hospital 11", "Baseline visit", "Comorbidities", "copd", "-", "Chronic obstructive pulmonary disease", "The value is NA and it should not be missing", "100-79-4")
new_example$queries[nrow(new_example$queries) + 1, ] <- c("105-56", "Hospital 5", "Baseline visit", "Demographics", "age", "-", "Age", "The value is 80 and it should not be >70", "105-56-2")
check <- check_queries(old = example$queries, 
                       new = new_example$queries)

The output, in addition to the query data frame, now includes a summary with the number of new, miscorrected, solved and pending queries:

# Print results
check$results

Note: The "Miscorrected" category includes queries that belong to the same combination of record identifier and variable in both the old and new reports, but with a different reason. For instance, if a variable had a missing value in the old report, but in the new report shows a value outside the established range, it would be classified as "Miscorrected".
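
The four categories can be illustrated with a small base-R sketch that matches two toy reports on record and variable and then compares the reasons. The "Miscorrected" rule follows the note above. The other three rules are an assumed reading of the category names: "New" appears only in the new report, "Solved" only in the old one, and "Pending" in both with the same reason. The column names record, field and reason are hypothetical and differ from the real query data frame. This is not the code of check_queries().

```r
# Toy old and new query reports; all values are invented
old <- data.frame(
  record = c("1", "2", "3"),
  field  = c("age", "age", "copd"),
  reason = c("missing", "missing", "missing"),
  stringsAsFactors = FALSE
)
new <- data.frame(
  record = c("1", "2", "4"),
  field  = c("age", "age", "copd"),
  reason = c("missing", "out of range", "missing"),
  stringsAsFactors = FALSE
)

# Match queries on record + variable, keeping unmatched rows from both sides
both <- merge(old, new, by = c("record", "field"), all = TRUE,
              suffixes = c("_old", "_new"))

# Assumed classification rules (see the lead-in above)
both$status <- ifelse(is.na(both$reason_old), "New",
               ifelse(is.na(both$reason_new), "Solved",
               ifelse(both$reason_old == both$reason_new, "Pending",
                      "Miscorrected")))

print(both[, c("record", "field", "status")])
```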



Export

With the help of the rd_export() function, you can export the identified queries to a .xlsx file of your choice:

example <- rd_query(covican_transformed,
                    variables = c("copd", "age"),
                    expression = c("is.na(x)", "is.na(x)"),
                    event = "baseline_visit_arm_1")
rd_export(example)

This is the simplest way to use the function. It creates a file named "example.xlsx" in your current working directory, but you can customise the exported file:

rd_export(queries = example$queries,
          column = "Link",
          sheet_name = "Queries - Proyecto",
          path = "C:/User/Desktop/queries.xlsx",
          password = "123") 

In both cases, a message will be generated in the console informing you that the file has been created and where it is located.



For more information, consult the complete vignette available at: https://bruigtp.github.io/REDCapDM/articles/REDCapDM.html






REDCapDM documentation built on June 22, 2024, 12:02 p.m.