Modular Approach
In datacutr: SDTM Datacut

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Introduction

This article describes how to cut study SDTM data using a modular approach to enable any further study or project specific customization.

Programming Flow

Read in Data
Create DCUT Dataset
Preprocess Datasets
Specify Cut Types
Patient Cut
Date Cut
DM Cut
Apply Cut
Output Final List of Cut Datasets

Read in Data {#readdata}

To start, all SDTM data to be cut needs to be stored in a list.

library(datacutr)
library(admiraldev)
library(dplyr)
library(lubridate)
library(stringr)
library(purrr)
library(rlang)

source_data <- list(
  ds = datacutr_ds, dm = datacutr_dm, ae = datacutr_ae, sc = datacutr_sc,
  lb = datacutr_lb, fa = datacutr_fa, ts = datacutr_ts
)

Create DCUT Dataset {#dcut}

The next step is to create the DCUT dataset containing the datacut date and description.

dcut <- create_dcut(
  dataset_ds = source_data$ds,
  ds_date_var = DSSTDTC,
  filter = DSDECOD == "RANDOMIZATION",
  cut_date = "2022-06-04",
  cut_description = "Clinical Cutoff Date"
)

dataset_vignette(
  dcut,
  display_vars = exprs(USUBJID, DCUTDTC, DCUTDTM, DCUTDESC)
)

Preprocess Datasets {#preprocess}

If any pre-processing of datasets is needed, for example in the case of FA, where there are multiple date variables, this should be done next.

source_data$fa <- source_data$fa %>%
  mutate(DCUT_TEMP_FAXDTC = case_when(
    FASTDTC != "" ~ FASTDTC,
    FADTC != "" ~ FADTC,
    TRUE ~ as.character(NA)
  ))

dataset_vignette(
  source_data$fa,
  display_vars = exprs(USUBJID, FASTDTC, FADTC, DCUT_TEMP_FAXDTC)
)

Specify Cut Types {#cuttypes}

We'll next specify the cut types for each dataset (patient cut, date cut or no cut) and in the case of date cut which date variable should be used.

patient_cut_list <- c("sc", "ds")

date_cut_list <- rbind(
  c("ae", "AESTDTC"),
  c("lb", "LBDTC"),
  c("fa", "DCUT_TEMP_FAXDTC")
)

no_cut_list <- list(ts = source_data$ts)

Patient Cut {#ptcut}

Next we'll apply the patient cut.

patient_cut_data <- lapply(
  source_data[patient_cut_list], pt_cut,
  dataset_cut = dcut
)

This adds on temporary flag variables indicating which observations will be removed, for example for SC:

dataset_vignette(
  patient_cut_data$sc,
  display_vars = exprs(USUBJID, SCORRES, DCUT_TEMP_REMOVE)
)

Date Cut {#dtcut}

Next we'll apply the date cut.

date_cut_data <- pmap(
  .l = list(
    dataset_sdtm = source_data[date_cut_list[, 1]],
    sdtm_date_var = syms(date_cut_list[, 2])
  ),
  .f = date_cut,
  dataset_cut = dcut,
  cut_var = DCUTDTM
)

This again adds on temporary flag variables indicating which observations will be removed, for example for AE:

dataset_vignette(
  date_cut_data$ae,
  display_vars = exprs(
    USUBJID,
    AETERM,
    AESTDTC,
    DCUT_TEMP_SDTM_DATE,
    DCUT_TEMP_DCUTDTM,
    DCUT_TEMP_REMOVE
  )
)

DM Cut {#dmcut}

Then lastly we'll apply the special DM cut which also updates the death related variables.

dm_cut <- special_dm_cut(
  dataset_dm = source_data$dm,
  dataset_cut = dcut,
  cut_var = DCUTDTM
)

This adds on temporary variables indicating any death records that would change as a result of applying a datacut:

dataset_vignette(
  dm_cut,
  display_vars = exprs(
    USUBJID,
    DTHFL,
    DTHDTC,
    DCUT_TEMP_REMOVE,
    DCUT_TEMP_DTHDT,
    DCUT_TEMP_DCUTDTM,
    DCUT_TEMP_DTHCHANGE
  )
)

Apply Cut {#applycut}

The last step is to create the RMD report, to summarize which patients and observations will be cut, and then apply the cut to strip out all observations flagged as to be removed.

cut_data <- purrr::map(
  c(patient_cut_data, date_cut_data, list(dm = dm_cut)),
  apply_cut,
  dcutvar = DCUT_TEMP_REMOVE,
  dthchangevar = DCUT_TEMP_DTHCHANGE
)

Output Final List of Cut Datasets {#output}

Lastly, we create the final list of all the cut SDTM data, adding in the SDTM where no cut was needed.

final_data <- c(cut_data, no_cut_list, list(dcut = dcut))

Any scripts or data that you put into this service are public.

datacutr documentation built on April 3, 2025, 10:52 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

datacutr
SDTM Datacut

Modular Approach
In datacutr: SDTM Datacut

Introduction

Programming Flow

Read in Data {#readdata}

Create DCUT Dataset {#dcut}

Preprocess Datasets {#preprocess}

Specify Cut Types {#cuttypes}

Patient Cut {#ptcut}

Date Cut {#dtcut}

DM Cut {#dmcut}

Apply Cut {#applycut}

Output Final List of Cut Datasets {#output}

Try the datacutr package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

datacutr SDTM Datacut

Modular Approach In datacutr: SDTM Datacut

Introduction

Programming Flow

Read in Data {#readdata}

Create DCUT Dataset {#dcut}

Preprocess Datasets {#preprocess}

Specify Cut Types {#cuttypes}

Patient Cut {#ptcut}

Date Cut {#dtcut}

DM Cut {#dmcut}

Apply Cut {#applycut}

Output Final List of Cut Datasets {#output}

Try the datacutr package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

datacutr
SDTM Datacut

Modular Approach
In datacutr: SDTM Datacut