make_exclusions: Perform Sequential Exclusions

View source: R/flowex.R

make_exclusions R Documentation

Description

This function applies sequential, user-defined filter steps to the input data set. It returns a tibble that holds the data at each step and can be passed directly to exclusion_flowchart to plot a flowchart of exclusions.

Usage

make_exclusions(criteria, data)

Arguments

criteria

Tibble with filtering criteria. Must contain three variables:

  • left: String describing the data before the filter is applied.

  • right: String describing the observations that the filter excludes (for example, "Missing ECOG status").

  • filter: Filtering expression quoted using expr. The filter in the last row is not executed, because the last row only describes the final data set. See examples.

data

Tibble with data set on which the filtering criteria should be applied.
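
The role of the quoted filter column can be illustrated with a minimal sketch. It assumes that a quoted expression is stored unevaluated in a list column and later unquoted inside dplyr::filter with !!. The data set toy and the objects keep_known_age and criteria_toy are hypothetical and not part of the package.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data, not part of the package
toy <- tibble(id = 1:5, age = c(34, NA, 51, 68, 45))

# A quoted expression is stored as a call; it is not evaluated yet
keep_known_age <- rlang::expr(!is.na(age))
class(keep_known_age)

# Inside a tibble, quoted expressions must live in a list column,
# hence the list() wrapper
criteria_toy <- tibble(
  left = "All participants",
  right = "Missing age",
  filter = list(keep_known_age))

# Unquote the stored expression with !! to evaluate it against the data
f <- criteria_toy$filter[[1]]
kept <- toy %>% filter(!!f)
nrow(kept)
```

Because the expression is only evaluated once it is unquoted, the same criteria tibble can be reused on any data set that contains the referenced variables.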

Value

A tibble. Each row is a filtering step. Variables:

  • left: Labels for included subset that is "left" after the filter.

  • right: Labels for excluded subset (which exclusion_flowchart plots to the right).

  • included: The data before applying the row's filter.

  • excluded: The data excluded by the row's filter.

  • n_left: Number of observations before applying the row's filter.

  • n_right: Number of observations excluded by the row's filter.

In the last row, included contains the data after applying all filters. Access this tibble using result %>% dplyr::pull(included) %>% dplyr::last(), where result is the returned tibble.
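
The sequential logic can be sketched conceptually as follows. This is an illustration under stated assumptions, not the package's implementation: it assumes that each step's included data are the rows remaining after all previous filters, that excluded holds the rows removed at that step (including rows where the filter evaluates to NA), and that the last filter is skipped. The names toy, criteria_toy, included_list, excluded_list, and result_toy are hypothetical.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data, not part of the package
toy <- tibble(
  id  = 1:6,
  age = c(34, NA, 51, 68, 45, 29),
  sex = c(1, 2, 2, 1, 2, 2))

criteria_toy <- tribble(
  ~left,              ~right,        ~filter,
  "All participants", "Missing age", rlang::expr(!is.na(age)),
  "Known age",        "Men",         rlang::expr(sex == 2),
  "Analytical set",   "",            rlang::expr(TRUE))

n_steps <- nrow(criteria_toy)
included_list <- vector("list", n_steps)
excluded_list <- vector("list", n_steps)
current <- toy

for (i in seq_len(n_steps)) {
  # Data entering step i
  included_list[[i]] <- current
  if (i < n_steps) {
    f <- criteria_toy$filter[[i]]
    # Rows removed at this step; NA results count as excluded
    excluded_list[[i]] <- filter(current, !coalesce(!!f, FALSE))
    # Rows carried forward to the next step
    current <- filter(current, !!f)
  } else {
    # The last row only labels the final data set; nothing is removed
    excluded_list[[i]] <- current[0, ]
  }
}

result_toy <- criteria_toy %>%
  mutate(
    included = included_list,
    excluded = excluded_list,
    n_left   = vapply(included, nrow, integer(1)),
    n_right  = vapply(excluded, nrow, integer(1)))

result_toy %>% select(left, right, n_left, n_right)
# n_left is 6, 5, 3 and n_right is 1, 2, 0
```

An explicit loop is used here for readability; a functional accumulation over the filter list would express the same idea.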

Example Output

make_exclusions.png

Examples

# Load packages; dplyr supplies %>% and expr()
library(khsmisc)
library(dplyr)

# Example data set
data(cancer, package = "survival")
cancer <- cancer %>% tibble::as_tibble()
cancer

# Define exclusion criteria
criteria <- tibble::tribble(
  ~left,                   ~right,                ~filter,
  "All patients",          "Missing ECOG status", expr(!is.na(ph.ecog)),
  "Known ECOG",            "Exclude men",         expr(sex == 2),
  "Analytical population", "",                    expr(TRUE))

# Alternative, equivalent approach to defining the criteria
# Note the use of list() around expr(...)
criteria <- dplyr::bind_rows(
  tibble::tibble(
    left = "All patients",
    right = "Missing ECOG status",
    filter = list(expr(!is.na(ph.ecog)))),
  tibble::tibble(
    left = "Known ECOG",
    right = "Exclude men",
    filter = list(expr(sex == 2))),
  tibble::tibble(
    left = "Analytical population",
    right = "",
    filter = list(expr(TRUE))))

# Perform sequential exclusions
result <- make_exclusions(
  criteria = criteria,
  data = cancer)

# Show results
result

# Access study population after all exclusions
result %>%
  dplyr::pull(included) %>%
  dplyr::last()

# Plot flow chart of exclusions (might not display in the online reference)
result %>%
  exclusion_flowchart()


stopsack/khsmisc documentation built on Sept. 22, 2023, 12:26 p.m.