Filter helpers
In visOmopResults: Graphs and Tables for OMOP Results

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
options(rmarkdown.html_vignette.check_title = FALSE)

Introduction

Dealing with an <summarised_result> object can be difficult to handle specially when we are trying to filter. For example, difficult tasks would be to filter to a certain result_type or when there are many strata joined together filter only one of the variables. On the other hand it exists the tidy format that makes it easy to filter, but then you loose the <summarised_result> object. There exist some functions that only work with <summarised_result> objects.

visOmopResults package contains some functionalities that helps on this process:

filterSettings to filter the <summarised_result> object using the settings() attribute.
filterGroup to filter the <summarised_result> object using the group_name-group_level tidy columns.
filterStrata to filter the <summarised_result> object using the strata_name-starta_level tidy columns.
filterAdditional to filter the <summarised_result> object using the additional_name-additional_level tidy columns.

In this vignette we will also cover two types of utility functions:

unite* type functions: to join multiple columns in the name-level structure.
*Columns type functions: to identify the columns that are contained in a name-level structure.

Now we will see some examples.

filterSettings

For this example we will use some mock data.

library(visOmopResults)
library(dplyr, warn.conflicts = FALSE)

Let's generate two sets of results:

result1 <- mockSummarisedResult()
result2 <- mockSummarisedResult()

We can change the settings of the second set of results simulating that results come from a different package ans set of results:

result2 <- result2 |>
  omopgenerics::newSummarisedResult(settings = tibble(
    result_id = 1L,
    result_type = "second_mock_result",
    package_name = "omopgenerics",
    package_version = "1.0.0",
    my_parameter = TRUE
  ))

We can now merge both results in a unique <summarised_result> object:

result <- bind(result1, result2)
settings(result)
result

You could potentially add the settings with addSettings(), then filter and finally eliminate the columns. Let's see an example where we subset to the results that have my_parameter == TRUE:

resultMyParam <- result |>
  addSettings(settingsColumns = "my_parameter") |>
  filter(my_parameter == TRUE) |>
  select(!"my_parameter")
resultMyParam

This approach has some problems:

It is not efficient.
We have to use three different functions.
The settings attribute still contains both sets:

settings(resultMyParam)

We can do the same solving the three problems using filterSettings():

resultMyParam <- result |>
  filterSettings(my_parameter == TRUE)
resultMyParam
settings(resultMyParam)

filterStrata

Using the same mock data we can try to filter the rows that contain data related to 'Female', the problematic with the strata_name-strata_level display is that it is difficult to easily filter the "Female" columns:

result |>
  select(strata_name, strata_level) |>
  distinct()

One option that we could use is splitStrata(), filter() and then uniteStrata() again:

result |>
  splitStrata() |>
  filter(sex == "Female") |>
  uniteStrata(c("age_group", "sex"))

Problem of this is that:

It is extremely inefficient (all the rows must be splitted and united back).
You need to know which are the strata columns (you could potentially use strataColumns()).
You have to use multiple functions.

We could do exactly the same with the function filterStrata():

result |>
  filterStrata(sex == "Female")

filterGroup() and filterAdditional() work exactly in the same way than filterStrata() but with their analogous columns.

A nice functionality that you may have is that is that if you filter by a column/setting that does not exist the output will be warning + return emptySummarisedResult() which can be quite hekpful in some occasions.

result |>
  filterSettings(setting_that_does_not_exist == 1)

unite functions

In this previous section we have mentioned the function uniteStrata() without explaining its functionality so let's cover it with a couple of examples. The uniteGroup(), uniteStrata() and uniteAdditional() functions are the opposite to the split*() type functions:

result |>
  splitStrata()

result |>
  splitStrata() |>
  uniteStrata(c("age_group", "sex"))

Note that missing will be not included (if all missing: overall-overall will be considered), for example:

dplyr::tibble(
  age_group = c(NA, NA, "<40", ">40"),
  sex = c(NA, "Male", "Female", NA),
  year = c(NA, NA, 2010L, 2011L)
) |> 
  gt::gt()

With uniteStrata(cols = c("age_group", "sex", "year")) the output would be:

dplyr::tibble(
  strata_name = c("overall", "sex", "age_group &&& sex &&& year", "age_group &&& year"),
  strata_level = c("overall", "Male", "<40 &&& Female &&& 2010", ">40 &&& 2011")
) |> 
  gt::gt()

Note that is we split again then year will be a character vector instead of an integer. In future releases conserving type may be possible.

uniteGroup() and uniteAdditional() work exactly in the same way than uniteStrata() but with their analogous columns.

Columns

Splitting and tidying your <summarised_result> can give you many advantages, as we have seen across the different vignettes. One of the problems that you may face is the ability to know which are the available settings to add, or how many columns will be generated when splitting group. That's why visOmopResults created some helper functions:

settingsColumns() gives you the setting names that are available in a <summarised_result> object.
groupColumns() gives you the new columns that will be generated when splitting group_name-group_level pair into different columns.
strataColumns() gives you the new columns that will be generated when splitting strata_name-strata_level pair into different columns.
additionalColumns() gives you the new columns that will be generated when splitting additional_name-additional_level pair into different columns.
tidyColumns() gives you the columns that will have the object if you tidy it (tidy(result)). This function in very useful to know which are the columns that can be included in plot and table functions.

Let's see the different values with out example mock data set:

settingsColumns(result)
groupColumns(result)
strataColumns(result)
additionalColumns(result)
tidyColumns(result)

Any scripts or data that you put into this service are public.

visOmopResults documentation built on Sept. 24, 2024, 1:08 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

visOmopResults
Graphs and Tables for OMOP Results

Filter helpers
In visOmopResults: Graphs and Tables for OMOP Results

Introduction

filterSettings

filterStrata

unite functions

Columns

Try the visOmopResults package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

visOmopResults Graphs and Tables for OMOP Results

Filter helpers In visOmopResults: Graphs and Tables for OMOP Results

Introduction

filterSettings

filterStrata

unite functions

Columns

Try the visOmopResults package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

visOmopResults
Graphs and Tables for OMOP Results

Filter helpers
In visOmopResults: Graphs and Tables for OMOP Results