prune_occurrences: Occurrence table pruning

prune_occurrences R Documentation

Occurrence table pruning

Description

[Stable]

Family of constructor and condition functions to flexibly prune occurrence tables. Each condition function returns whether the row result meets the threshold, i.e., is at least the value given in atleast. Since the condition functions are of class CombinationFunction, they can be logically combined with other condition functions.
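As an illustration of the logical combination mentioned above, the following sketch combines two condition functions with `&`, `|` and `!`. The table layout, the thresholds, and the variable names (frequent, imbalanced, and so on) are assumptions chosen for this example only.

```r
library(tern)

# Small occurrence-style table: one column per arm, one row per country.
tab <- basic_table() %>%
  split_cols_by("ARM") %>%
  analyze_vars("COUNTRY", .stats = "count_fraction") %>%
  build_table(DM)

cols <- names(tab)

# Two condition functions (thresholds are illustrative).
frequent <- has_count_in_any_col(atleast = 10L, col_names = cols)
imbalanced <- has_fractions_difference(atleast = 0.05, col_names = cols)

# Logical combinations are again condition functions.
frequent_and_imbalanced <- frequent & imbalanced
frequent_or_imbalanced <- frequent | imbalanced
not_frequent <- !frequent

# A combined condition can be passed to keep_rows() like any other.
pruned <- prune_table(tab, keep_rows(frequent_or_imbalanced))
```

Because each combination is itself a CombinationFunction, combinations can be nested further before being handed to keep_rows().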

Usage

keep_rows(row_condition)

keep_content_rows(content_row_condition)

has_count_in_cols(atleast, ...)

has_count_in_any_col(atleast, ...)

has_fraction_in_cols(atleast, ...)

has_fraction_in_any_col(atleast, ...)

has_fractions_difference(atleast, ...)

has_counts_difference(atleast, ...)

Arguments

row_condition

(CombinationFunction)
condition function which works on individual analysis rows and flags whether these should be kept in the pruned table.

content_row_condition

(CombinationFunction)
condition function which works on individual first content rows of leaf tables and flags whether these leaf tables should be kept in the pruned table.

atleast

(numeric(1))
threshold which should be met in order to keep the row.

...

arguments for row or column access, see rtables_access: either col_names (character) including the names of the columns which should be used, or alternatively col_indices (integer) giving the indices directly instead.
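The two ways of selecting columns can be sketched as follows. The table layout and the threshold are assumptions for illustration; selecting all columns by index via seq_len(ncol(tab)) is expected to match selecting them by name via names(tab).

```r
library(tern)

tab <- basic_table() %>%
  split_cols_by("ARM") %>%
  analyze_vars("COUNTRY", .stats = "count_fraction") %>%
  build_table(DM)

# Select columns by name ...
by_name <- has_count_in_cols(atleast = 1L, col_names = names(tab))

# ... or by position (here: all columns).
by_index <- has_count_in_cols(atleast = 1L, col_indices = seq_len(ncol(tab)))

res_name <- prune_table(tab, keep_rows(by_name))
res_index <- prune_table(tab, keep_rows(by_index))

# Both selections address the same columns, so the same rows remain.
same_rows <- identical(row.names(res_name), row.names(res_index))
```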

Value

  • keep_rows() returns a pruning function that can be used with rtables::prune_table() to prune an rtables table.

  • keep_content_rows() returns a pruning function that checks the condition on the first content row of leaf tables in the table.

  • has_count_in_cols() returns a condition function that sums the counts in the specified columns and compares the sum with the threshold.

  • has_count_in_any_col() returns a condition function that compares the counts in the specified columns with the threshold.

  • has_fraction_in_cols() returns a condition function that sums the counts in the specified columns, computes the fraction by dividing by the total column counts, and compares it with the threshold.

  • has_fraction_in_any_col() returns a condition function that looks at the fractions in the specified columns and checks whether any of them fulfill the threshold.

  • has_fractions_difference() returns a condition function that extracts the fractions of each specified column, and computes the difference of the minimum and maximum.

  • has_counts_difference() returns a condition function that extracts the counts of each specified column, and computes the difference of the minimum and maximum.
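The checks described in the list above can be mimicked in base R on a single hypothetical row. This is only a sketch of the documented logic, not the package code: the counts, column totals, and thresholds are made up, and "meets the threshold" is assumed to mean `>=`.

```r
# Hypothetical counts of one row and total counts of three columns.
counts <- c(A = 3L, B = 0L, C = 2L)
col_n <- c(A = 20L, B = 25L, C = 15L)
fractions <- counts / col_n

count_threshold <- 4L
fraction_threshold <- 0.1

# has_count_in_cols(): total count across the columns.
total_count_ok <- sum(counts) >= count_threshold                  # TRUE (5 >= 4)

# has_count_in_any_col(): any single column count.
any_count_ok <- any(counts >= count_threshold)                    # FALSE

# has_counts_difference(): maximum minus minimum count.
count_diff_ok <- diff(range(counts)) >= count_threshold           # FALSE (3 < 4)

# has_fraction_in_cols(): total count over total column count.
total_fraction_ok <- sum(counts) / sum(col_n) >= fraction_threshold  # FALSE (5/60)

# has_fraction_in_any_col(): any single column fraction.
any_fraction_ok <- any(fractions >= fraction_threshold)           # TRUE (0.15)

# has_fractions_difference(): maximum minus minimum fraction.
fraction_diff_ok <- diff(range(fractions)) >= fraction_threshold  # TRUE (0.15)
```

The same row can therefore pass the "total" check while failing the "any column" check, which is why the choice of condition function matters.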

Functions

  • keep_rows(): Constructor for creating pruning functions based on a row condition function. This removes all analysis rows (TableRow) that should be pruned, i.e., those that do not fulfill the row condition. It removes the sub-tree if there are no children left.

  • keep_content_rows(): Constructor for creating pruning functions based on a condition for the (first) content row in leaf tables. This removes all leaf tables where the first content row does not fulfill the condition. It does not check individual rows. It then proceeds recursively by removing the sub-tree if there are no children left.

  • has_count_in_cols(): Constructor for creating condition functions on total counts in the specified columns.

  • has_count_in_any_col(): Constructor for creating condition functions on any of the counts in the specified columns satisfying a threshold.

  • has_fraction_in_cols(): Constructor for creating condition functions on total fraction in the specified columns.

  • has_fraction_in_any_col(): Constructor for creating condition functions on any fraction in the specified columns.

  • has_fractions_difference(): Constructor for creating condition functions that check the difference between the fractions reported in each specified column.

  • has_counts_difference(): Constructor for creating condition functions that check the difference between the counts reported in each specified column.

Note

Since most table specifications are worded positively, we name our constructor and condition functions positively, too. However, note that the result of keep_rows() says what should be pruned, to conform with the rtables::prune_table() interface.
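The inversion described in this note can be seen by calling the function returned by keep_rows() directly on a single row. The sketch below uses two hypothetical condition functions (always_keep and never_keep) and a standalone row built with rtables::rrow(); these are assumptions for illustration and not part of the package examples.

```r
library(tern)

# Hypothetical conditions: one is fulfilled by every row, one by none.
always_keep <- CombinationFunction(function(table_row) TRUE)
never_keep <- !always_keep

# A standalone analysis row with two cells.
row <- rrow("example row", 1, 2)

# keep_rows() returns a pruning function: TRUE means "prune this row".
prune_when_kept <- keep_rows(always_keep)
prune_when_dropped <- keep_rows(never_keep)

prune_when_kept(row)    # FALSE: the condition holds, so the row is not pruned
prune_when_dropped(row) # TRUE: the condition fails, so the row is pruned
```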

Examples


library(tern)

# Layout with nested row splits, content rows, and count/fraction analysis rows.
tab <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("RACE") %>%
  split_rows_by("STRATA1") %>%
  summarize_row_groups() %>%
  analyze_vars("COUNTRY", .stats = "count_fraction") %>%
  build_table(DM)



# `keep_rows`
is_non_empty <- !CombinationFunction(all_zero_or_na)
prune_table(tab, keep_rows(is_non_empty))


# `keep_content_rows`

more_than_twenty <- has_count_in_cols(atleast = 20L, col_names = names(tab))
prune_table(tab, keep_content_rows(more_than_twenty))



# `has_count_in_cols`
more_than_one <- has_count_in_cols(atleast = 1L, col_names = names(tab))
prune_table(tab, keep_rows(more_than_one))



# `has_count_in_any_col`
any_more_than_one <- has_count_in_any_col(atleast = 1L, col_names = names(tab))
prune_table(tab, keep_rows(any_more_than_one))



# `has_fraction_in_cols`
more_than_five_percent <- has_fraction_in_cols(atleast = 0.05, col_names = names(tab))
prune_table(tab, keep_rows(more_than_five_percent))



# `has_fraction_in_any_col`
any_atleast_five_percent <- has_fraction_in_any_col(atleast = 0.05, col_names = names(tab))
prune_table(tab, keep_rows(any_atleast_five_percent))



# `has_fractions_difference`
more_than_five_percent_diff <- has_fractions_difference(atleast = 0.05, col_names = names(tab))
prune_table(tab, keep_rows(more_than_five_percent_diff))



# `has_counts_difference`
more_than_one_diff <- has_counts_difference(atleast = 1L, col_names = names(tab))
prune_table(tab, keep_rows(more_than_one_diff))



tern documentation built on Sept. 24, 2024, 9:06 a.m.