filter_cv: Filter Features based on their coefficient of variation

View source: R/filters.R

filter_cv    R Documentation

Filter Features based on their coefficient of variation

Description

Filters features based on their coefficient of variation (CV). The CV of feature i is defined as CV_i = \frac{s_i}{\overline{x_i}}, where s_i is the standard deviation and \overline{x_i} is the mean of the values of feature i across the reference samples.
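
As a minimal base-R sketch of this definition (an illustration only, not the package's internal code), the CV of a single feature can be computed from its values in the reference samples. The vector qc_values and its numbers are hypothetical.

```r
# Hypothetical values of one feature in three QC (reference) samples
qc_values <- c(100, 110, 90)

# CV = standard deviation / mean
cv <- stats::sd(qc_values) / mean(qc_values)
cv  # 0.1

# With the default max_cv = 0.2, a CV of 0.1 lies below the threshold
cv < 0.2  # TRUE
```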

Usage

filter_cv(
  data,
  reference_samples,
  max_cv = 0.2,
  ref_as_group = FALSE,
  group_column = NULL,
  na_as_zero = TRUE
)

Arguments

data

A tidy tibble created by read_featuretable.

reference_samples

The names of the samples or group which will be used to calculate the CV of a feature. Usually Quality Control samples.

max_cv

The maximum allowed CV. 0.2 is a reasonable start.

ref_as_group

A logical indicating whether reference_samples contains the names of samples (FALSE) or the name(s) of group(s) (TRUE).

group_column

Only relevant if ref_as_group = TRUE. Which column should be used for grouping reference and non-reference samples? Usually group_column = Group. Uses data masking (see args_data_masking).

na_as_zero

Should NA be replaced with 0 prior to calculation? Under the hood, filter_cv calculates the CV as stats::sd(..., na.rm = TRUE) / mean(..., na.rm = TRUE). If the CV is calculated from 3 samples and 2 of them are NA for a specific feature, only one value remains after removing the NAs. The standard deviation of a single value is NA, so the CV for that feature is NA if na_as_zero = FALSE. This might lead to problems, so na_as_zero = TRUE is the safer choice. Regardless of this setting, zeros are replaced with NA after the calculation.
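
The effect of na_as_zero can be reproduced in base R with the sd/mean expression quoted above. This is a sketch under the assumption that the replacement simply swaps NA for 0 before the calculation. The vector x and its values are hypothetical.

```r
# Hypothetical feature: detected in only one of three reference samples
x <- c(1200, NA, NA)

# Behaviour corresponding to na_as_zero = FALSE:
# only one value is left after removing NAs, so sd() returns NA
cv_na <- stats::sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)
cv_na  # NA

# Behaviour corresponding to na_as_zero = TRUE:
# NAs become 0, so the CV is defined (and large)
x_zero <- replace(x, is.na(x), 0)
cv_zero <- stats::sd(x_zero, na.rm = TRUE) / mean(x_zero, na.rm = TRUE)
cv_zero  # about 1.73, well above max_cv = 0.2
```

With zeros in place of NA, a sparsely detected feature gets a large, well-defined CV that can be compared against max_cv, instead of an NA that cannot.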

Value

A filtered tibble.

References

Coefficient of Variation on Wikipedia

Examples

# Example 1: Define reference samples by sample names
toy_metaboscape %>%
  filter_cv(max_cv = 0.2, reference_samples = c("QC1", "QC2", "QC3"))

# Example 2: Define reference samples by group name
toy_metaboscape %>%
  join_metadata(toy_metaboscape_metadata) %>%
  filter_cv(max_cv = 0.2, reference_samples = "QC", ref_as_group = TRUE, group_column = Group)

metamorphr documentation built on June 10, 2026, 5:07 p.m.