quality_flags: Add Quality Flags

add_quality_flags    R Documentation

Add Quality Flags

Description

[Stable]

The function add_quality_flags() adds quality flag information to an AnyHermesData object:

  • low_expression_flag: for each gene, counts how many samples fall below a minimum Counts per Million (CPM) expression threshold. If too many samples fall below it, the gene is flagged as a "low expression" gene.

  • tech_failure_flag: first calculates the Pearson correlation matrix of the sample-wise CPM values, resulting in a matrix measuring the correlation between samples. It then compares the average correlation per sample with a threshold: if it is too low, the sample is flagged as a "technical failure" sample.

  • low_depth_flag: computes the library size (total number of counts) per sample. If this number is too low, the sample is flagged as a "low depth" sample.

Separate helper functions are used internally to create the flags, and separate getter functions allow easy access to the quality control flags in an object.
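
The three flags described above can be illustrated with a minimal base R sketch on a toy count matrix. This is not the package's implementation: the toy data and threshold values are made up, the threshold names only mirror the control_quality() arguments used in the Examples, and the exact comparison rules (for instance, reading min_cpm_prop as the minimum proportion of samples that must reach min_cpm, and including the diagonal in the average correlation) are assumptions made for illustration.

```r
# Toy count matrix: 4 genes (rows) x 5 samples (columns). Values are made up.
counts <- matrix(
  c(
    500, 520, 480, 510,  5,
      0,   1,   0,   2,  0,
    300, 310, 290, 305, 40,
    200, 169, 230, 183,  5
  ),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:5))
)

# Hypothetical thresholds, named after the control_quality() arguments
# that appear in the Examples section. The values are illustrative only.
min_cpm <- 10000
min_cpm_prop <- 0.5
min_corr <- 0.5
min_depth <- 100

# Library size: total number of counts per sample.
lib_size <- colSums(counts)

# Counts per Million: scale each sample (column) by its library size.
cpm <- sweep(counts, 2, lib_size, "/") * 1e6

# Low expression flag (per gene).
# Assumption: a gene is flagged when the proportion of samples reaching
# min_cpm is below min_cpm_prop.
prop_passing <- rowMeans(cpm >= min_cpm)
low_expression <- prop_passing < min_cpm_prop

# Technical failure flag (per sample).
# Pearson correlation between samples, then the average correlation per
# sample (assumption: the diagonal is included in the average).
corr_matrix <- cor(cpm, method = "pearson")
avg_corr <- colMeans(corr_matrix)
tech_failure <- avg_corr < min_corr

# Low depth flag (per sample): library size below the minimum depth.
low_depth <- lib_size < min_depth

low_expression # only gene2 is flagged
tech_failure   # only sample5 is flagged
low_depth      # only sample5 is flagged
```

In this toy data, gene2 has almost no counts in any sample, and sample5 has both a very small library and an expression profile that differs from the other samples, so it is picked up by both sample-level flags. For real data, use add_quality_flags() or the helper functions with settings from control_quality().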

Usage

add_quality_flags(object, control = control_quality(), overwrite = FALSE)

h_low_expression_flag(object, control = control_quality())

h_low_depth_flag(object, control = control_quality())

h_tech_failure_flag(object, control = control_quality())

get_tech_failure(object)

get_low_depth(object)

get_low_expression(object)

Arguments

object

(AnyHermesData)
input.

control

(list)
list of settings (thresholds etc.) used to compute the quality control flags, produced by control_quality().

overwrite

(flag)
whether previously added flags may be overwritten.

Details

While object already has the variables mentioned above as part of the rowData and colData (as this is enforced by the validation method for AnyHermesData), they are usually still NA after the initial object creation.

Value

The input object with added quality flags.

Functions

  • h_low_expression_flag(): creates the low expression flag for genes given control settings.

  • h_low_depth_flag(): creates the low depth (library size) flag for samples given control settings.

  • h_tech_failure_flag(): creates the technical failure flag for samples given control settings.

  • get_tech_failure(): gets the technical failure flags for all samples.

  • get_low_depth(): gets the low depth flags for all samples.

  • get_low_expression(): gets the low expression flags for all genes.

See Also

  • control_quality() for the detailed settings specifications;

  • set_tech_failure() to manually flag samples as technical failures.

Examples

library(hermes)

# Adding default quality flags to an `AnyHermesData` object.
object <- hermes_data
result <- add_quality_flags(object)
which(get_tech_failure(result) != get_tech_failure(object))
head(get_low_expression(result))
head(get_tech_failure(result))
head(get_low_depth(result))

# It is possible to overwrite flags if needed, which will trigger a message.
result2 <- add_quality_flags(result, control_quality(min_cpm = 1000), overwrite = TRUE)

# Separate calculation of low expression flag.
low_expr_flag <- h_low_expression_flag(
  object,
  control_quality(min_cpm = 500, min_cpm_prop = 0.9)
)
length(low_expr_flag) == nrow(object)
head(low_expr_flag)

# Separate calculation of low depth flag.
low_depth_flag <- h_low_depth_flag(object, control_quality(min_depth = 5))
length(low_depth_flag) == ncol(object)
head(low_depth_flag)

# Separate calculation of technical failure flag.
tech_failure_flag <- h_tech_failure_flag(object, control_quality(min_corr = 0.35))
length(tech_failure_flag) == ncol(object)
head(tech_failure_flag)

# Getter functions for the quality flags stored in the object.
head(get_tech_failure(object))
head(get_low_depth(object))
head(get_low_expression(object))

insightsengineering/hermes documentation built on Sept. 19, 2024, 9:06 p.m.