ume_filter_formulas: Complete Formula subsetting / filtering (wrapper)

View source: R/ume_utilities.R

ume_filter_formulas    R Documentation

Complete Formula subsetting / filtering (wrapper)

Description

A wrapper function that filters molecular formulas according to a set of evaluation parameters.

Usage

ume_filter_formulas(mfd, verbose = FALSE, ...)

Arguments

mfd

data.table with molecular formula data as derived from ume::assign_formulas. Column names of elements/isotopes must match names in the isotope column of ume::masses; values are integers representing counts per formula.

verbose

logical; if TRUE, show progress messages.

...

Arguments passed on to filter_mf_data, subset_known_mf, calc_norm_int, filter_int, remove_blanks

c_iso_check

(TRUE / FALSE); check whether formulas are verified by the presence of the main daughter isotope of carbon (13C)

n_iso_check

(TRUE / FALSE); check whether formulas are verified by the presence of the main daughter isotope of nitrogen (15N)

s_iso_check

(TRUE / FALSE); check whether formulas are verified by the presence of the main daughter isotope of sulfur (34S)

ma_dev

Deviation range of mass accuracy in +/- ppm (default: 3 ppm)

dbe_max

Maximum number of double bond equivalents (DBE)

dbe_o_min

Minimum number for DBE minus O atoms

dbe_o_max

Maximum number for DBE minus O atoms

mz_min

Minimum mass-to-charge (m/z) value

mz_max

Maximum mass-to-charge (m/z) value

n_min

Minimum number of nitrogen atoms

n_max

Maximum number of nitrogen atoms

s_min

Minimum number of sulfur atoms

s_max

Maximum number of sulfur atoms

p_min

Minimum number of phosphorus atoms

p_max

Maximum number of phosphorus atoms

oc_min

Minimum atomic ratio of oxygen / carbon

oc_max

Maximum atomic ratio of oxygen / carbon

hc_min

Minimum atomic ratio of hydrogen / carbon

hc_max

Maximum atomic ratio of hydrogen / carbon

nc_min

Minimum atomic ratio of nitrogen / carbon

nc_max

Maximum atomic ratio of nitrogen / carbon

select_category

List of category names that should be selected

exclude_category

List of category names that should be ignored

ms_id

Character; name of the column identifying individual spectra (default: "file_id").

peak_id

Character; name of the column identifying unique peaks (default: "peak_id").

peak_magnitude

Character; name of the column containing peak intensity values (default: "i_magnitude").

normalization

Character; normalization method to apply. One of "bp", "sum", "sum_ubiq", "sum_rank", "none". Default is "bp".

n_rank

Integer; number of top-ranked peaks to use for "sum_rank" normalization (default: 200).

norm_int_min

Lower threshold (>=) of (normalized) peak magnitude

norm_int_max

Upper threshold (<=) of (normalized) peak magnitude

blank_file_ids

Integer vector of file_id values that represent blank analyses.

blank_prevalence

Numeric between 0 and 1. Threshold for blank filtering: the proportion of blanks in which a molecular formula must occur before it is excluded from the sample data. For example, blank_prevalence = 0 (default) removes any formula detected in at least one blank, while blank_prevalence = 0.5 removes formulas detected in 50% or more of the blanks.

ret_time_col

Character scalar. Name of the retention-time column that contains the beginning of the retention time segment that corresponds to the mass spectrum. If NULL (default), the function will auto-detect the first column in c("ret_time_min","retention_time","rt","RT") that exists in mfd. If none is found, blanks are removed ignoring retention time.
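The blank_prevalence rule described above can be pictured with a small base R sketch. This is not the code behind remove_blanks(); the function name remove_blanks_sketch, the toy table, and the formula column name mf are hypothetical, retention time is ignored, and returning only the non-blank rows is an assumption.

```r
# Toy data: files 1 and 2 are blanks, file 3 is a sample.
# Column name `mf` (formula label) is a hypothetical stand-in.
mfd_toy <- data.frame(
  file_id = c(1, 1, 2, 2, 3, 3, 3),
  mf      = c("C6H12O6", "C7H8O2", "C6H12O6", "C8H10O3",
              "C6H12O6", "C7H8O2", "C9H12O4"),
  stringsAsFactors = FALSE
)

# Hypothetical helper illustrating the prevalence threshold only.
remove_blanks_sketch <- function(mfd, blank_file_ids, blank_prevalence = 0) {
  is_blank <- mfd$file_id %in% blank_file_ids
  blanks   <- mfd[is_blank, , drop = FALSE]
  samples  <- mfd[!is_blank, , drop = FALSE]

  n_blanks <- length(unique(blanks$file_id))
  if (n_blanks == 0) return(samples)

  # Number of distinct blank files in which each formula occurs
  n_hits <- tapply(blanks$file_id, blanks$mf, function(x) length(unique(x)))
  prevalence <- n_hits / n_blanks

  # Only formulas seen in at least one blank appear in `prevalence`, so a
  # threshold of 0 flags every formula detected in any blank.
  flagged <- names(prevalence)[prevalence >= blank_prevalence]

  samples[!samples$mf %in% flagged, , drop = FALSE]
}

res_any <- remove_blanks_sketch(mfd_toy, blank_file_ids = c(1, 2),
                                blank_prevalence = 0)
res_all <- remove_blanks_sketch(mfd_toy, blank_file_ids = c(1, 2),
                                blank_prevalence = 1)
```

With blank_prevalence = 0 every formula seen in either blank is dropped from the sample; with blank_prevalence = 1 only formulas present in both blanks are dropped.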

Value

A data.table having molecular formula assignments for each mass.

Examples

ume_filter_formulas(mfd = mf_data_demo, dbe_o_max = 15, norm_int_min = 2)
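To make the ratio and intensity arguments more concrete, the following base R sketch applies range checks of the kind described for oc_min / oc_max, hc_min / hc_max and norm_int_min. It does not call the package and is not its implementation: the toy table, the column names C, H, N, O and norm_int, and the helper in_range are hypothetical, and treating the ratio bounds as inclusive is an assumption (the documentation states inclusiveness only for the intensity thresholds).

```r
# Toy table with element counts per formula and a normalized intensity.
# Column names are hypothetical stand-ins for illustration.
mfd_toy <- data.frame(
  mf       = c("C6H12O6", "C10H8", "C5H9NO4", "C18H36O2"),
  C        = c(6, 10, 5, 18),
  H        = c(12, 8, 9, 36),
  N        = c(0, 0, 1, 0),
  O        = c(6, 0, 4, 2),
  norm_int = c(40, 1.5, 12, 100),
  stringsAsFactors = FALSE
)

# Inclusive range check; unbounded on a side that is not supplied
in_range <- function(x, lo = -Inf, hi = Inf) x >= lo & x <= hi

# Atomic ratios as described for the oc_* and hc_* arguments
oc <- mfd_toy$O / mfd_toy$C
hc <- mfd_toy$H / mfd_toy$C

keep <- in_range(oc, lo = 0.1, hi = 1) &        # oc_min = 0.1, oc_max = 1
  in_range(hc, lo = 0.5, hi = 2) &              # hc_min = 0.5, hc_max = 2
  in_range(mfd_toy$norm_int, lo = 2)            # norm_int_min = 2

filtered_toy <- mfd_toy[keep, , drop = FALSE]
```

In this toy table only C10H8 is removed: its O/C ratio of 0 falls below the lower bound and its normalized intensity of 1.5 falls below the threshold of 2.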

See Also

Other Formula subsetting: filter_int(), filter_mass_accuracy(), filter_mf_data(), remove_blanks(), subset_known_mf(), ume_assign_formulas()

Other ume wrapper: ume_assign_formulas()


ume documentation built on Dec. 13, 2025, 1:06 a.m.