filter_dataset: Filter dataset

filter_dataset R Documentation

Filter dataset

Description

For each enabled filter (by group, all groups, by contrast), an extra column named intensity_<filter> is appended to the dataset$peptides table. This column contains the intensity data for all peptides that pass the respective filter. Optionally, you can apply normalization (recommended).

Usage

filter_dataset(
  dataset,
  filter_min_detect = 1,
  filter_fraction_detect = 0,
  filter_min_quant = 0,
  filter_fraction_quant = 0,
  filter_min_peptide_per_prot = 1,
  filter_topn_peptides = 0,
  norm_algorithm = "",
  rollup_algorithm = "maxlfq",
  by_group = FALSE,
  all_group = FALSE,
  by_contrast = FALSE
)

Arguments

dataset

a valid dataset object generated upstream by, for instance, import_dataset_skyline

filter_min_detect

in order for a peptide to 'pass' in a sample group, in how many replicates must it be detected?

filter_fraction_detect

in order for a peptide to 'pass' in a sample group, in what fraction of replicates must it be detected?

filter_min_quant

in order for a peptide to 'pass' in a sample group, in how many replicates must it be quantified?

filter_fraction_quant

in order for a peptide to 'pass' in a sample group, in what fraction of replicates must it be quantified?

filter_min_peptide_per_prot

in order for a peptide to 'pass' in a sample group, how many peptides of the same protein must be available after the detect filters?

filter_topn_peptides

maximum number of peptides to retain for each protein. From the subset that passes the above filters, peptides are ranked by the number of samples in which they are detected and by their variation between replicates. The default in the function signature is 0; 2 can be a good choice situationally. If set to 1, make sure to inspect individual peptide data/plots for proteins with 1 peptide.

norm_algorithm

normalization algorithm(s). Options: "", "vsn", "loess", "rlr", "msempire", "vwmb", "modebetween". Provide an array of options to run each algorithm consecutively

rollup_algorithm

the algorithm for combining peptides to proteins as used in normalization algorithms that require a priori rollup from peptides to a protein-level abundance matrix (e.g. modebetween_protein). Refer to rollup_pep2prot function documentation for available options and a brief description of each

by_group

within each sample group, apply the filter. All peptides that fail the filter in group g will have intensity value NA in the intensity_by_group column for the samples in the respective group

all_group

in every sample group, apply the filter. All peptides that fail the filter in any group will have intensity value NA in the intensity_all_groups column for all samples

by_contrast

should the above filters be applied to all sample groups, or only those tested within each contrast? Enabling this optimizes available data in each contrast, but increases the complexity somewhat as different subsets of peptides are used in each contrast and normalization is applied separately
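
The difference between by_group and all_group can be illustrated with a small base R sketch. It is not the package's implementation: the wide toy matrices, the helper passes_in_group, and the assumption that the count and fraction thresholds must both be met are all made up for illustration.

```r
# Toy detection matrix: rows are peptides, columns are samples (TRUE = detected).
detected <- matrix(
  c(TRUE,  TRUE,  TRUE,   TRUE,  FALSE, FALSE,   # pep1
    TRUE,  TRUE,  FALSE,  TRUE,  TRUE,  TRUE,    # pep2
    FALSE, FALSE, TRUE,   FALSE, FALSE, TRUE),   # pep3
  nrow = 3, byrow = TRUE,
  dimnames = list(c("pep1", "pep2", "pep3"), paste0("s", 1:6))
)
intensity <- matrix(as.numeric(1:18), nrow = 3, dimnames = dimnames(detected))
sample_group <- c("A", "A", "A", "B", "B", "B")  # group of each sample column

# Hypothetical helper: which peptides pass the detect criteria in one group?
# Assumption: the count threshold and the fraction threshold must both be met.
passes_in_group <- function(detected, min_detect, fraction_detect) {
  n_required <- max(min_detect, ceiling(fraction_detect * ncol(detected)))
  rowSums(detected) >= n_required
}

# Peptide x group matrix of pass/fail flags.
pass <- sapply(unique(sample_group), function(g) {
  passes_in_group(detected[, sample_group == g, drop = FALSE],
                  min_detect = 2, fraction_detect = 0.5)
})

# by_group: a failing peptide is set to NA only in the samples of that group.
keep_by_group <- pass[, sample_group]
intensity_by_group <- intensity
intensity_by_group[!keep_by_group] <- NA

# all_group: a peptide failing in any group is set to NA in all samples.
keep_all_groups <- apply(pass, 1, all)
intensity_all_groups <- intensity
intensity_all_groups[!keep_all_groups, ] <- NA
```

In this toy data, pep1 passes only in group A, so it keeps its values for samples s1 to s3 in intensity_by_group but is NA everywhere in intensity_all_groups.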

Details

Note: this step is built into analysis_quickstart, so if you use that all-in-one function you do not have to perform all pipeline steps manually.
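
When running the pipeline steps manually, a call might look like the sketch below. The wrapper name filter_and_normalize and the parameter values are illustrative choices, not recommendations from the package authors. The sketch also assumes that filter_dataset returns the updated dataset object, which this page does not state.

```r
# Hypothetical wrapper; `dataset` is assumed to be a valid dataset object
# generated upstream (for instance by import_dataset_skyline).
filter_and_normalize <- function(dataset) {
  msdap::filter_dataset(
    dataset,
    # detected in at least 3 replicates and in at least 75% of replicates
    filter_min_detect = 3,
    filter_fraction_detect = 0.75,
    # same thresholds for quantification
    filter_min_quant = 3,
    filter_fraction_quant = 0.75,
    filter_min_peptide_per_prot = 2,
    # two algorithms, run consecutively; the second one relies on rollup_algorithm
    norm_algorithm = c("vsn", "modebetween_protein"),
    rollup_algorithm = "maxlfq",
    by_group = TRUE,
    all_group = TRUE,
    by_contrast = FALSE
  )
}

# Usage (requires the msdap package and an imported dataset):
# dataset <- filter_and_normalize(dataset)
# names(dataset$peptides)  # look for the appended intensity_<filter> columns
```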


ftwkoopmans/msdap documentation built on March 5, 2025, 12:15 a.m.