MSstatsPreprocess: Preprocess outputs from MS signal processing tools for...

View source: R/MSstatsConvert_core_functions.R

MSstatsPreprocess    R Documentation

Preprocess outputs from MS signal processing tools for analysis with MSstats

Description

Preprocess outputs from MS signal processing tools for analysis with MSstats

Usage

MSstatsPreprocess(
  input,
  annotation,
  feature_columns,
  remove_shared_peptides = TRUE,
  remove_single_feature_proteins = TRUE,
  feature_cleaning = list(remove_features_with_few_measurements = TRUE,
    summarize_multiple_psms = max),
  score_filtering = list(),
  exact_filtering = list(),
  pattern_filtering = list(),
  columns_to_fill = list(),
  aggregate_isotopic = FALSE,
  ...
)

Arguments

input

data.table processed by the MSstatsClean function.

annotation

annotation file generated by a signal processing tool.

feature_columns

character vector of names of columns that define spectral features.

remove_shared_peptides

logical, if TRUE, shared peptides will be removed.

remove_single_feature_proteins

logical, if TRUE, proteins that only have one feature will be removed.

feature_cleaning

named list with at most two (for MSstats converters) or three (for MSstatsTMT converters) elements. If handle_few_measurements is set to "remove", features with fewer than three measurements will be removed; otherwise it should be set to "keep". summarize_multiple_psms is a function used to aggregate multiple measurements of a feature within a run; it should return a scalar and accept an na.rm parameter. For MSstatsTMT converters, setting remove_psms_with_any_missing removes features that have missing values in a run from that run.

score_filtering

a list of named lists that specify filtering options. Details are provided in the vignette.

exact_filtering

a list of named lists that specify filtering options. Details are provided in the vignette.

pattern_filtering

a list of named lists that specify filtering options. Details are provided in the vignette.

columns_to_fill

a named list of scalars. If provided, columns with names defined by the names of this list and values corresponding to its elements will be added to the output data.frame.

aggregate_isotopic

logical. If TRUE, isotopic peaks will be summed.

...

additional parameters to data.table::fread.
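
The requirement stated for summarize_multiple_psms (return a scalar, accept an na.rm parameter) can be illustrated with a small custom function. The following is a minimal sketch in base R only; summarize_median is a hypothetical name that is not part of MSstatsConvert, and the median is just one possible choice alongside the default max.

```r
# Hypothetical helper (not part of MSstatsConvert): summarizes repeated
# measurements of one feature in one run by their median.
summarize_median = function(x, na.rm = FALSE) {
  # Return NA explicitly when every value is missing
  if (all(is.na(x))) {
    return(NA_real_)
  }
  # stats::median returns a single number and honours na.rm
  stats::median(x, na.rm = na.rm)
}

# One value comes back regardless of how many measurements go in
summarize_median(c(10, 30, NA), na.rm = TRUE)
```

Such a function would be supplied in place of max in the summarize_multiple_psms element of the feature_cleaning list.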

Value

data.table

Examples

evidence_path = system.file("tinytest/raw_data/MaxQuant/mq_ev.csv", 
                            package = "MSstatsConvert")
pg_path = system.file("tinytest/raw_data/MaxQuant/mq_pg.csv", 
                      package = "MSstatsConvert")
evidence = read.csv(evidence_path)
pg = read.csv(pg_path)
imported = MSstatsImport(list(evidence = evidence, protein_groups = pg),
                         "MSstats", "MaxQuant")
cleaned_data = MSstatsClean(imported, protein_id_col = "Proteins")
annot_path = system.file("tinytest/raw_data/MaxQuant/annotation.csv", 
                         package = "MSstatsConvert")
mq_annot = MSstatsMakeAnnotation(cleaned_data, read.csv(annot_path),
                                 Run = "Rawfile")
                               
# Filter out peptides containing M and peptides with oxidation
m_filter = list(col_name = "PeptideSequence", pattern = "M", 
                filter = TRUE, drop_column = FALSE)
oxidation_filter = list(col_name = "Modifications", pattern = "Oxidation", 
                        filter = TRUE, drop_column = TRUE)
msstats_format = MSstatsPreprocess(
  cleaned_data, mq_annot,
  feature_columns = c("PeptideSequence", "PrecursorCharge"),
  columns_to_fill = list(FragmentIon = NA, ProductCharge = NA),
  pattern_filtering = list(oxidation = oxidation_filter, m = m_filter)
)
# Output in the standard MSstats format
head(msstats_format)
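
To make the effect of the pattern filters and columns_to_fill in the example above easier to picture, here is a rough base R sketch on a made-up table. It assumes that filter = TRUE removes rows whose column matches the pattern and that drop_column = TRUE removes the column afterwards; apply_pattern_filter is a hypothetical helper, not the package's internal code, and the vignette remains the authoritative description.

```r
# Toy table: column names mirror the example above, the values are made up
toy = data.frame(
  PeptideSequence = c("AAK", "MEK", "LLR"),
  Modifications = c("Unmodified", "Unmodified", "Oxidation (M)"),
  Intensity = c(100, 200, 300),
  stringsAsFactors = FALSE
)

# Hypothetical helper illustrating the assumed semantics of one filter
apply_pattern_filter = function(data, col_name, pattern, filter, drop_column) {
  if (filter) {
    # Keep only rows whose value does NOT match the pattern
    data = data[!grepl(pattern, data[[col_name]]), , drop = FALSE]
  }
  if (drop_column) {
    # Remove the column that was used for filtering
    data[[col_name]] = NULL
  }
  data
}

filtered = apply_pattern_filter(toy, "Modifications", "Oxidation", TRUE, TRUE)
filtered = apply_pattern_filter(filtered, "PeptideSequence", "M", TRUE, FALSE)

# columns_to_fill: add one constant column per named list element
columns_to_fill = list(FragmentIon = NA, ProductCharge = NA)
for (column in names(columns_to_fill)) {
  filtered[[column]] = columns_to_fill[[column]]
}
filtered
```

In this sketch only the "AAK" row remains, the Modifications column is gone, and FragmentIon and ProductCharge are present and filled with NA.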


Vitek-Lab/MSstatsConvert documentation built on Dec. 17, 2024, 1:14 a.m.