DIANNtoMSstatsPTMFormat: Convert the output of DIA-NN PSM file into MSstatsPTM format

View source: R/converters.R

DIANNtoMSstatsPTMFormat    R Documentation

Convert the output of DIA-NN PSM file into MSstatsPTM format

Description

Takes as input the report.tsv file from DIA-NN and converts it into MSstatsPTM format. Requires a PSM file and an annotation file. Optionally, an additional report.tsv file for a corresponding global profiling run can be included.

Usage

DIANNtoMSstatsPTMFormat(
  input,
  annotation,
  input_protein = NULL,
  annotation_protein = NULL,
  fasta_path = NULL,
  use_unmod_peptides = FALSE,
  protein_id_col = "Protein.Group",
  fasta_protein_name = "uniprot_ac",
  global_qvalue_cutoff = 0.01,
  qvalue_cutoff = 0.01,
  pg_qvalue_cutoff = 0.01,
  useUniquePeptide = TRUE,
  removeFewMeasurements = TRUE,
  removeOxidationMpeptides = TRUE,
  removeProtein_with1Feature = FALSE,
  MBR = TRUE,
  use_log_file = TRUE,
  append = FALSE,
  verbose = TRUE,
  log_file_path = NULL
)

Arguments

input

data.frame of the report.tsv file produced by DIA-NN

annotation

annotation with Run, Fraction, TechRepMixture, Mixture, Channel, BioReplicate, Condition columns or a path to file. Refer to the example 'annotation' for the meaning of each column.

input_protein

same as input for global profiling run. Default is NULL.

annotation_protein

same as annotation for global profiling run. Default is NULL.

fasta_path

A string of path to a FASTA file, used to match PTM peptides.

use_unmod_peptides

Boolean indicating whether the unmodified peptides in the input file should be used to construct the unmodified protein output. Only used if input_protein is not provided. Default is FALSE.

protein_id_col

Column to use for the protein name. Default is 'Protein.Group'.

fasta_protein_name

Name of column that matches with the protein names in protein_id_col. The protein names in these two columns must match in order to join the FASTA file with the DIA-NN output.

global_qvalue_cutoff

The global qvalue cutoff. Default is 0.01.

qvalue_cutoff

local qvalue cutoff for library. Default is 0.01.

pg_qvalue_cutoff

local qvalue cutoff for protein groups. Run should be the same as filename. Default is 0.01.

useUniquePeptide

logical, if TRUE (default) removes peptides that are assigned to more than one protein. Each peptide is assumed to be unique to a single protein.

removeFewMeasurements

TRUE (default) will remove the features that have 1 or 2 measurements within each Run.

removeOxidationMpeptides

TRUE (default) will remove the peptides whose sequence includes oxidation (M).

removeProtein_with1Feature

TRUE will remove the proteins which have only 1 peptide and charge. Default is FALSE.

MBR

Whether the analysis was done with match between runs. Default is TRUE.

use_log_file

logical. If TRUE, information about data processing will be saved to a file.

append

logical. If TRUE, information about data processing will be added to an existing log file.

verbose

logical. If TRUE, information about data processing will be printed to the console.

log_file_path

character. Path to a file to which information about data processing will be saved. If not provided, such a file will be created automatically. If 'append = TRUE', has to be a valid path to a file.
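
The three q-value cutoffs above can be pictured as a row filter on the DIA-NN report. The following is only a minimal sketch on a made-up table, not the converter's actual code: it assumes that global_qvalue_cutoff, qvalue_cutoff and pg_qvalue_cutoff are compared against the Global.Q.Value, Q.Value and PG.Q.Value columns respectively, and that a row is kept only when it passes all three. Which columns the function really uses (for example when MBR = TRUE) is not stated on this page.

```r
# Toy stand-in for a DIA-NN report.tsv; all values are invented.
# ASSUMPTION: the mapping of each cutoff to a column is illustrative only.
report <- data.frame(
  Protein.Group  = c("P1", "P1", "P2", "P3"),
  Global.Q.Value = c(0.001, 0.020, 0.005, 0.002),
  Q.Value        = c(0.002, 0.001, 0.030, 0.004),
  PG.Q.Value     = c(0.003, 0.002, 0.001, 0.050),
  stringsAsFactors = FALSE
)

# Defaults taken from the Usage section.
global_qvalue_cutoff <- 0.01
qvalue_cutoff        <- 0.01
pg_qvalue_cutoff     <- 0.01

# Keep a row only if it is below every cutoff.
keep <- report$Global.Q.Value < global_qvalue_cutoff &
  report$Q.Value < qvalue_cutoff &
  report$PG.Q.Value < pg_qvalue_cutoff

filtered <- report[keep, , drop = FALSE]
print(filtered)
```

In this toy table only the first row survives, because each of the other rows exceeds exactly one of the cutoffs.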

Value

A list of one or two data.frames of class MSstatsTMT, named PTM and PROTEIN.
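
Because the PROTEIN element is only present when a global profiling run is supplied, downstream code may want to check for it before use. The sketch below uses a hand-built list as a stand-in for the return value; the column names inside the toy data.frame are hypothetical and do not reflect the converter's real output columns.

```r
# Hypothetical stand-in for the converter's return value when no
# global profiling run was given; the columns are invented.
result <- list(
  PTM = data.frame(ProteinName = c("P1_S10", "P2_T45"),
                   Intensity   = c(1200, 980))
)

ptm_data <- result$PTM

# Guard against the optional element being absent.
has_protein <- "PROTEIN" %in% names(result)
if (has_protein) {
  protein_data <- result$PROTEIN
} else {
  message("No PROTEIN table returned; only PTM-level data is available.")
}
```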

Examples

# ptm = read.csv("Phospho/report.tsv", sep="\t")
# protein = read.csv("Protein/report.tsv", sep="\t")
# annotation = read.csv("Phospho/annotation.csv")
# annotation_protein = read.csv("Protein/annotation.csv")

# DIANNtoMSstatsPTMFormat(ptm, annotation,
#                         protein, annotation_protein,
#                         fasta_path="fasta_file.fasta")
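
Before running the conversion, it can help to confirm that the annotation table carries the columns listed under the annotation argument. The following is a minimal pre-check on a made-up annotation table. It is a sketch written for this page, not part of MSstatsPTM, and the run names and conditions are invented.

```r
# Columns named in the description of the 'annotation' argument.
required_cols <- c("Run", "Fraction", "TechRepMixture", "Mixture",
                   "Channel", "BioReplicate", "Condition")

# Toy annotation table; every value here is illustrative.
annotation <- data.frame(
  Run            = c("run_01", "run_02"),
  Fraction       = c(1, 1),
  TechRepMixture = c(1, 1),
  Mixture        = c("mix1", "mix1"),
  Channel        = c("c1", "c1"),
  BioReplicate   = c("bio1", "bio2"),
  Condition      = c("control", "treated"),
  stringsAsFactors = FALSE
)

# Report any required column that is absent before calling the converter.
missing_cols <- setdiff(required_cols, colnames(annotation))
if (length(missing_cols) > 0) {
  stop("annotation is missing columns: ",
       paste(missing_cols, collapse = ", "))
}
```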


Vitek-Lab/MSstatsPTM documentation built on Sept. 26, 2024, 9:28 p.m.