MetaProfiler: Create MetaProfiler class object from protein-SIP results.

Description

Converts the results obtained from a protein-SIP experiment into a MetaProfiler class object. If result files are used, their extension must be either csv or tsv.

Usage

MetaProfiler(
  design,
  data,
  time_unit,
  time_zero = 0,
  isotope = "N",
  peptide_centric = T,
  light_peptide = T,
  as_percentage = T,
  incorporation_name,
  incorporation_columns = "auto",
  intensity_name,
  intensity_columns = "auto",
  labeling_ratio_name = NULL,
  labeling_ratio_columns = "auto",
  labeling_ratio_threshold = NA,
  score_name = NULL,
  higher_score_better = T,
  score_columns = "auto",
  score_threshold = NA,
  peptide_column_PTMs = "guess",
  peptide_column_no_PTMs = "guess",
  accession_column = "guess",
  accession_pattern = "[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}",
  compute_razor_protein = F,
  pep2pro = NULL,
  pep2pro_peptide_column = "guess",
  pep2pro_accession_column = "guess",
  pep2pro_accession_pattern = accession_pattern,
  pep2taxon = NULL,
  pep2taxon_peptide_column = "guess",
  rank_columns = "guess",
  pro2func = NULL,
  pro2func_accession_column = "guess",
  pro2func_accession_pattern = accession_pattern,
  function_columns = "guess",
  annotate_by = c("unmodified", "modified"),
  feature_type_column = NULL,
  feature_type = c("feature", "id"),
  progress = T,
  trace = T,
  ...
)

Arguments

design

A data.frame or data.table containing the experimental design. Each row must correspond to the file paths listed in data or in the file column of table design. Additionally, a timepoint column with the same name as time_unit must be present.
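As an illustration of the requirements above, the following is a minimal sketch of a design table in base R. The file names, the time points, and the column name "file" are assumptions made for this example; the checks only mirror the constraints stated in this page and are not the package's own validation.

```r
# Hypothetical file names and time points, for illustration only.
time_unit <- "day"
design <- data.frame(
  file = c("sample_d0.csv", "sample_d1.csv", "sample_d3.csv"),
  day = c(0, 1, 3),
  stringsAsFactors = FALSE
)

# The timepoint column must carry the same name as `time_unit`.
has_time_column <- time_unit %in% names(design)

# Result files must be comma- or tab-delimited (csv or tsv).
valid_extension <- grepl("\\.(csv|tsv)$", design$file, ignore.case = TRUE)
```

With a table like this, data could presumably be left as NULL, since the file paths are already part of design.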

data

Either a data.frame/data.table or a character vector of paths to the files containing the results from the protein-SIP experiment. Files must be tab or comma delimited. Can be set to NULL if table design contains a column with the file paths of the result files.

time_unit

The unit for the timepoints.

time_zero

Numeric value denoting the timepoint when the diet was switched. Defaults to zero.

isotope

A character value specifying the stable isotope. Should correspond to one of the elements in the periodic table.

peptide_centric

Logical value specifying if analysis is done at the peptide or protein level.

as_percentage

Should incorporation and labeling ratio values be presented as percentages?

incorporation_name

A character value denoting the name of the incorporation value. If incorporation_columns is set to "auto", then the function will look for columns that start with the character value followed by a unique identifier (i.e. "RIA 1", "RIA 2", "RIA 3", ...). See details for the difference between incorporation and labeling ratio.
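The "auto" lookup described above can be pictured with a small base R sketch. The helper find_auto_columns and the example headers are hypothetical; the actual matching rule used by the package may differ (for example in how it treats the separator).

```r
# Hypothetical helper: keep headers that start with `name` and continue
# with at least one more character (the unique identifier).
find_auto_columns <- function(column_names, name) {
  is_match <- startsWith(column_names, name) & nchar(column_names) > nchar(name)
  column_names[is_match]
}

headers <- c("Peptide", "RIA 1", "RIA 2", "INT 1", "RIA 3")
incorporation_columns <- find_auto_columns(headers, "RIA")
# incorporation_columns is c("RIA 1", "RIA 2", "RIA 3")
```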

incorporation_columns

A character vector detailing the names of the columns containing the incorporation values. Can be set to "auto" if incorporation_name is specified. Can also be set to NULL if no incorporation value was measured.

intensity_name

Similar to incorporation_name, but for the intensity values.

intensity_columns

Similar to incorporation_columns, but for the intensity columns. Can be set to NULL if labeling ratio was measured instead.

labeling_ratio_name

Similar to incorporation_name, but for the labeling ratio values. See details for the difference between incorporation and labeling ratio.

labeling_ratio_columns

Similar to incorporation_columns, but for the labeling ratio columns. If set to NULL, then labeling ratio is calculated from intensity.

labeling_ratio_threshold

A numeric value that specifies the minimum labeling ratio needed for the heavy peptide or protein to be kept for downstream analysis.

score_name

Similar to incorporation_name, but for the heavy peptide identification score values.

score_columns

Similar to incorporation_columns, but for the heavy peptide identification score columns. Can be set to NULL if score was not measured.

score_threshold

A numeric value that specifies the minimum or maximum score needed for the heavy peptide or protein to be kept for downstream analysis.
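Whether the threshold acts as a minimum or a maximum presumably depends on higher_score_better from the usage above. The sketch below shows one way such a filter could work; passes_threshold is a hypothetical helper, and the inclusive comparison and the treatment of NA as "no filtering" are assumptions rather than documented behavior.

```r
# Hypothetical helper illustrating a direction-aware score filter.
passes_threshold <- function(score, threshold, higher_score_better = TRUE) {
  # Assumption: an NA threshold means nothing is filtered out.
  if (is.na(threshold)) {
    return(rep(TRUE, length(score)))
  }
  if (higher_score_better) score >= threshold else score <= threshold
}

scores <- c(0.2, 0.7, 0.95)
keep_high <- passes_threshold(scores, 0.7, higher_score_better = TRUE)
keep_low <- passes_threshold(scores, 0.7, higher_score_better = FALSE)
keep_all <- passes_threshold(scores, NA)
```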

peptide_column_PTMs

A character value specifying the name of the column containing the peptide sequences with post-translational modifications (PTMs). If set to "guess", then the function will guess the column based on the headers and the characters contained in the column.

peptide_column_no_PTMs

Similar to peptide_column_PTMs, but for peptide sequences without post-translational modifications (PTMs). If set to "guess" and the function only detects the column containing PTMs, then the function will add a new column containing the sequences from peptide_column_PTMs with the modifications removed. This also applies when the value is set to NULL. When removing PTMs, the function assumes that the names of the modifications in the sequences follow UniProt convention.
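To make the idea of stripping modifications concrete, here is a rough sketch. It assumes that modifications are written as names in parentheses after the residue, for example M(Oxidation); that notation, the helper strip_ptms, and the example sequences are assumptions for illustration, and the exact convention the package expects is not shown on this page.

```r
# Hypothetical helper: remove parenthesized modification names, then drop
# any remaining characters that are not uppercase residue letters.
strip_ptms <- function(sequence) {
  without_mods <- gsub("\\([^()]*\\)", "", sequence)
  gsub("[^A-Z]", "", without_mods)
}

modified <- c("PEPM(Oxidation)TIDE", ".(Acetyl)AC(Carbamidomethyl)DK", "PLAINK")
unmodified <- strip_ptms(modified)
# unmodified is c("PEPMTIDE", "ACDK", "PLAINK")
```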

accession_column

A character value which specifies the name of the column containing the protein accession IDs. Can be set to NULL if pep2pro is provided.

accession_pattern

A regular expression string. Only used when accession_column is set to "guess". In this case, the function will look for columns containing strings that match accession_pattern. Defaults to the standard naming convention pattern for UniProt IDs.
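The default pattern can be tried out directly with base R regular expression functions. The candidate strings below are made up for illustration (the entry name in the third string is a placeholder), and this sketch does not reproduce how the function scores whole columns when guessing.

```r
# Default pattern copied from the usage section above.
accession_pattern <- "[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}"

candidates <- c("P12345", "A0A024R161", "sp|Q8N158|EXAMPLE_HUMAN", "PEPTIDEK")

# Which strings contain something shaped like a UniProt accession?
is_accession <- grepl(accession_pattern, candidates)

# Extract the first matching substring from each string that has one.
extracted <- regmatches(candidates, regexpr(accession_pattern, candidates))
# extracted is c("P12345", "A0A024R161", "Q8N158")
```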

compute_razor_protein

If set to TRUE, then the razor protein for each peptide will be computed. See details for information about razor proteins.
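As a rough illustration only: under the commonly used definition, a peptide shared between several proteins is assigned to the candidate protein that has the most peptides overall. The sketch below applies that rule to a made-up pep2pro table; the column names, the placeholder accessions, and the tie-breaking (first candidate wins) are assumptions, and the package's own computation may differ.

```r
# Hypothetical peptide-to-protein table; "DDK" is shared by P1 and P2.
pep2pro <- data.frame(
  peptide = c("AAK", "CCK", "DDK", "DDK", "EEK"),
  protein = c("P1", "P1", "P1", "P2", "P2"),
  stringsAsFactors = FALSE
)

# Number of distinct peptides mapped to each protein.
peptide_counts <- table(unique(pep2pro)$protein)

# For each peptide, pick the candidate protein with the most peptides.
razor <- vapply(
  split(pep2pro$protein, pep2pro$peptide),
  function(candidates) {
    counts <- peptide_counts[candidates]
    candidates[which.max(counts)]
  },
  character(1)
)
# razor[["DDK"]] is "P1" because P1 has three peptides and P2 has two.
```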

pep2pro

A character value for the filename or a data.frame/data.table with the peptide to protein table. Optional if accession_column is provided.

pep2pro_peptide_column

A character value for the peptide column in pep2pro. By default, the function will guess the column in the same way as peptide_column_PTMs or peptide_column_no_PTMs, depending on annotate_by.

pep2pro_accession_column

A character value for the protein accession IDs column in pep2pro. By default, the function will guess the column in the same way that accession_column is guessed.

pep2pro_accession_pattern

See accession_pattern.

pep2taxon

A character value for the filename or a data.frame/data.table with the peptide to taxon table.

pep2taxon_peptide_column

A character value for the peptide column in pep2taxon. By default, the function will guess the column in the same way as peptide_column_PTMs or peptide_column_no_PTMs, depending on annotate_by.

rank_columns

A character vector for the phylogenetic rank columns in pep2taxon. By default, the function will look for columns with the names 'superkingdom', 'kingdom', 'subkingdom', 'superphylum', 'phylum', 'subphylum', 'superclass', 'class', 'subclass', 'infraclass', 'superorder', 'order', 'suborder', 'infraorder', 'parvorder', 'superfamily', 'family', 'subfamily', 'tribe', 'subtribe', 'genus', 'subgenus', 'species group', 'species subgroup', 'species', 'subspecies', 'varietas', and 'forma'.

pro2func

A character value for the filename or a data.frame/data.table with the protein to function table.

pro2func_accession_column

A character value for the protein accession IDs column in pro2func. By default, the function will guess the column in the same way that accession_column is guessed.

pro2func_accession_pattern

See accession_pattern.

function_columns

A character vector for the protein function columns in pro2func. By default, the function will guess the columns by looking for columns starting with common functional annotation databases such as KEGG, BRITE, GO, COG, and NOG.

feature_type_column

A character value specifying the name of the column containing the type of feature the heavy peptide was identified/quantified from. If using results from MetaProSIP, the reference used for heavy peptide identification can be either from a feature (i.e. the group of peaks in the retention time and mass to charge ratio dimension belonging to a single peptide entity) or from a pseudo-feature (i.e. the theoretical position of the unlabeled feature using sequence information only).

feature_type

A character vector for the types of feature contained in feature_type_column. Defaults to "feature" and "id" (i.e. the pseudo-feature).

progress

If TRUE, then progress is printed.

trace

If TRUE, then the log is printed.

annotate_by_peptide_with_PTMs

A logical value that specifies whether functional and taxonomic annotation is done using peptide sequences with PTMs, TRUE, or without, FALSE. Only used when peptide_centric is set to TRUE.

Details

Labeling Ratio

labeling_ratio_name denotes the relative ratio between the intensity of the light peptide and the estimated intensity of the heavy peptide. When using an unlabeled protein spike-in, the labeling ratio (LR) measures the proportion of proteins that are produced using the heavy stable isotope relative to the protein at time_zero. By taking this measure over time, the rate at which newly synthesized proteins incorporate the stable isotope can be estimated. The labeling ratio is calculated using the equation:

LR = I_H / (I_L + I_H),

where I_L is the sum of the intensities of the unlabeled peptides or proteins and I_H is the sum of the intensities of the heavy peptides or proteins.

Elemental Flux

incorporation_name describes the elemental flux of the isotope, which is measured using the average proportion of the stable isotope incorporated in a peptide of interest. By characterizing the functional and taxonomic origin of the peptide, it gives insight into how and where the stable isotopic substrate is being converted into biomass. Thus, measuring the elemental flux can predict where this substrate is limited. Incorporation is calculated using the equation:

(H - M) / (F - M),

where H is the m/z position at the center of the predicted isotopic distribution of the heavy peptide, M is the m/z position of the monoisotopic peak of the light peptide, and F is the m/z position of the fully labeled peptide.
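The two formulas above can be worked through with a few made-up numbers. The helper functions labeling_ratio and incorporation below are hypothetical and only restate the equations; the intensities and m/z positions are invented, and the as_percentage argument simply multiplies by 100, which is an assumption about how the option of the same name behaves.

```r
# Labeling ratio: heavy intensity over total (light + heavy) intensity.
labeling_ratio <- function(light_intensity, heavy_intensity, as_percentage = TRUE) {
  ratio <- sum(heavy_intensity) / (sum(light_intensity) + sum(heavy_intensity))
  if (as_percentage) ratio * 100 else ratio
}

# Incorporation: position of the heavy distribution center (H) between the
# monoisotopic peak (M) and the fully labeled position (F).
incorporation <- function(heavy_center, monoisotopic, fully_labeled, as_percentage = TRUE) {
  value <- (heavy_center - monoisotopic) / (fully_labeled - monoisotopic)
  if (as_percentage) value * 100 else value
}

# Made-up values: light intensities sum to 4e6, heavy intensity is 1e6.
lr <- labeling_ratio(c(3e6, 1e6), 1e6)   # 1e6 / 5e6, i.e. 20 percent
ria <- incorporation(heavy_center = 503, monoisotopic = 500, fully_labeled = 506)   # 3 / 6, i.e. 50 percent
```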

Value

Returns an object of class MetaProfiler.


psmyth94/MetaProfiler documentation built on Nov. 30, 2020, 1 p.m.