aggregate_pep: Aggregate peptide abundances to protein abundances

View source: R/aggregate_pep.R

aggregate_pep    R Documentation

Aggregate peptide abundances to protein abundances

Description

Similar to the OpenMS module ProteinQuantifier, this function provides different methods to aggregate peptide intensities to their parent proteins. It is mainly intended for use with (raw) Diffacto results: a table of peptide intensities and covariation scores (weights) that can be used to filter peptides before aggregating them to protein abundances.
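The filter-then-aggregate idea described above can be sketched in base R. This is only an illustration and not the function's actual implementation: the table `pep`, its column names, and the use of `>=` for the cutoff are all assumptions made for the example.

```r
# Hypothetical peptide table; column names are illustrative only.
pep <- data.frame(
  protein = c("A", "A", "A", "B", "B"),
  peptide = c("p1", "p2", "p3", "p4", "p5"),
  weight  = c(0.9, 0.8, 0.2, 0.7, 0.4),
  ab1     = c(10, 20, 100, 5, 50),
  ab2     = c(12, 18, 3, 6, 40)
)

# Keep only peptides whose weight reaches the cutoff.
# (Whether aggregate_pep uses >= or > is an assumption here.)
kept <- pep[pep$weight >= 0.5, ]

# Sum the remaining peptide intensities per protein.
prot <- aggregate(cbind(ab1, ab2) ~ protein, data = kept, FUN = sum)
prot
```

Peptides p3 and p5 fall below the cutoff and no longer contribute, so each protein's abundance is built only from the peptides that covary well.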

Usage

aggregate_pep(
  data,
  sample_cols,
  protein_col,
  peptide_col,
  n_protein_col = NULL,
  split_ambiguous = FALSE,
  split_char = NULL,
  weight_col = NULL,
  weight_threshold = 0.5,
  method = "sum"
)

Arguments

data

the input data frame

sample_cols

(character) columns to be used for peptide aggregation

protein_col

(character) column containing unique protein IDs/names

peptide_col

(character) column containing unique peptide IDs/sequences

n_protein_col

(character) column containing the number of proteins annotated for each peptide. This column identifies ambiguous peptides whose abundance is shared among n proteins.

split_ambiguous

(logical) whether ambiguous protein groups should be split into individual proteins

split_char

(character) character by which to split protein groups

weight_col

(character) the column containing weights (covariation scores)

weight_threshold

(numeric) covariation score (weight) cutoff; Diffacto's default is 0.5

method

(character) aggregation method, one of ('sum', 'weightedsum', 'mean', 'weightedmean', 'wgeomean'). The default is 'sum'
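To make the interplay of split_ambiguous, split_char, n_protein_col and method more concrete, here is a base R sketch. It is not taken from the package source: the equal division of a shared peptide's abundance by n_protein and the definitions used for 'sum', 'mean' and 'weightedmean' are assumptions, and the table `pep` is hypothetical.

```r
# Hypothetical input: one row per peptide, ambiguous groups joined by "/".
pep <- data.frame(
  protein   = c("C", "C/D", "D"),
  n_protein = c(1, 2, 1),
  weight    = c(1, 0.5, 1),
  ab1       = c(10, 40, 6),
  stringsAsFactors = FALSE
)

# Split each protein group into its members (fixed = TRUE treats "/" literally)
# and repeat each peptide row once per member protein.
members <- strsplit(pep$protein, "/", fixed = TRUE)
long <- pep[rep(seq_len(nrow(pep)), lengths(members)), ]
long$protein <- unlist(members)

# Assumption: a shared peptide contributes an equal share to each protein.
long$ab1 <- long$ab1 / long$n_protein

# Candidate definitions for three of the methods (assumed, not confirmed).
agg <- do.call(rbind, lapply(split(long, long$protein), function(d) {
  data.frame(
    protein      = d$protein[1],
    sum          = sum(d$ab1),
    mean         = mean(d$ab1),
    weightedmean = weighted.mean(d$ab1, d$weight)
  )
}))
agg
```

Under these assumptions the "C/D" peptide contributes half of its intensity to each of C and D, and the weighted mean lets the lower-weighted shared peptide count for less than the unique ones.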

Value

a data frame with aggregated protein intensities, one protein per row

Examples

# load additional dependencies
library(dplyr)
library(tidyr)

# generate an example data frame; the seed makes the random abundances reproducible
set.seed(1)
df <- data.frame(
  protein = c("A", "B", "C", "C/D", "C/D/E", "E", "F", "G"),
  n_protein = c(1,1,1,2,3,1,1,1),
  weight = rep(1,8),
  peptide = letters[1:8],
  ab1 = sample(1:100, 8),
  ab2 = sample(1:100, 8),
  ab3 = sample(1:100, 8)
)

aggregate_pep(
  data = df,
  sample_cols = c("ab1", "ab2", "ab3"),
  protein_col = "protein",
  peptide_col = "peptide",
  n_protein_col = "n_protein",
  split_ambiguous = TRUE,
  split_char = "/",
  method = "sum"
)


m-jahn/R-tools documentation built on Feb. 5, 2023, 1:05 p.m.