mergePep: Merge peptide table(s) into one
In qzhang503/proteoQ: Processing and Informatic Analysis of Mass Spectrometric Data

mergePep

R Documentation

Merge peptide table(s) into one

Description

mergePep merges individual peptide table(s), TMTset1_LCMSinj1_Peptide_N.txt, TMTset1_LCMSinj2_Peptide_N.txt etc., into one interim Peptide.txt. The log2FC values in the interim result are median-centered, i.e., shifted so that the median under each sample is zero. The utility is typically applied after the conversion of PSMs to peptides via PSM2Pep and is required even for an experiment with a single multiplex TMT set and a single LC/MS series.
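
As an illustration of what median centering means, here is a minimal base-R sketch. It is not the proteoQ implementation; the toy matrix and its values are hypothetical, and only the column names mimic the `log2_R...` convention described on this page.

```r
# Hypothetical toy data: rows are peptides, columns are samples (log2FC values).
# matrix() fills column-wise, so the first four values belong to log2_R126.
log2fc <- matrix(
  c(0.5, 1.0, 1.5, NA,
    -0.2, 0.3, 0.8, 1.3),
  ncol = 2,
  dimnames = list(NULL, c("log2_R126", "log2_R127"))
)

# Median of each sample (column), ignoring missing values.
col_medians <- apply(log2fc, 2, median, na.rm = TRUE)

# Subtract each column's median so that every sample is centered at zero.
centered <- sweep(log2fc, 2, col_medians, FUN = "-")

# The medians of the centered columns are now zero (up to floating-point error).
apply(centered, 2, median, na.rm = TRUE)
```

Missing values stay missing after centering; they are only excluded when the medians are computed.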

Usage

mergePep(
  use_duppeps = TRUE,
  mbr_ret_tol = NULL,
  duppeps_repair = c("majority", "denovo"),
  plot_log2FC_cv = TRUE,
  cut_points = Inf,
  rm_allna = FALSE,
  imp_refs = FALSE,
  omit_single_lfq = FALSE,
  ret_sd_tol = Inf,
  rm_ret_outliers = FALSE,
  ...
)

Arguments

`use_duppeps`	Logical; if TRUE, re-assigns double/multiple dipping peptide sequences to the most likely proteins by majority votes.
`mbr_ret_tol`	The tolerance in MBR retention time in seconds. The default is to match the setting in normPSM.
`duppeps_repair`	Not currently used (or only with `majority`). Character string; the method of repairing double-dipping peptide sequences upon data pooling. For instance, the same sequence of PEPTIDE may be assigned to protein accession PROT_ACC1 in data set 1 and PROT_ACC2 in data set 2. With `denovo`, the peptide-to-protein association is re-established afresh. With `majority` (listed first in Usage), a majority rule is applied for the re-assignments.
`plot_log2FC_cv`	Logical; if TRUE, the distributions of the CV of peptide `log2FC` will be plotted. The default is TRUE.
`cut_points`	A named, numeric vector that defines the cut points (knots) for the median-centering of `log2FC` by sections. For example, at `cut_points = c(mean_lint = seq(4, 7, .5))`, `log2FC` will be binned according to the intervals of `-Inf, 4, 4.5, ..., 7, Inf` under column `mean_lint` (mean log10 intensity) in the input data. The default is `cut_points = Inf`, or equivalently `-Inf`, where the `log2FC` under each sample will be median-centered as one piece. See also `prnHist` for data binning in histogram visualization.
`rm_allna`	Logical; if TRUE, removes data rows that are exclusively NA across ratio columns of `log2_R126` etc. The setting also applies to `log2_R000` in LFQ.
`imp_refs`	Logical; impute missing references or not.
`omit_single_lfq`	Deprecated. Logical; if TRUE, omits LFQ entries with single measured values across all samples. The default is FALSE.
`ret_sd_tol`	Deprecated. Numeric; the tolerance in the variance of retention time (w.r.t. measures in seconds). The thresholding applies to TMT data. The default is `Inf`. Depending on the LC/MS gradient settings, a value of, e.g., 150 might be suitable.
`rm_ret_outliers`	Deprecated. Logical; if TRUE, removes peptide entries with outlying retention times across samples and/or LC/MS series.
`...`	`filter_`: Variable argument statements for the row filtration of data against the column keys in individual peptide tables of `TMTset1_LCMSinj1_Peptide_N.txt, TMTset1_LCMSinj2_Peptide_N.txt`, etc. The variable argument statements should be in the following format: each statement contains a list of logical expression(s). The `lhs` needs to start with `filter_`. The logical condition(s) at the `rhs` need to be enclosed in `exprs` with round parentheses. For example, `pep_len` is a column key present in `Mascot` peptide tables of `TMTset1_LCMSinj1_Peptide_N.txt`, `TMTset1_LCMSinj2_Peptide_N.txt` etc. The statement `filter_peps_at = exprs(pep_len <= 50)` will remove peptide entries with `pep_len > 50`. See also `normPSM`.
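
The sectional median-centering described under `cut_points` can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the toy data frame `pep`, the helper names `knots` and `bin`, and the use of left-closed intervals via `findInterval` are all hypothetical choices made for the example.

```r
# Hypothetical toy peptide table for a single sample.
pep <- data.frame(
  mean_lint = c(3.5, 3.8, 4.2, 4.4, 5.1, 6.8, 7.2, 7.9),
  log2_R126 = c(0.9, 1.1, 0.4, 0.6, -0.2, 0.1, -0.8, -0.4)
)

# Knots as in `cut_points = c(mean_lint = seq(4, 7, .5))`; the implied
# sections are -Inf, 4, 4.5, ..., 7, Inf.
knots <- seq(4, 7, .5)

# Assign each peptide to a section. findInterval() uses left-closed
# intervals [a, b); whether proteoQ closes intervals on the left or the
# right is not stated on this page, so this is an assumption.
pep$bin <- findInterval(pep$mean_lint, knots)

# Median-center log2FC separately within each section.
bin_medians <- ave(pep$log2_R126, pep$bin,
                   FUN = function(x) median(x, na.rm = TRUE))
pep$N_log2_R126 <- pep$log2_R126 - bin_medians

# For comparison: centering as one piece (the `cut_points = Inf` behavior).
pep$one_piece <- pep$log2_R126 - median(pep$log2_R126, na.rm = TRUE)

pep
```

In this toy data the log2FC drifts downward as `mean_lint` increases. Centering as one piece keeps that intensity-dependent drift, whereas centering by sections removes it within each bin.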

Details

In the interim output file, "Peptide.txt", values under columns log2_R... are logarithmic ratios at base 2 relative to the reference(s) within each multiplex TMT set, or to the row means within each plex if no reference(s) are present. Values under columns N_log2_R... are median-centered log2_R... without scaling normalization. Values under columns Z_log2_R... are N_log2_R... with additional scaling normalization. Values under columns I... are reporter-ion or LFQ intensities before normalization. Values under columns N_I... are normalized I.... Values under columns sd_log2_R... are the standard deviations of the log2FC of proteins from ascribing peptides.
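
The two ways of forming `log2_R...` values mentioned above (against a reference, or against row means when no reference is present) can be sketched as below. This is a hedged illustration only: the intensity matrix is hypothetical, channel 126 is arbitrarily treated as the reference, and taking the row means over raw intensities (rather than over log-transformed values) is an assumption, not something this page specifies.

```r
# Hypothetical reporter-ion intensities for one TMT plex (rows: peptides).
inten <- cbind(
  I126 = c(100, 400),
  I127 = c(200, 400),
  I128 = c(400, 1600)
)

# Case 1: a reference channel is present (assumed here to be 126).
# Dividing a matrix by a vector of length nrow() recycles down the columns,
# so each row is divided by its own reference intensity.
ref <- inten[, "I126"]
log2_r_ref <- log2(inten / ref)
colnames(log2_r_ref) <- sub("^I", "log2_R", colnames(inten))

# Case 2: no reference; ratios are taken against the row means
# (assumption: means of the raw intensities within the plex).
log2_r_mean <- log2(inten / rowMeans(inten, na.rm = TRUE))
colnames(log2_r_mean) <- sub("^I", "log2_R", colnames(inten))

log2_r_ref
log2_r_mean
```

With a reference, the reference column itself is zero by construction. Without one, the back-transformed ratios in each row average to one.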

Description of the column keys in the output:
system.file("extdata", "peptide_keys.txt", package = "proteoQ")

The peptide counts in individual peptide tables, TMTset1_LCMSinj1_Peptide_N.txt etc., may be fewer than the entries indicated under the prot_n_pep column after the peptide removals/cleanups using purgePSM.

Value

The primary output is in .../Peptide/Peptide.txt.

See Also

Data normalization
normPSM for extended examples in PSM data normalization
PSM2Pep for extended examples in PSM to peptide summarization
mergePep for extended examples in peptide data merging
standPep for extended examples in peptide data normalization
Pep2Prn for extended examples in peptide to protein summarization
standPrn for extended examples in protein data normalization
purgePSM and purgePep for extended examples in data purging
pepHist and prnHist for extended examples in histogram visualization
extract_raws and extract_psm_raws for extracting MS file names

Variable arguments of 'filter_...'
contain_str, contain_chars_in, not_contain_str, not_contain_chars_in, start_with_str, end_with_str, start_with_chars_in and ends_with_chars_in for data subsetting by character strings

Missing values
pepImp and prnImp for missing value imputation

Informatics
pepSig and prnSig for significance tests
pepVol and prnVol for volcano plot visualization
prnGSPA for gene set enrichment analysis by protein significance pVals
gspaMap for mapping GSPA to volcano plot visualization
prnGSPAHM for heat map and network visualization of GSPA results
prnGSVA for gene set variance analysis
prnGSEA for data preparation for online GSEA
pepMDS and prnMDS for MDS visualization
pepPCA and prnPCA for PCA visualization
pepLDA and prnLDA for LDA visualization
pepHM and prnHM for heat map visualization
pepCorr_logFC, prnCorr_logFC, pepCorr_logInt and prnCorr_logInt for correlation plots
anal_prnTrend and plot_prnTrend for trend analysis and visualization
anal_pepNMF, anal_prnNMF, plot_pepNMFCon, plot_prnNMFCon, plot_pepNMFCoef, plot_prnNMFCoef and plot_metaNMF for NMF analysis and visualization

Custom databases
Uni2Entrez for lookups between UniProt accessions and Entrez IDs
Ref2Entrez for lookups among RefSeq accessions, gene names and Entrez IDs
prepGO for gene ontology
prepMSig for molecular signatures
prepString and anal_prnString for STRING-DB

Column keys in PSM, peptide and protein outputs
system.file("extdata", "psm_keys.txt", package = "proteoQ")
system.file("extdata", "peptide_keys.txt", package = "proteoQ")
system.file("extdata", "protein_keys.txt", package = "proteoQ")

Examples


# ===================================
# Merge peptide data
# ===================================

## NOTE: requires the brief working example in `?load_expts`

# everything included
mergePep()

# row filtrations against column keys in `TMTset1_LCMSinj1_Peptide_N.txt`...
# (keeps human peptides with no more than 50 residues)
mergePep(
  filter_peps_by_sp = exprs(species == "human", pep_len <= 50)
)

# alignment of data by segments
mergePep(cut_points = c(mean_lint = seq(4, 7, .5)))

# alignment of data by empirical protein abundance;
# `10^prot_icover - 1` is comparable to emPAI
mergePep(cut_points = c(prot_icover = seq(0, 1, .25)))

qzhang503/proteoQ documentation built on April 13, 2025, 8:31 a.m.
