specL automatic report

Requirements

In a first step, the peptide identification result is generated by a standard shotgun proteomics experiment and has to be processed using the BiblioSpec software [@pmid18428681].

For generating the ion library, the Bioconductor package specL is used. The workflow is described in [@pmid25712692].

The following R package has to be installed on the compute machine.

library(specL)

This file can be rendered using the following code snippet.

library(rmarkdown)
library(BiocStyle)
# copy the report template shipped with specL to a temporary file
report_file <- tempfile(fileext = '.Rmd')
file.copy(system.file("doc", "report.Rmd",
                      package = "specL"),
          report_file)
# render the copy to an HTML document
rmarkdown::render(report_file,
                  output_format = 'html_document',
                  output_file = '/tmp/report_specL.html')

Input

Parameter

If no INPUT is defined, the report uses the example data of the Bioconductor package specL and the following default parameters.

if(!exists("INPUT")){
  INPUT <- list(FASTA_FILE 
      = system.file("extdata", "SP201602-specL.fasta.gz",
                    package = "specL"),
    BLIB_FILTERED_FILE 
      = system.file("extdata", "peptideStd.sqlite",
                    package = "specL"),
    BLIB_REDUNDANT_FILE 
      = system.file("extdata", "peptideStd_redundant.sqlite",
                    package = "specL"),
    MIN_IONS = 5,
    MAX_IONS = 6,
    MZ_ERROR = 0.05,
    MASCOTSCORECUTOFF = 17,
    FRAGMENTIONMZRANGE = c(300, 1250),
    FRAGMENTIONRANGE = c(5, 200),
    NORMRTPEPTIDES = specL::iRTpeptides,
    OUTPUT_LIBRARY_FILE = tempfile(fileext ='.csv'),
    RDATA_LIBRARY_FILE = tempfile(fileext ='.RData'),
    ANNOTATE = TRUE
    )
} 
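To run the report with other settings, INPUT can be defined before the report is rendered, so that the `if(!exists("INPUT"))` guard skips the defaults. The following is a minimal sketch using `modifyList()` from the utils package. The `defaults` list is an abbreviated copy of the values above, and the override values are hypothetical, chosen only for illustration.

```r
# Abbreviated copy of the defaults shown above (file entries omitted).
defaults <- list(
  MIN_IONS = 5,
  MAX_IONS = 6,
  MZ_ERROR = 0.05,
  MASCOTSCORECUTOFF = 17,
  FRAGMENTIONMZRANGE = c(300, 1250),
  FRAGMENTIONRANGE = c(5, 200),
  ANNOTATE = TRUE
)

# Hypothetical overrides: keep more ions and tighten the mass tolerance.
overrides <- list(MAX_IONS = 10, MZ_ERROR = 0.02)

# modifyList() replaces the matching entries and keeps all others.
INPUT <- utils::modifyList(defaults, overrides)

str(INPUT[c("MIN_IONS", "MAX_IONS", "MZ_ERROR")])
```

For an actual run, the list would also need the file entries (FASTA_FILE, BLIB_FILTERED_FILE, BLIB_REDUNDANT_FILE, and the output files) as well as NORMRTPEPTIDES.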

The library generation workflow was performed using the following parameters:

  cat(
  " MASCOTSCORECUTOFF = ", INPUT$MASCOTSCORECUTOFF, "\n",
  " BLIB_FILTERED_FILE = ", INPUT$BLIB_FILTERED_FILE, "\n",
  " BLIB_REDUNDANT_FILE = ", INPUT$BLIB_REDUNDANT_FILE, "\n",
  " MZ_ERROR = ", INPUT$MZ_ERROR, "\n",
  " FRAGMENTIONMZRANGE = ", INPUT$FRAGMENTIONMZRANGE, "\n",
  " FRAGMENTIONRANGE = ", INPUT$FRAGMENTIONRANGE, "\n",
  " FASTA_FILE = ", INPUT$FASTA_FILE, "\n",
  " MAX_IONS = ", INPUT$MAX_IONS, "\n",
  " MIN_IONS = ", INPUT$MIN_IONS, "\n"
  )
library(knitr)
# kable(t(as.data.frame(INPUT)))
# collapse every character or numeric parameter into one string; drop all others
ii <- lapply(INPUT, function(x) {
  if (typeof(x) %in% c("character", "double")) {
    paste(x, collapse = ', ')
  } else {
    NULL
  }
})

parameter <- as.data.frame(unlist(ii))
names(parameter) <- 'parameter.values'
kable(parameter, caption = 'INPUT parameters used')

Define the fragment ions of interest

The following R helper function is used for composing the in-silico fragment ions using the CRAN package protViz.

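fragmentIonFunction_specL <- function (b, y) {
  # monoisotopic masses; Oxygen and Nitrogen are defined but not used below
  Hydrogen <- 1.007825
  Oxygen <- 15.994915
  Nitrogen <- 14.003074
  # singly charged b and y ions, as passed in
  b1_ <- (b )
  y1_ <- (y )
  # doubly charged b and y ions
  b2_ <- (b + Hydrogen) / 2
  y2_ <- (y + Hydrogen) / 2
  return( cbind(b1_, y1_, b2_, y2_) )
}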
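To illustrate what the helper returns, the sketch below applies it to a few made-up singly charged b and y m/z values. The numeric inputs are hypothetical. In the workflow the b and y vectors are supplied by the fragment ion computation.

```r
# Same helper as above, repeated so that the example is self-contained.
fragmentIonFunction_specL <- function(b, y) {
  Hydrogen <- 1.007825
  b1_ <- b
  y1_ <- y
  b2_ <- (b + Hydrogen) / 2
  y2_ <- (y + Hydrogen) / 2
  cbind(b1_, y1_, b2_, y2_)
}

# Hypothetical singly charged fragment m/z values at three fragment positions.
b <- c(114.0913, 227.1754, 324.2282)
y <- c(147.1128, 244.1656, 357.2496)

ions <- fragmentIonFunction_specL(b, y)
print(dim(ions))       # one row per fragment position, one column per ion type
print(colnames(ions))
```

Further ion types could be added as extra columns in the same way, as long as the function still returns one column per ion type.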

Read the sqlite files

BLIB_FILTERED <- read.bibliospec(INPUT$BLIB_FILTERED_FILE) 

summary(BLIB_FILTERED)
BLIB_REDUNDANT <- read.bibliospec(INPUT$BLIB_REDUNDANT_FILE) 
summary(BLIB_REDUNDANT)

Protein (re)-annotation

After the PSMs have been processed with BiblioSpec, the protein information is lost; it is restored here from the FASTA file.

The read.fasta function is provided by the CRAN package seqinr.

if(INPUT$ANNOTATE){
  # read the protein sequences as plain amino acid strings
  FASTA <- seqinr::read.fasta(INPUT$FASTA_FILE,
                              seqtype = "AA",
                              as.string = TRUE)

  BLIB_FILTERED <- annotate.protein_id(BLIB_FILTERED,
                                       fasta = FASTA)
}
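The idea behind the re-annotation can be sketched as a substring search of each peptide sequence in the protein sequences. This is an illustration in base R only, not the implementation of annotate.protein_id. The protein identifiers, sequences, peptides, and the helper `assign_proteins` are all hypothetical.

```r
# Hypothetical protein sequences, keyed by identifier.
proteins <- c(
  P1 = "MKWVTFISLLFLFSSAYSR",
  P2 = "GVFRRDAHKSEVAHR",
  P3 = "AAAMKWVTFK"
)

# Hypothetical peptide sequences as they might come from the PSMs.
peptides <- c("WVTF", "DAHK", "QQQQ")

# For one peptide, collect the ids of all proteins that contain it.
assign_proteins <- function(peptide, proteins) {
  hits <- grepl(peptide, proteins, fixed = TRUE)
  names(proteins)[hits]
}

annotation <- lapply(peptides, assign_proteins, proteins = proteins)
names(annotation) <- peptides
print(annotation)
```

A peptide may match several proteins or none at all, which is why the result is kept as a list rather than a single identifier per peptide.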

Peptides used for RT normalization

The following peptides are used for the RT normalization. The last column indicates by FALSE/TRUE whether a peptide is included in the data. The rows are ordered by their RT values.

library(knitr)
incl <-  INPUT$NORMRTPEPTIDES$peptide %in% sapply(BLIB_REDUNDANT, function(x){x$peptideSequence})
INPUT$NORMRTPEPTIDES$included <- incl

if (sum(incl) > 0){
  res <- INPUT$NORMRTPEPTIDES[order(INPUT$NORMRTPEPTIDES$rt),]
  # row.names(res) <- 1:nrow(res)
  kable(res, caption='peptides used for RT normalization.')
}
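How reference peptides can be used to normalize retention times may be sketched with a simple linear fit. This is an assumption-laden illustration in base R: the reference values and measured retention times are made up, and the normalization performed inside specL may differ in detail.

```r
# Hypothetical reference values and measured retention times (minutes)
# for five normalization peptides detected in one run.
irt_ref <- c(-25, 0, 20, 40, 70)
rt_meas <- 20 + 0.5 * irt_ref + c(0.1, -0.2, 0.05, 0.15, -0.1)

# Fit a straight line that maps measured RT onto the reference scale.
fit <- lm(irt_ref ~ rt_meas)

# Apply the fitted line to the retention times of other peptides.
new_rt <- data.frame(rt_meas = c(15, 30, 45))
rt_normalized <- predict(fit, newdata = new_rt)
print(round(rt_normalized, 2))
```

The more normalization peptides are found in the data, the better such a fit is constrained, which is why the table above reports which peptides are included.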

Generate the ion library

specLibrary <- specL::genSwathIonLib(
  data = BLIB_FILTERED,
  data.fit = BLIB_REDUNDANT,
  max.mZ.Da.error = INPUT$MZ_ERROR,
  topN = INPUT$MAX_IONS,
  fragmentIonMzRange = INPUT$FRAGMENTIONMZRANGE,
  fragmentIonRange = INPUT$FRAGMENTIONRANGE,
  fragmentIonFUN = fragmentIonFunction_specL,
  mascotIonScoreCutOFF = INPUT$MASCOTSCORECUTOFF,
  iRT = INPUT$NORMRTPEPTIDES
  )
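The effect of the m/z range and topN parameters can be sketched on a toy set of fragment ions. The code below is a base R illustration under stated assumptions, not the implementation of genSwathIonLib: the fragment table is made up, the range bounds are treated as inclusive, and `top_n` is a hypothetical value chosen smaller than MAX_IONS for brevity.

```r
# Hypothetical fragment ions of one precursor: m/z and measured intensity.
fragments <- data.frame(
  mz        = c(250.1, 310.2, 455.3, 600.4, 720.5, 980.6, 1100.7, 1300.8),
  intensity = c(900,   120,   850,   400,   50,    700,   300,    950)
)

mz_range <- c(300, 1250)  # same values as FRAGMENTIONMZRANGE above
top_n    <- 3             # hypothetical; the report uses MAX_IONS

# Step 1: keep the fragments inside the m/z window.
in_range <- fragments[fragments$mz >= mz_range[1] &
                      fragments$mz <= mz_range[2], ]

# Step 2: order by decreasing intensity and keep the top_n most intense.
selected <- head(in_range[order(in_range$intensity, decreasing = TRUE), ], top_n)
print(selected)
```

Note that the two most intense fragments overall are discarded in this toy example because they lie outside the m/z window.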

Library Generation Summary

The total number of PSMs with a Mascot e-value < 0.05 in your search is `r length(BLIB_REDUNDANT)`. The number of unique precursors is `r length(BLIB_FILTERED)`. The size of the generated ion library is `r length(specLibrary@ionlibrary)`. That means that `r round(length(specLibrary@ionlibrary)/length(BLIB_FILTERED) * 100, 2)` % of the unique precursors fulfilled the filtering criteria.

summary(specLibrary)

In the following two code snippets the first element of the ion library is displayed:

#  slotNames(specLibrary@ionlibrary[[1]])
specLibrary@ionlibrary[[1]]
plot(specLibrary@ionlibrary[[1]])
plot(specLibrary)

The call plot(specLibrary) plots an overview of the whole ion library. Please note that the iRT peptides used for the normalization of RT do not have to be included in the resulting specLibrary.

Output

write.spectronaut(specLibrary, file =  INPUT$OUTPUT_LIBRARY_FILE)
save(specLibrary, file = INPUT$RDATA_LIBRARY_FILE)

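The two calls above write the ion library in Spectronaut format to INPUT$OUTPUT_LIBRARY_FILE and save the specLibrary object to INPUT$RDATA_LIBRARY_FILE.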
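A saved RData file can be read back in a later session with load(). The following sketch uses a small stand-in list instead of a real specLibrary object, so the object name `toy_library` and its contents are hypothetical.

```r
# Stand-in object; in the report this would be the specLibrary object.
toy_library <- list(peptide = "ELVISLIVESK", q1 = 614.86, ions = c(300.1, 450.2))

rdata_file <- tempfile(fileext = ".RData")
save(toy_library, file = rdata_file)

# load() restores objects under their original names; loading into a
# separate environment avoids overwriting objects in the workspace.
restored <- new.env()
loaded_names <- load(rdata_file, envir = restored)
print(loaded_names)
```

load() returns the names of the restored objects, which is a convenient way to check what a given RData file contains.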

Remarks

For questions and suggestions for improvement, please contact the authors of the specL package.

This report Rmarkdown file has been written by WEW and is maintained by CP.

Session info

Here is the output of sessionInfo() on the system on which this document was compiled:

sessionInfo()

References




specL documentation built on Nov. 8, 2020, 7:55 p.m.