SWATH2stats example script

Example R code showing the usage of the SWATH2stats package. The data processed is the publicly available dataset of S.pyogenes (Röst et al. 2014) (http://www.peptideatlas.org/PASS/PASS00289). The results file 'rawOpenSwathResults_1pcnt_only.tsv' can be found on PeptideAtlas (ftp://PASS00289@ftp.peptideatlas.org/../Spyogenes/results/). This is a R Markdown file, showing the result of processing this data. The lines shaded in grey represent the R code executed during this analysis.

The SWATH2stats package can be directly installed from Bioconductor using the commands below (http://bioconductor.org/packages/devel/bioc/html/SWATH2stats.html).

if (!require("BiocManager"))
    install.packages("BiocManager")
BiocManager::install("SWATH2stats")

Part 1: Loading and annotation

Load the SWATH-MS example data from the package, this is a reduced file in order to limit the file size of the package.

library(SWATH2stats)
library(data.table)
data('Spyogenes', package = 'SWATH2stats')

Alternatively the original file downloaded from the Peptide Atlas can be loaded from the working directory.

data <- data.frame(fread('rawOpenSwathResults_1pcnt_only.tsv', sep='\t', header=TRUE))

Extract the study design information from the file names. Alternatively, the study design table can be provided as an external table.

Study_design <- data.frame(Filename = unique(data$align_origfilename))
Study_design$Filename <- gsub('.*strep_align/(.*)_all_peakgroups.*', '\\1', Study_design$Filename)
Study_design$Condition <- gsub('(Strep.*)_Repl.*', '\\1', Study_design$Filename)
Study_design$BioReplicate <- gsub('.*Repl([[:digit:]])_.*', '\\1', Study_design$Filename)
Study_design$Run <- seq_len(nrow(Study_design))
head(Study_design)

The SWATH-MS data is annotated using the study design table.

data.annotated <- sample_annotation(data, Study_design, column_file = "align_origfilename")

Remove the decoy peptides for a subsequent inspection of the data.

data.annotated.nodecoy <- subset(data.annotated, decoy==FALSE)

\newpage

Part 2: Analyze correlation, variation and signal

Count the different analytes for the different injections.

count_analytes(data.annotated.nodecoy)

Plot the correlation of the signal intensity.

correlation <- plot_correlation_between_samples(data.annotated.nodecoy, column.values = 'Intensity')

Plot the correlation of the delta_rt, which is the deviation of the retention time from the expected retention time.

correlation <- plot_correlation_between_samples(data.annotated.nodecoy, column.values = 'delta_rt')

\newpage

Plot the variation of the signal across replicates.

variation <- plot_variation(data.annotated.nodecoy)
variation[[2]]

Plot the total variation versus variation within replicates.

variation_total <- plot_variation_vs_total(data.annotated.nodecoy)
variation_total[[2]]

Calculate the summed signal per peptide and protein across samples.

peptide_signal <- write_matrix_peptides(data.annotated.nodecoy)
protein_signal <- write_matrix_proteins(data.annotated.nodecoy)
head(protein_signal)

\newpage

Part 3: FDR estimation

Estimate the overall FDR across runs using a target decoy strategy.

par(mfrow = c(1, 3))
fdr_target_decoy <- assess_fdr_overall(data.annotated, n_range = 10, 
                                       FFT = 0.25, output = 'Rconsole')

According to this FDR estimation one would need to filter the data with a lower mscore threshold to reach an overall protein FDR of 5%.

mscore4protfdr(data, FFT = 0.25, fdr_target = 0.05)

Part 4: Filtering

Filter data for values that pass the 0.001 mscore criteria in at least two replicates of one condition.

data.filtered <- filter_mscore_condition(data.annotated, 0.001, n_replica = 2)

Select only the 10 peptides showing strongest signal per protein.

data.filtered2 <- filter_on_max_peptides(data.filtered, n_peptides = 10)

\newpage

Filter for proteins that are supported by at least two peptides.

data.filtered3 <- filter_on_min_peptides(data.filtered2, n_peptides = 2)

Part 5: Conversion

Convert the data into a transition-level format (one row per transition measured).

data.transition <- disaggregate(data.filtered3)

Convert the data into the format required by MSstats.

MSstats.input <- convert4MSstats(data.transition)
head(MSstats.input)

Convert the data into the format required by mapDIA.

mapDIA.input <- convert4mapDIA(data.transition)
head(mapDIA.input)

Convert the data into the format required by aLFQ.

aLFQ.input <- convert4aLFQ(data.transition)
head(aLFQ.input)

Session info on the R version and packages used.

sessionInfo()


Try the SWATH2stats package in your browser

Any scripts or data that you put into this service are public.

SWATH2stats documentation built on April 17, 2021, 6:01 p.m.