A complete workflow for the analysis of AP-MS data, using a two-stage Poisson model (TSPM) and a pre- and postprocessing framework.
counts |
matrix of spectral counts, proteins in rows and samples in columns. |
baittab |
a character string specifying the path to the baittable file. See Details. |
norm |
method to normalize the data. If |
Filter |
logical value, whether filtering of the data is applied (Default |
filter.method |
method to use for filtering, must be one of |
var.cutoff |
percentile (between 0 and 1) or |
limit |
minimal number of expected true interaction proteins in the data. |
adj.method |
method to adjust p-values for multiple testing. |
The baittable is a tab- or space-delimited file, as required for SAINT, consisting of three columns: IP name, bait or control name, and an indicator for bait or control experiment (T = bait purification, C = control).
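As an illustration of the three-column layout described above, the following sketch writes a small baittable to a temporary file. The sample and bait names are hypothetical, and the file is assumed to have no header row, as is usual for SAINT bait files.

```r
# Hypothetical IP names; in practice they must match the sample (column)
# names of the spectral count matrix.
baittab <- data.frame(
  ip   = c("bait_rep1", "bait_rep2", "ctrl_rep1", "ctrl_rep2"),
  bait = c("BaitA", "BaitA", "Control", "Control"),
  type = c("T", "T", "C", "C"),   # T = bait purification, C = control
  stringsAsFactors = FALSE
)

# Write a tab-delimited file without header, quotes or row names
# (assumption: no header row, as in SAINT input files).
baitfile <- tempfile(fileext = ".txt")
write.table(baittab, file = baitfile, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

# Read it back as character columns so that "T" is not parsed as TRUE.
check <- read.table(baitfile, sep = "\t", colClasses = "character")
print(check)
```

The resulting path (`baitfile`) is the kind of value the `baittab` argument expects.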
Pre-processing comprises normalization and filtering of the data:
For normalization, one of five different methods can be chosen, each adapted from microarray and RNA-seq analysis to AP-MS data. For further details see norm.inttable.
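To give an idea of what one of these methods does, here is a minimal base-R sketch of quantile normalization. It is not the code used by norm.inttable, whose details (for example, handling of ties or rounding back to counts) may differ; the function name `quantile_normalize` and the toy matrix are made up for illustration.

```r
# Sketch of quantile normalization: every sample (column) is forced onto
# the same reference distribution, the mean of the sorted columns.
quantile_normalize <- function(mat) {
  ranks  <- apply(mat, 2, rank, ties.method = "first")  # rank within sample
  sorted <- apply(mat, 2, sort)                         # sort each sample
  ref    <- rowMeans(sorted)                            # reference distribution
  out    <- apply(ranks, 2, function(r) ref[r])         # map ranks to reference
  dimnames(out) <- dimnames(mat)
  out
}

# Toy spectral counts: 4 proteins (rows) x 3 samples (columns).
toy <- matrix(c(5, 2, 3, 4,
                4, 1, 4, 2,
                3, 4, 6, 8),
              nrow = 4,
              dimnames = list(paste0("prot", 1:4), paste0("s", 1:3)))
normalized <- quantile_normalize(toy)
print(normalized)
```

After normalization all columns contain the same set of values, only assigned to different proteins.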
The filter consists of a biological filter and a statistical variance filter and aims to remove obvious contaminants from further analysis.
If filter.method="noVar", only the biological filter is applied.
Both filters are applied if filter.method="IQR", in which case the variance is measured by the interquartile range, or if filter.method="overallVar", in which case the variance is calculated across all samples.
The var.cutoff parameter defines the fraction of proteins with the lowest overall variance that are considered contaminants and removed.
With var.cutoff=NA (default), the cutoff is the mean of the shortest interval containing 50% of the data. Alternatively, a quantile can be set as the cutoff; e.g. a cutoff of 0.5 removes the 50% of the data showing the smallest overall variance or IQR. See also varFilter.
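The default cutoff, the mean of the shortest interval containing half of the data, can be sketched in base R as follows. This is only an illustration of the idea: the helper name `shorth_sketch` is hypothetical, and the window size and tie handling are simplifying assumptions that may not match what varFilter does internally.

```r
# Mean of the shortest interval covering (just over) half of the values.
shorth_sketch <- function(x) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  k <- floor(n / 2) + 1                      # points per window (assumption)
  starts <- seq_len(n - k + 1)               # all windows of k sorted points
  widths <- x[starts + k - 1] - x[starts]    # width of each window
  s <- which.min(widths)                     # first shortest window on ties
  mean(x[s:(s + k - 1)])
}

# Toy per-protein variances: most values cluster near 2, two are outliers.
vars <- c(1, 2, 2.1, 2.2, 2.3, 10, 20)
cutoff <- shorth_sketch(vars)
print(cutoff)
```

Because the estimate follows the densest part of the distribution, it is not pulled upwards by a few highly variable proteins the way a plain mean would be.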
The parameter limit ensures that the number of proteins remaining after filtering stays above the number of expected true interaction proteins.
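The statistical part of the filter can be pictured with the following base-R sketch. It is not the package's implementation: the function name `variance_filter_sketch` is hypothetical, the biological filter is omitted, and the way `limit` is enforced here (falling back to the `limit` most variable proteins) is an assumption made only for illustration.

```r
variance_filter_sketch <- function(mat, method = c("IQR", "overallVar"),
                                   var.cutoff = 0.5, limit = 0) {
  method <- match.arg(method)
  # Per-protein spread across samples: IQR or plain variance.
  spread <- if (method == "IQR") apply(mat, 1, IQR) else apply(mat, 1, var)
  # Keep proteins whose spread exceeds the chosen quantile.
  keep <- spread > quantile(spread, probs = var.cutoff)
  # Assumption: if too few proteins survive, keep the `limit` most variable.
  if (sum(keep) < limit) {
    keep <- rank(-spread, ties.method = "first") <= limit
  }
  mat[keep, , drop = FALSE]
}

# Toy counts: 6 proteins (rows) x 4 samples (columns).
toy <- rbind(p1 = c(1, 1, 1, 1),
             p2 = c(0, 2, 0, 2),
             p3 = c(0, 10, 0, 10),
             p4 = c(5, 5, 5, 6),
             p5 = c(0, 0, 20, 20),
             p6 = c(3, 3, 3, 3))

half   <- variance_filter_sketch(toy, method = "overallVar", var.cutoff = 0.5)
strict <- variance_filter_sketch(toy, method = "overallVar",
                                 var.cutoff = 0.9, limit = 3)
print(rownames(half))
print(rownames(strict))
```

Proteins with nearly constant counts across bait and control samples (such as `p1` and `p6`) are the ones treated as likely contaminants and dropped.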
For postprocessing, two procedures for multiple-testing adjustment are provided: the Benjamini-Hochberg procedure ("BH"), which controls the false discovery rate (FDR), and a permutation approach coupled to the Westfall & Young ("WY") algorithm, which controls the family-wise error rate (FWER).
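The two adjustment strategies can be contrasted on toy data. The BH step uses base R's `p.adjust`. The permutation step is a simplified single-step max-statistic sketch in the spirit of Westfall & Young; the function name `maxT_sketch` is hypothetical, the permutation scores are invented, and the package's actual permutation scheme and algorithm variant may differ.

```r
# Benjamini-Hochberg adjustment of raw p-values (FDR control).
p_raw <- c(0.001, 0.008, 0.039, 0.041, 0.60)
p_bh  <- p.adjust(p_raw, method = "BH")
print(p_bh)

# Single-step max-statistic adjustment (FWER control), as a sketch:
# for each protein, count the permutation runs whose maximum score over
# all proteins reaches the observed score.
maxT_sketch <- function(observed, perm_scores) {
  # perm_scores: proteins in rows, permutation runs in columns
  perm_max <- apply(perm_scores, 2, max)
  counter  <- vapply(observed, function(s) sum(perm_max >= s), integer(1))
  list(counter = counter, adjusted.p = counter / ncol(perm_scores))
}

observed    <- c(10, 2)                    # toy observed scores, 2 proteins
perm_scores <- matrix(c(1, 3,
                        4, 1,
                        2, 2,
                        11, 0), nrow = 2)  # 4 toy permutation runs
wy <- maxT_sketch(observed, perm_scores)
print(wy)
```

Comparing against the maximum over all proteins is what makes the permutation approach control the family-wise error rate, and it is typically more conservative than BH.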
A list containing the following components:
id |
name of the interaction protein |
log.fold.change |
a vector containing the estimated log fold changes for each protein |
pvalues |
a vector containing the raw p-values for each protein, evaluating the interaction |
padj |
a vector containing the p-values after adjusting for multiple testing using the method of Benjamini-Hochberg |
LRT |
a vector of Likelihood Ratio statistics, scoring the interaction potential of each protein |
dispersion |
a vector of yes/no indicating overdispersion for each protein |
adjusted.p |
a vector containing the adjusted p-values using the permutation-based approach of Westfall&Young |
counter |
a vector containing the number of exceeding permutation scores using the permutation-based approach of Westfall&Young |
matrix1 |
(filtered) (normalized) matrix of spectral counts |
matrix2 |
permutation matrix of scores, permutation runs in columns and proteins in rows |
Martina Fischer
Fischer M, Zilkenat S, Gerlach R, Wagner S, Renard BY. Pre- and Postprocessing for Affinity Purification Mass Spectrometry Data: More Reliable Detection of Interaction Candidates. Journal of Proteome Research 2014.
Auer PL, Doerge RW. A two-stage Poisson model for testing RNA-Seq data. Statistical Applications in Genetics and Molecular Biology 2011.
library(apmsWAPP)

# input data
intfile <- system.file("extdata", "inttable.txt", package="apmsWAPP")
counts <- int2mat(read.table(intfile))
baitfile <- system.file("extdata", "baittab.txt", package="apmsWAPP")

# TSPM with quantile normalization and filtering
tspm.quaF <- tspm_apms(counts, baitfile,
                       norm="quantile", Filter=TRUE,
                       filter.method="overallVar",
                       var.cutoff=0.1, adj.method="WY")

# Results:
# for adjustment with BH:
cat("Number of proteins with BH-adjusted p-value < 0.05: ",
    length(which(tspm.quaF[[1]]$padj < 0.05)), "\n")
# for adjustment with WY:
cat("Number of proteins with WY-adjusted p-value < 0.05: ",
    length(which(tspm.quaF[[2]][,2] < 0.05)), "\n")