Introduction to Rtpca

Installation

Installation from Bioconductor

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("Rtpca")
Load the package into your R session:
library(Rtpca)

Introduction

Thermal proteome profiling (TPP) [@Savitski2014; @Mateus2020] is a mass spectrometry-based, proteome-wide implementation of the cellular thermal shift assay [@Molina2013]. It was originally developed to study drug (off-)target engagement. However, it was later observed that the melting profiles of interacting protein pairs are more similar than expected by chance, a phenomenon termed 'thermal proximity co-aggregation' (TPCA) [@Tan2018]. The R package Rtpca enables analysis of TPP datasets using the TPCA concept to study protein-protein interactions and protein complexes, and also allows testing for differential protein-protein interactions across conditions.
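To make the TPCA idea concrete, the following base R sketch compares toy melting curves. It is only an illustration under stated assumptions: the proteins A, B and C, their melting points and the use of Euclidean distance as the similarity measure are all made up for this example and are not taken from any dataset or from the package internals.

```r
# Toy melting curves: fraction of soluble protein at increasing temperatures.
# All values are hypothetical and serve only to illustrate the concept.
temps <- c(37, 41, 44, 47, 50, 53, 56, 59, 63, 67)
sigmoid <- function(temp, tm, slope) 1 / (1 + exp((temp - tm) / slope))

profiles <- rbind(
    A = sigmoid(temps, tm = 50.0, slope = 2.0),  # hypothetical complex member
    B = sigmoid(temps, tm = 50.5, slope = 2.1),  # hypothetical partner of A
    C = sigmoid(temps, tm = 58.0, slope = 1.5)   # hypothetical unrelated protein
)
colnames(profiles) <- temps

# Pairwise Euclidean distances between the melting curves
# (assumed similarity measure for this sketch).
dist_mat <- as.matrix(dist(profiles, method = "euclidean"))
round(dist_mat, 3)
```

In this toy setting the distance between A and B is much smaller than the distance of either protein to C, which is the kind of signal the TPCA concept exploits.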

This vignette presents only a minimal example. For a fuller analysis, feel free to check out the more realistic example.

Note: if you use Rtpca in published research, please cite:

Kurzawa, N., Mateus, A. & Savitski, M.M. (2020) Rtpca: an R package for differential thermal proximity coaggregation analysis. Bioinformatics, 10.1093/bioinformatics/btaa682

The Rtpca package workflow

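We also load the TPP package to illustrate how to import TPP data with that Bioconductor package and then pass it to the Rtpca functions. In addition, we attach dplyr explicitly, since filter(), arrange() and the %>% operator are used below.

library(TPP)
library(dplyr)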

Import thermal proteome profiling data using the TPP package

We load the dataset hdacTR_smallExample, which is part of the TPP package:

data("hdacTR_smallExample")

We filter hdacTR_data to a random subset of 300 proteins to speed up computations:

set.seed(123)
random_proteins <- sample(hdacTR_data[[1]]$gene_name, 300)
hdacTR_data_fil <- lapply(hdacTR_data, function(temp_df){
    filter(temp_df, gene_name %in% random_proteins)
})

We can now import our small example dataset using the import function from the TPP package:

trData <- tpptrImport(configTable = hdacTR_config, data = hdacTR_data_fil)

Performing thermal co-aggregation analysis with Rtpca

Then, we load string_ppi_df, a data frame that comes with the Rtpca package and annotates protein-protein interactions as obtained from StringDB [@Szklarczyk2019]:

data("string_ppi_df")
string_ppi_df

This table has been created from the human protein.links table downloaded from the StringDB website. It can serve as a template for users to create equivalent tables for other organisms.
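As a rough sketch of how such a table could be assembled for another organism, the base R code below starts from a small, made-up links table. The input column names (protein1, protein2), the gene names and the output columns x, y and pair are assumptions for illustration; only combined_score and the "A:B" pair format are taken from this vignette, so please compare the result with str(string_ppi_df) before using it.

```r
# Hypothetical links table in the spirit of a StringDB protein.links file,
# with identifiers already mapped to gene names (all values are made up).
links <- data.frame(
    protein1 = c("Kpnb1", "Hdac1", "Rpl3"),
    protein2 = c("Kpna6", "Hdac2", "Rpl4"),
    combined_score = c(990, 999, 820),
    stringsAsFactors = FALSE
)

# Order each pair alphabetically so that A:B and B:A get the same label.
x <- pmin(links$protein1, links$protein2)
y <- pmax(links$protein1, links$protein2)

# Column names x, y and pair are assumptions for this sketch.
custom_ppi_df <- data.frame(
    x = x,
    y = y,
    combined_score = links$combined_score,
    pair = paste(x, y, sep = ":"),
    stringsAsFactors = FALSE
)

# Keep each pair only once.
custom_ppi_df <- custom_ppi_df[!duplicated(custom_ppi_df$pair), ]
custom_ppi_df
```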

Run TPCA on data from a single condition

We can run TPCA for protein-protein interactions with the function runTPCA. Here, we first restrict the annotation to high-confidence interactions (combined score of at least 950):

string_ppi_cs_950_df <- string_ppi_df %>%
    filter(combined_score >= 950)

vehTPCA <- runTPCA(
    objList = trData,
    ppiAnno = string_ppi_cs_950_df
)

Note: your data does not need to be in the format used by the TPP package (ExpressionSet). You can also supply the function with a list of matrices or data frames (in the case of data frames, you additionally need to indicate which column contains the protein or gene names).
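The note above can be illustrated with a small sketch of what a list of matrices might look like. The layout (proteins in rows with row names, temperatures in columns, one matrix per replicate) and all values are assumptions made for this example; the call to runTPCA is left as a comment because the annotation table my_ppi_df is hypothetical.

```r
set.seed(1)
temps <- c(37, 41, 44, 47, 50, 53, 56, 59, 63, 67)
proteins <- paste0("P", 1:5)  # hypothetical protein identifiers

# Build one toy replicate: sigmoid melting curves plus a little noise.
make_replicate <- function(noise_sd) {
    tm <- seq(48, 56, length.out = length(proteins))
    mat <- t(sapply(tm, function(m) 1 / (1 + exp((temps - m) / 2))))
    mat <- mat + rnorm(length(mat), sd = noise_sd)
    dimnames(mat) <- list(proteins, as.character(temps))
    mat
}

mat_list <- list(
    rep1 = make_replicate(noise_sd = 0.02),
    rep2 = make_replicate(noise_sd = 0.02)
)
str(mat_list)

# With Rtpca attached, such a list could be passed in place of trData, e.g.
# runTPCA(objList = mat_list, ppiAnno = my_ppi_df)
# where my_ppi_df is a hypothetical annotation table covering P1 to P5.
```

For data frames, the column holding the protein or gene names has to be named in the call; see ?runTPCA for the corresponding argument.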

We can also run TPCA to test for coaggregation of protein complexes. For this purpose, we can load a data frame that annotates proteins to protein complexes curated by @Ori2016:

data("ori_et_al_complexes_df")
ori_et_al_complexes_df

Then, we can invoke runTPCA with the complex annotation:

vehComplexTPCA <- runTPCA(
    objList = trData,
    complexAnno = ori_et_al_complexes_df,
    minCount = 2
)

We can plot a ROC curve for how well our data captures protein-protein interactions:

plotPPiRoc(vehTPCA, computeAUC = TRUE)

And we can also plot a ROC curve for how well our data captures protein complexes:

plotComplexRoc(vehComplexTPCA, computeAUC = TRUE)
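For readers unfamiliar with ROC analysis in this setting, here is a minimal base R sketch of the underlying idea: protein pairs are ranked by the distance between their melting curves, and annotated interactions should be enriched among the closest pairs. The distances and labels are made up, and the code is not meant to reproduce how the plotting functions above compute their curves.

```r
# Toy data: distances between melting curves for six protein pairs and
# whether each pair is annotated as interacting (all values are made up).
pair_dist <- c(0.10, 0.15, 0.22, 0.40, 0.80, 0.95)
is_ppi    <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE)

# Sweep a distance threshold from small to large: every pair closer than
# the threshold counts as a predicted interaction.
ord <- order(pair_dist)
tpr <- c(0, cumsum(is_ppi[ord]) / sum(is_ppi))     # true positive rate
fpr <- c(0, cumsum(!is_ppi[ord]) / sum(!is_ppi))   # false positive rate

# Area under the ROC curve by the trapezoidal rule.
auc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)
auc
```

An AUC close to 1 would mean that annotated pairs are almost always closer together than non-annotated pairs, while 0.5 corresponds to random ranking. The vectors fpr and tpr can be passed to plot() to draw the toy curve.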

Run differential TPCA on two conditions

To test for protein-protein interactions that change significantly between two conditions, we can use the function runDiffTPCA as illustrated below:

diffTPCA <- 
    runDiffTPCA(
        objList = trData[1:2], 
        contrastList = trData[3:4],
        ctrlCondName = "DMSO",
        contrastCondName = "Panobinostat",
        ppiAnno = string_ppi_cs_950_df)

We can then plot a volcano plot to visualize the results:

plotDiffTpcaVolcano(
    diffTPCA,
    setXLim = TRUE,
    xlimit = c(-0.5, 0.5))

The underlying result table can be inspected like this:

diffTpcaResultTable(diffTPCA) %>%
    arrange(p_value) %>%
    dplyr::select(pair, rssC1_rssC2, f_stat, p_value, p_adj) %>%
    head()
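The difference between the p_value and p_adj columns can be illustrated with base R's p.adjust. This is only a sketch: the p-values are made up, and the Benjamini-Hochberg method is an assumption about the kind of correction applied, not a statement about the package internals.

```r
# Hypothetical raw p-values for five protein pairs (made up).
p_value <- c(0.03, 0.04, 0.045, 0.5, 0.9)

# Benjamini-Hochberg adjustment (assumed correction method for this sketch).
p_adj <- p.adjust(p_value, method = "BH")
data.frame(p_value, p_adj)

# Number of pairs below 0.05 before and after adjustment.
c(raw = sum(p_value < 0.05), adjusted = sum(p_adj < 0.05))
```

In this toy case three pairs have raw p-values below 0.05, but none remains below 0.05 after adjustment, which mirrors the situation where nominal hits do not survive correction for the number of tests performed.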

We can see that none of these interactions is significant considering the multiple comparisons we have performed. Still, we can look at the melting curves of pairs such as "KPNA6:KPNB1" by invoking:

plotPPiProfiles(diffTPCA, pair = c("KPNA6", "KPNB1"))

We can see that both proteins do seem to coaggregate, but that the mild difference between the treatment and control conditions is likely due to technical rather than biological reasons.
When significant pairs are found, inspecting hits from the differential analysis in this way is recommended, to validate that they do coaggregate in one condition and that the weaker coaggregation in the other condition is based on reliable signal.

Additional remarks

As mentioned above, this vignette includes only a minimal example. For a fuller analysis, have a look at the more extensive example.

sessionInfo()

References



Rtpca documentation built on Nov. 8, 2020, 7:44 p.m.