This package provides tools for visualizing results from top-down proteomics studies of prefractionated biological samples. It is based on novel visualizations developed to evaluate the PEPPI-MS prefractionation method. It is also suitable for visualizing samples fractionated using GELFrEE and for comparing biological or technical replicates.
A Shiny web application for these tools is also available. See the Shiny app section below for more information.
UpSet plots are a novel method for visualizing intersecting sets, developed by Lex, Gehlenborg, et al. and implemented here using the excellent UpSetR package. They are more readable than Euler and Venn diagrams, especially when visualizing large numbers of sets. The PEPPI-MS paper introduced the use of UpSet plots to show the occurrences and intersections of proteoform identifications across multiple molecular weight-based fractions.
Intersection degree plots show the intersection degrees of proteoform identifications, i.e. the percentage of identifications occurring in one fraction, two fractions, etc.
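To make the idea of an intersection degree concrete, the following base R sketch counts how many fractions each identifier occurs in and converts the counts to percentages. The accession lists are made up for illustration, and this is not necessarily how make_intersection_degree_plot() computes its values internally.

```r
# Hypothetical identifications per fraction (illustrative accessions only)
fractions <- list(
  frac1 = c("P0A7V0", "P0A6F5", "P0ABT2"),
  frac2 = c("P0A6F5", "P0ABT2", "P0A9Q7"),
  frac3 = c("P0ABT2")
)

# Intersection degree: number of fractions in which each unique identifier occurs
all_ids <- unlist(lapply(fractions, unique), use.names = FALSE)
degree <- as.vector(table(all_ids))

# Percentage of identifiers found in exactly one fraction, two fractions, etc.
degree_pct <- round(100 * prop.table(table(degree)), 1)
print(degree_pct)
```

Here two of the four identifiers occur in a single fraction, one occurs in two fractions, and one occurs in all three.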
Heatmaps visualize the distribution of proteoform identifications by molecular weight. They can be drawn in a vertical orientation for comparison to SDS-PAGE gels.
Waffle plots visualize the quantity and subcellular localization of proteoform identifications by fraction.
Install from GitHub:
remotes::install_github("davidsbutcher/viztools")
Input files for make_UpSet_plot() and make_intersection_degree_plot() should have column names corresponding to fraction/replicate designations and row values corresponding to unique protein/proteoform identifiers, e.g. UniProt accession numbers or CTDP proteoform record numbers.
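As a rough illustration of this layout, the sketch below builds such a table in base R, with one column per fraction and identifiers as row values. The accessions and the names frac1 to frac3 are hypothetical, and padding shorter columns with NA is an assumption that mirrors what a spreadsheet import would typically produce.

```r
# Hypothetical identifications per fraction (illustrative accessions only)
fractions <- list(
  frac1 = c("P0A7V0", "P0A6F5", "P0ABT2"),
  frac2 = c("P0A6F5", "P0ABT2"),
  frac3 = c("P0ABT2")
)

# Pad shorter columns with NA so that all columns have equal length
# (assumption: blank spreadsheet cells are read in as NA)
n_rows <- max(lengths(fractions))
upset_input <- as.data.frame(
  lapply(fractions, function(ids) c(ids, rep(NA_character_, n_rows - length(ids)))),
  stringsAsFactors = FALSE
)
print(upset_input)

# With viztools installed, the table could then be passed on:
# make_UpSet_plot(upset_input)
```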
An input file for make_heatmap() should have a column providing molecular weights and a column providing the fraction/replicate number. Default column names are “mass” and “fraction” but can be specified in the function arguments.
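A minimal sketch of a table in this shape, using the default column names, might look as follows. The masses are invented and assumed to be in Da, and the 10 kDa binning is only there to show the kind of per-fraction counts a heatmap summarizes; the bin width used by make_heatmap() may differ.

```r
# Hypothetical heatmap input with the default column names
heatmap_input <- data.frame(
  mass = c(4500, 8200, 12100, 15800, 22400, 9700),  # assumed to be in Da
  fraction = c(1, 1, 2, 2, 3, 1)
)

# Count identifications per fraction and 10 kDa mass bin (illustrative binning)
mass_bin <- cut(heatmap_input$mass, breaks = seq(0, 30000, by = 10000), dig.lab = 5)
counts <- table(fraction = heatmap_input$fraction, mass_bin = mass_bin)
print(counts)

# With viztools installed:
# make_heatmap(heatmap_input)
```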
An input file for waffle_iron() should have a column providing the fraction/replicate number and columns providing subcellular localization counts. Column names other than “fraction” are used for legend labels, so I recommend naming them “Cytosol”, “Membrane”, etc.
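The following sketch shows one way such a table could be put together in base R. The counts and the “Periplasm” column are made up for illustration. The percentage calculation is only a quick check of the data and is not part of viztools.

```r
# Hypothetical waffle plot input: one row per fraction, one column per localization
waffle_input <- data.frame(
  fraction = c(1, 2, 3),
  Cytosol = c(120, 95, 60),
  Membrane = c(30, 45, 80),
  Periplasm = c(10, 20, 15)
)

# Every column other than "fraction" becomes a legend label
loc_cols <- setdiff(names(waffle_input), "fraction")

# Percentage of each localization within a fraction, as a sanity check
shares <- round(100 * prop.table(as.matrix(waffle_input[loc_cols]), margin = 1), 1)
rownames(shares) <- waffle_input$fraction
print(shares)

# With viztools installed:
# waffle_iron(waffle_input)
```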
Example input files for each visualization type can be found in the extdata folder in the package directory.
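One way to locate that folder from within R is base R's system.file(), sketched below. The file names inside the folder are not listed here, so the snippet simply prints whatever it finds.

```r
# Locate the extdata folder of the installed package;
# system.file() returns "" if the package or folder is not found
extdata_dir <- system.file("extdata", package = "viztools")

if (nzchar(extdata_dir)) {
  print(list.files(extdata_dir))
} else {
  message("viztools does not appear to be installed")
}
```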
Load an input spreadsheet file as an R object using an appropriate function, e.g. readxl::read_xlsx() for XLSX files or readr::read_csv() for CSV files. Then, pass the object to the appropriate visualization function:
# Read an XLSX file (forward slashes avoid backslash escape errors in R strings)
df <- readxl::read_xlsx("C:/Users/YourName/Documents/protein_data.xlsx")
# Read a CSV file
df <- readr::read_csv("C:/Users/YourName/Documents/protein_data.csv")
# Use the data frame as the argument for a visualization function
make_UpSet_plot(df)
Plots created using viztools can be saved by setting the argument savePDF = TRUE:
make_UpSet_plot(df, savePDF = TRUE)
With the exception of UpSet plots, they can also be saved using the ggplot2::ggsave() function:
make_heatmap(df)
ggplot2::ggsave(
"heatmap.png",
dpi = 300,
height = 5,
width = 8
)
A GUI web application is currently hosted at shinyapps.io. Input spreadsheet files should be formatted as specified above.
viztools utilizes the package UpSetR for generating UpSet plots and waffle for generating Waffle plots. Other visualizations are generated using ggplot2. Additional functions are imported from dplyr, tibble, purrr, glue, tidyr, magrittr, assertthat, and scales.
Package developed by David S. Butcher and licensed under CC BY 4.0. Imported packages are licensed separately.