knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
This vignette provides a walthrough of using epitopefindr
to process an
example dataset included with the package, epitopefindr::pairwise_viral_hits
.
The dataset has been previously explored by Monaco et al. Biorxiv and describes a set of phage-displayed peptide sequences for which the patient's antibody specificity demonstrated a statistically significant change when comparing the patient's data from the PhIP-seq assay.
epitopefindr
may be useful to look for sequence similarity among these enriched peptides as a step towards identifying critical sequence motifs.
We can see that pairwise_viral_hits
is a Biostrings::AAStringSet object consisting of the 159 pairwise enriched peptides. These peptides have been annotated with taxonomic and protein name information. This is our input to epitopefindr
.
library(epitopefindr) library(magrittr) epitopefindr::pairwise_viral_hits
epitopefindr::pairwise_viral_hits names(epitopefindr::pairwise_viral_hits)[1]
epitopefindr
The all-in-one wrapper for running epitopefindr
is epfind
. At its most basic,
epfind
takes two input parameters, (1) a set of named peptides, either as an AAStringSet object or as a directory path to a .fasta file that can be read by Biostrings::readAAStringSet, and (2) a path to which output files can be written.
More parameters for customization are documented at ?epfind
.
output.dir <- "AVARDA_vignette_data/" msa.path <- paste0(output.dir,"/msa.pdf") if(!file.exists(msa.path)){ epitopefindr::epfind(epitopefindr::pairwise_viral_hits, output.dir, e.thresh = 0.0001, min.groupsize = 6) unlink(paste0(output.dir,"intermediate_files"), recursive = TRUE, force = TRUE) }
We can print some of the sequence motifs from this output. Consider MSA 11 as an example. This 18 amino acid motif is represented in 7 of our input peptides. Many of these positions are fully conserved, but positions 3, 5, and 17 are only partially conserved. This illustrates that epitopefindr
makes use of BLAST to retain evidence for certain motifs that would be discarded with a gapped k-mer approach.
#convert pdf to png print.range <- 11 imgdir <- paste0(output.dir,"/png/") if(!dir.exists(imgdir)) dir.create(imgdir) imgnames <- paste0(imgdir,"msa-",print.range,".png") pdftools::pdf_convert(msa.path, format = "png", pages = print.range, filenames = imgnames, verbose = FALSE)
for(i in print.range){ #print new tab for each group cat(" \n### MSA", i, " \n") imgname <- paste0(imgdir,"msa-",i,".png") # iname[j] %>% readImage %>% display cat("![](",imgname,")",sep="") cat(" \n \n") }
The output of epitopefindr
can be fed into network graphing. Each vertex represents
a fragment of an input sequence. Each link represents an alignment between two peptide fragments. Group 11 is highlighted in blue.
aln <- data.table::fread(paste0(output.dir,"/finalAlignments.csv")) ver <- data.table::fread(paste0(output.dir,"/epitopeSummary.csv"), header = TRUE) g11 <- ver[,c("id","11")] g11 <- g11[!grep("NA$",g11$id),] g11$color <- ifelse(g11$`11` == "", "red","blue") set.seed(1) g <- igraph::graph_from_data_frame(aln[,1:2], directed = FALSE, vertices = g11) %>% (igraph::simplify) %>% plot(vertex.label = NA, vertex.size = 3, arrow.size = 0, vertex.color = g11$color)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.