In-silico cleavage of polypeptides using the cleaver package

library("cleaver")
library("UniProt.ws")
library("BRAIN")

Introduction

Most proteomics experiments need protein (peptide) separation and cleavage procedures before these molecules can be analyzed or identified by mass spectrometry or other analytical tools.

The cleaver package allows in-silico cleavage of polypeptide sequences, e.g. to create theoretical mass spectrometry data.

The cleavage rules are taken from the ExPASy PeptideCutter tool [@peptidecutter].

Simple Usage

Loading the cleaver package:

library("cleaver")

Getting help and listing all available cleavage rules:

help("cleave")

Cleaving Gastric juice peptide 1 (P01358) with trypsin:

## cleave it
cleave("LAAGKVEDSD", enzym="trypsin")
## get the cleavage ranges
cleavageRanges("LAAGKVEDSD", enzym="trypsin")
## get only cleavage sites
cleavageSites("LAAGKVEDSD", enzym="trypsin")
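
To make the relation between cleavage sites, cleavage ranges and the resulting peptides concrete, here is a minimal base R sketch. It assumes a simplified trypsin rule (cleave after K or R unless the next residue is P); the rule that cleaver takes from PeptideCutter has additional exceptions, so this is an illustration and not the package's implementation. The variable names are chosen for this example only.

```r
## simplified trypsin rule (assumption): cleave after K or R, but not before P
gaju <- "LAAGKVEDSD"

## positions of residues after which the chain is cut
sites <- as.integer(gregexpr("[KR](?=[^P])", gaju, perl = TRUE)[[1]])
sites <- sites[sites > 0]  # gregexpr returns -1 if there is no match

## each peptide starts right after the previous site and ends at the next one
starts <- c(1L, sites + 1L)
ends <- c(sites, nchar(gaju))
peptides <- substring(gaju, starts, ends)

print(sites)                               # 5
print(data.frame(start = starts, end = ends))
print(peptides)                            # "LAAGK" "VEDSD"
```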

Sometimes cleavage is not perfect and the enzyme misses some cleavage positions:

## miss one cleavage position
cleave("LAAGKVEDSD", enzym="trypsin", missedCleavages=1)
cleavageRanges("LAAGKVEDSD", enzym="trypsin", missedCleavages=1)
## miss zero or one cleavage positions
cleave("LAAGKVEDSD", enzym="trypsin", missedCleavages=0:1)
cleavageRanges("LAAGKVEDSD", enzym="trypsin", missedCleavages=0:1)
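
The idea behind missed cleavages can be sketched in base R: a peptide with `m` missed cleavages spans `m + 1` neighbouring fragments of the complete digest. The helper `join_missed` below is hypothetical and written for this illustration; it takes the cleavage sites as given and is not how cleaver implements `missedCleavages`.

```r
## hypothetical helper: build peptides for a given number of missed cleavages
## from known cleavage sites (positions after which the chain is cut)
join_missed <- function(sequence, sites, missed) {
  starts <- c(1L, sites + 1L)
  ends <- c(sites, nchar(sequence))
  n <- length(starts)
  unlist(lapply(missed, function(m) {
    ## more missed cleavages than sites: no peptide possible
    if (m >= n) return(character(0))
    ## fragment i is extended up to the end of fragment i + m
    i <- seq_len(n - m)
    substring(sequence, starts[i], ends[i + m])
  }))
}

## one tryptic site after position 5 (the K in LAAGK)
print(join_missed("LAAGKVEDSD", sites = 5L, missed = 1))
print(join_missed("LAAGKVEDSD", sites = 5L, missed = 0:1))
```

With a single cleavage site, missing it leaves the whole sequence intact, which is why `missed = 1` yields only one peptide here.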

Combining cleaver and Biostrings [@Biostrings]:

## create AAStringSet object
p <- AAStringSet(c(gaju="LAAGKVEDSD", pnm="AGEPKLDAGV"))

## cleave it
cleave(p, enzym="trypsin")
cleavageRanges(p, enzym="trypsin")
cleavageSites(p, enzym="trypsin")

Insulin & Somatostatin Example

Downloading the Insulin (P01308) and Somatostatin (P61278) sequences from the UniProt [@uniprot] database using UniProt.ws [@UniProt.ws]:

## load UniProt.ws library
library("UniProt.ws")

## select species Homo sapiens
UniProt.ws <- UniProt.ws(taxId=9606)

## download sequences of Insulin/Somatostatin
s <- select(UniProt.ws, keys=c("P01308", "P61278"), columns=c("SEQUENCE"))

## fetch only sequences
sequences <- setNames(s$SEQUENCE, s$UNIPROTKB)

## remove whitespaces
sequences <- gsub(pattern="[[:space:]]", replacement="", x=sequences)

Cleaving with pepsin:

cleave(sequences, enzym="pepsin")

Isotopic Distribution Of Tryptic Digested Insulin

A common use case of in-silico cleavage is calculating the isotopic distribution of peptides that were enzymatically digested in the in-vitro experimental workflow. Here BRAIN [@BRAIN; @BRAIN2] is used to calculate the isotopic distribution of cleaver's output. Please note that this is only a toy example; e.g. the relation of intensity values between peptides is not correct.

## load BRAIN library
library("BRAIN")

## cleave insulin
cleavedInsulin <- cleave(sequences[1], enzym="trypsin")[[1]]

## create empty plot area
plot(NA, xlim=c(150, 4300), ylim=c(0, 1),
     xlab="mass", ylab="relative intensity",
     main="tryptic digested insulin - isotopic distribution")

## loop through peptides
for (i in seq_along(cleavedInsulin)) {
  ## count C, H, N, O, S atoms in current peptide
  atoms <- BRAIN::getAtomsFromSeq(cleavedInsulin[[i]])
  ## calculate isotopic distribution
  d <- useBRAIN(atoms)
  ## draw peaks
  lines(d$masses, d$isoDistr, type="h", col=2)
}
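
As a simpler kind of theoretical mass spectrometry data, the monoisotopic mass of each cleaved peptide can be computed in base R without BRAIN. The sketch below sums standard monoisotopic residue masses and adds one water for the free termini; the helper `monoisotopic_mass` is hypothetical, and the example peptides are the tryptic fragments of Gastric juice peptide 1 rather than insulin. It ignores modifications and charge states.

```r
## monoisotopic residue masses in Da (amino acid minus water)
residue_mass <- c(
  G = 57.02146, A = 71.03711, S = 87.03203, P = 97.05276, V = 99.06841,
  T = 101.04768, C = 103.00919, L = 113.08406, I = 113.08406, N = 114.04293,
  D = 115.02694, Q = 128.05858, K = 128.09496, E = 129.04259, M = 131.04049,
  H = 137.05891, F = 147.06841, R = 156.10111, Y = 163.06333, W = 186.07931
)
water <- 18.01056

## hypothetical helper: neutral monoisotopic mass of unmodified peptides
monoisotopic_mass <- function(peptides) {
  vapply(strsplit(peptides, ""), function(residues) {
    sum(residue_mass[residues]) + water
  }, numeric(1))
}

masses <- monoisotopic_mass(c("LAAGK", "VEDSD"))
print(round(masses, 4))
```

Such a mass roughly corresponds to the first (lightest) peak of the isotopic distribution of an unmodified, neutral peptide.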

Session Information

sessionInfo()

References




cleaver documentation built on Nov. 8, 2020, 7:20 p.m.