

Most proteomics experiments require protein (peptide) separation and cleavage before these molecules can be analyzed or identified by mass spectrometry or other analytical tools.

`r BiocStyle::Biocpkg("cleaver")` allows in-silico cleavage of polypeptide sequences, e.g. to create theoretical mass spectrometry data.

The cleavage rules are taken from the ExPASy PeptideCutter tool [@peptidecutter].

Simple Usage

Loading the `r BiocStyle::Biocpkg("cleaver")` package:

library("cleaver")

Getting help and listing all available cleavage rules:

help("cleave")

Cleaving gastric juice peptide 1 (P01358) using trypsin:

## cleave it
cleave("LAAGKVEDSD", enzym="trypsin")
## get the cleavage ranges
cleavageRanges("LAAGKVEDSD", enzym="trypsin")
## get only cleavage sites
cleavageSites("LAAGKVEDSD", enzym="trypsin")
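
To make the three return values concrete, the following base R sketch reimplements a deliberately simplified trypsin rule (cleave C-terminal of K or R unless the next residue is P). It is not how cleaver works internally, and the rule cleaver takes from PeptideCutter has further exceptions. The helper names `simpleSites`, `simpleRanges` and `simpleCleave` are hypothetical and exist only for this illustration.

```r
## ASSUMPTION: simplified trypsin rule, for illustration only
## (cleave after K or R, but not if the next residue is P)
simpleSites <- function(x) {
  hits <- as.integer(gregexpr("[KR](?!P)", x, perl=TRUE)[[1]])
  ## drop the "no match" marker (-1) and a site at the very last residue
  hits[hits > 0 & hits < nchar(x)]
}

## start/end positions of the peptides implied by the cleavage sites
simpleRanges <- function(x) {
  sites <- simpleSites(x)
  cbind(start=c(1L, sites + 1L), end=c(sites, nchar(x)))
}

## the peptides themselves
simpleCleave <- function(x) {
  r <- simpleRanges(x)
  substring(x, r[, "start"], r[, "end"])
}

simpleSites("LAAGKVEDSD")   # 5
simpleRanges("LAAGKVEDSD")  # rows: 1-5 and 6-10
simpleCleave("LAAGKVEDSD")  # "LAAGK" "VEDSD"
```

The cleavage site is the position of the last residue before the cut, so each range ends at a site and the next range starts one position later.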

Sometimes cleavage is imperfect and the enzyme misses some cleavage positions:

## miss one cleavage position
cleave("LAAGKVEDSD", enzym="trypsin", missedCleavages=1)
cleavageRanges("LAAGKVEDSD", enzym="trypsin", missedCleavages=1)
## miss zero or one cleavage positions
cleave("LAAGKVEDSD", enzym="trypsin", missedCleavages=0:1)
cleavageRanges("LAAGKVEDSD", enzym="trypsin", missedCleavages=0:1)
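
One way to think about missed cleavages: a peptide with `m` missed cleavages spans `m + 1` consecutive fragments of the perfect digest. The base R sketch below builds the corresponding ranges from a given vector of cleavage sites. It is an illustration under that assumption, not cleaver's implementation; `missedCleavageRanges` is a hypothetical helper, and the single site at position 5 is taken from the trypsin example above.

```r
## ASSUMPTION: hypothetical helper, for illustration only
## len:    sequence length
## sites:  positions of the last residue before each cut
## missed: number(s) of missed cleavages to allow
missedCleavageRanges <- function(len, sites, missed=0L) {
  start <- c(1L, sites + 1L)
  end <- c(sites, len)
  n <- length(start)
  do.call(rbind, lapply(missed, function(m) {
    ## more missed cleavages than sites: nothing to report
    if (m >= n) return(NULL)
    i <- seq_len(n - m)
    ## join fragment i with the next m fragments
    cbind(start=start[i], end=end[i + m])
  }))
}

gaju <- "LAAGKVEDSD"
r <- missedCleavageRanges(nchar(gaju), sites=5L, missed=0:1)
peptides <- substring(gaju, r[, "start"], r[, "end"])
peptides  # "LAAGK" "VEDSD" "LAAGKVEDSD"
```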

Combining `r BiocStyle::Biocpkg("cleaver")` and `r BiocStyle::Biocpkg("Biostrings")` [@Biostrings]:

## create AAStringSet object
p <- AAStringSet(c(gaju="LAAGKVEDSD", pnm="AGEPKLDAGV"))

## cleave it
cleave(p, enzym="trypsin")
cleavageRanges(p, enzym="trypsin")
cleavageSites(p, enzym="trypsin")

Insulin \& Somatostatin Example

Downloading Insulin (P01308) and Somatostatin (P61278) sequences from the UniProt [@uniprot] database using `r BiocStyle::Biocpkg("")` [].

## load library

## select species Homo sapiens
up <-

## download sequences of Insulin/Somatostatin
s <- select(up,
    keys=c("P01308", "P61278"),

## fetch only sequences
sequences <- setNames(s$Sequence, s$Entry)

## remove whitespace
sequences <- gsub(pattern="[[:space:]]", replacement="", x=sequences)

Cleaving using Pepsin:

cleave(sequences, enzym="pepsin")

Isotopic Distribution Of Tryptic Digested Insulin

A common use case of in-silico cleavage is calculating the isotopic distribution of peptides that were enzymatically digested in the in-vitro experimental workflow. Here `r BiocStyle::Biocpkg("BRAIN")` [@BRAIN; @BRAIN2] is used to calculate the isotopic distribution of `r BiocStyle::Biocpkg("cleaver")`'s output. Please note that this is only a toy example; for instance, the relative intensities between peptides are not correct.

## load BRAIN library
library("BRAIN")

## cleave insulin
cleavedInsulin <- cleave(sequences[1], enzym="trypsin")[[1]]

## create empty plot area
plot(NA, xlim=c(150, 4300), ylim=c(0, 1),
     xlab="mass", ylab="relative intensity",
     main="tryptic digested insulin - isotopic distribution")

## loop through peptides
for (i in seq_along(cleavedInsulin)) {
  ## count C, H, N, O, S atoms in current peptide
  atoms <- BRAIN::getAtomsFromSeq(cleavedInsulin[[i]])
  ## calculate isotopic distribution
  d <- useBRAIN(atoms)
  ## draw peaks
  lines(d$masses, d$isoDistr, type="h", col=2)
}
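
The atom-counting step in the loop can be illustrated without any package. The base R sketch below sums the elemental composition of each residue, adds one water for the free termini, and derives the monoisotopic mass. It is a minimal sketch and not BRAIN's code; `residueAtoms`, `countAtoms`, `monoMass` and `monoisotopicMass` are hypothetical names, and only unmodified standard amino acids are assumed.

```r
## ASSUMPTION: hypothetical helpers, for illustration only
## elemental composition of amino acid residues (amino acid minus H2O)
residueAtoms <- rbind(
  G=c(2, 3, 1, 1, 0),  A=c(3, 5, 1, 1, 0),  S=c(3, 5, 1, 2, 0),
  P=c(5, 7, 1, 1, 0),  V=c(5, 9, 1, 1, 0),  T=c(4, 7, 1, 2, 0),
  C=c(3, 5, 1, 1, 1),  L=c(6, 11, 1, 1, 0), I=c(6, 11, 1, 1, 0),
  N=c(4, 6, 2, 2, 0),  D=c(4, 5, 1, 3, 0),  Q=c(5, 8, 2, 2, 0),
  K=c(6, 12, 2, 1, 0), E=c(5, 7, 1, 3, 0),  M=c(5, 9, 1, 1, 1),
  H=c(6, 7, 3, 1, 0),  F=c(9, 9, 1, 1, 0),  R=c(6, 12, 4, 1, 0),
  Y=c(9, 9, 1, 2, 0),  W=c(11, 10, 2, 1, 0))
colnames(residueAtoms) <- c("C", "H", "N", "O", "S")

## count C, H, N, O, S atoms of an unmodified peptide
countAtoms <- function(peptide) {
  aa <- strsplit(peptide, "")[[1]]
  stopifnot(all(aa %in% rownames(residueAtoms)))
  ## sum the residues, then add H2O for the N- and C-terminus
  colSums(residueAtoms[aa, , drop=FALSE]) + c(C=0, H=2, N=0, O=1, S=0)
}

## monoisotopic masses of the lightest isotopes
monoMass <- c(C=12, H=1.00782503, N=14.0030740, O=15.9949146, S=31.9720707)

monoisotopicMass <- function(atoms) {
  sum(atoms * monoMass[names(atoms)])
}

atoms <- countAtoms("LAAGK")
atoms                    # C=20, H=38, N=6, O=6, S=0
monoisotopicMass(atoms)  # about 458.285
```

The monoisotopic mass is only the position of the first peak; the full isotopic distribution additionally depends on the natural isotope abundances of each element.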

