library("cleaver")
library("UniProt.ws")
library("BRAIN")

Introduction

Most proteomics experiments need protein (peptide) separation and cleavage procedures before these molecules could be analyzed or identified by mass spectrometry or other analytical tools.

r BiocStyle::Biocpkg("cleaver") allows in-silico cleavage of polypeptide sequences to e.g. create theoretical mass spectrometry data.

The cleavage rules are taken from the ExPASy PeptideCutter tool [@peptidecutter].

Simple Usage

Loading the r BiocStyle::Biocpkg("cleaver") package:

library("cleaver")

Getting help and list all available cleavage rules:

help("cleave")

Cleaving of Gastric juice peptide 1 (P01358) using Trypsin:

## cleave it
cleave("LAAGKVEDSD", enzym="trypsin")
## get the cleavage ranges
cleavageRanges("LAAGKVEDSD", enzym="trypsin")
## get only cleavage sites
cleavageSites("LAAGKVEDSD", enzym="trypsin")

Sometimes cleavage is not perfect and the enzym miss some cleavage positions:

## miss one cleavage position
cleave("LAAGKVEDSD", enzym="trypsin", missedCleavages=1)
cleavageRanges("LAAGKVEDSD", enzym="trypsin", missedCleavages=1)
## miss zero or one cleavage positions
cleave("LAAGKVEDSD", enzym="trypsin", missedCleavages=0:1)
cleavageRanges("LAAGKVEDSD", enzym="trypsin", missedCleavages=0:1)

Combine r BiocStyle::Biocpkg("cleaver") and r BiocStyle::Biocpkg("Biostrings") [@Biostrings]:

## create AAStringSet object
p <- AAStringSet(c(gaju="LAAGKVEDSD", pnm="AGEPKLDAGV"))

## cleave it
cleave(p, enzym="trypsin")
cleavageRanges(p, enzym="trypsin")
cleavageSites(p, enzym="trypsin")

Insulin \& Somatostatin Example

Downloading Insulin (P01308) and Somatostatin (P61278) sequences from the UniProt [@uniprot] database using r BiocStyle::Biocpkg("UniProt.ws") [@UniProt.ws].

## load UniProt.ws library
library("UniProt.ws")

## select species Homo sapiens
up <- UniProt.ws(taxId=9606)

## download sequences of Insulin/Somatostatin
s <- select(up,
    keys=c("P01308", "P61278"),
    columns=c("sequence"),
    keytype="UniProtKB"
)

## fetch only sequences
sequences <- setNames(s$Sequence, s$Entry)

## remove whitespaces
sequences <- gsub(pattern="[[:space:]]", replacement="", x=sequences)

Cleaving using Pepsin:

cleave(sequences, enzym="pepsin")

Isotopic Distribution Of Tryptic Digested Insulin

A common use case of in-silico cleavage is the calculation of the isotopic distribution of peptides (which were enzymatic digested in the in-vitro experimental workflow). Here r BiocStyle::Biocpkg("BRAIN") [@BRAIN; @BRAIN2] is used to calculate the isotopic distribution of r BiocStyle::Biocpkg("cleaver")'s output. (please note: it is only a toy example, e.g. the relation of intensity values between peptides isn't correct).

## load BRAIN library
library("BRAIN")

## cleave insulin
cleavedInsulin <- cleave(sequences[1], enzym="trypsin")[[1]]

## create empty plot area
plot(NA, xlim=c(150, 4300), ylim=c(0, 1),
     xlab="mass", ylab="relative intensity",
     main="tryptic digested insulin - isotopic distribution")

## loop through peptides
for (i in seq(along=cleavedInsulin)) {
  ## count C, H, N, O, S atoms in current peptide
  atoms <- BRAIN::getAtomsFromSeq(cleavedInsulin[[i]])
  ## calculate isotopic distribution
  d <- useBRAIN(atoms)
  ## draw peaks
  lines(d$masses, d$isoDistr, type="h", col=2)
}

Session Information

sessionInfo()

References



sgibb/cleaver documentation built on Jan. 26, 2024, 10:08 a.m.