chargeCalculationGlobal: Protein Charge Calculation, Globally

View source: R/chargeCalculations.R

Description

This function calculates the charge of each residue in a peptide using the Henderson-Hasselbalch equation. The output is a data frame (default) or a plot of the charges along the peptide sequence. Charges are determined globally, that is, for every residue along the entire chain.
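
As a rough illustration of the calculation described above, the Henderson-Hasselbalch equation gives the fractional charge of an ionizable group from its pKa and the pH. The sketch below is written in base R and is not the package's implementation; the helper name hhCharge and the pKa values (3.872 for the Asp side chain, 9.094 for the N-terminal amine) are assumptions chosen for illustration.

```r
# Hypothetical helper (not part of idpr): fractional charge of one
# ionizable group from the Henderson-Hasselbalch equation.
hhCharge <- function(pKa, pH, type = c("acid", "base")) {
  type <- match.arg(type)
  if (type == "acid") {
    # Acidic groups are neutral when protonated, -1 when deprotonated
    -1 / (1 + 10^(pKa - pH))
  } else {
    # Basic groups are +1 when protonated, neutral when deprotonated
    1 / (1 + 10^(pH - pKa))
  }
}

# Assumed pKa values, for illustration only
aspCharge   <- hhCharge(pKa = 3.872, pH = 7, type = "acid")
nTermCharge <- hhCharge(pKa = 9.094, pH = 7, type = "base")
round(c(Asp = aspCharge, Nterm = nTermCharge), 4)
```

With these assumed values the results round to about -0.9993 and 0.9920, in line with the charges printed for D and for the N-terminal residue in the example output at pH 7.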

Usage

chargeCalculationGlobal(
  sequence,
  pKaSet = "IPC_protein",
  pH = 7,
  plotResults = FALSE,
  includeTermini = TRUE,
  sumTermini = TRUE,
  proteinName = NA,
  printCitation = FALSE,
  ...
)

Arguments

sequence

Amino acid sequence as a character string or a vector of individual residues. Alternatively, a character string giving the path to a .fasta / .fa file.

pKaSet

A character string or data frame, "IPC_protein" by default. A character string loads one of the preloaded pKa sets: c("EMBOSS", "DTASelect", "Solomons", "Sillero", "Rodwell", "Lehninger", "Toseland", "Thurlkill", "Nozaki", "Dawson", "Bjellqvist", "ProMoST", "Vollhardt", "IPC_protein", "IPC_peptide"). Alternatively, the user may supply a custom pKa data set. It must be a data frame in which column 1 is a character vector of residues named "AA" and column 2 is a numeric vector of pKa values.
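
A custom pKa set in the format described above might be built as follows. This is only a sketch: the pKa values are placeholders, the second column name "pKa" is an assumption (the documentation only requires that the column be numeric), and checkPKaSet is a hypothetical helper that mirrors the stated format rules rather than the package's own validation.

```r
# Placeholder values for illustration; substitute your own data set
customPKa <- data.frame(
  AA  = c("C", "D", "E", "H", "K", "R", "Y"),
  pKa = c(8.3, 3.9, 4.3, 6.0, 10.5, 12.5, 10.1),
  stringsAsFactors = FALSE  # keeps AA as character on R < 4.0
)

# Hypothetical check mirroring the documented format:
# column 1 is a character vector named "AA", column 2 is numeric
checkPKaSet <- function(df) {
  is.data.frame(df) &&
    ncol(df) >= 2 &&
    names(df)[1] == "AA" &&
    is.character(df[[1]]) &&
    is.numeric(df[[2]])
}

checkPKaSet(customPKa)
```

A data frame like this could then be passed as pKaSet = customPKa. How termini pKa values should be labeled in a custom set is not stated here, so it is worth inspecting pKaData before relying on includeTermini = TRUE with custom values.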

pH

numeric value, 7.0 by default. The environmental pH used to calculate residue charge.

plotResults

logical value, FALSE by default. This determines what is returned. If plotResults = FALSE, a data frame is returned with the position, residue, and charge (-1 to +1). If plotResults = TRUE, a graphical output is returned (ggplot) showing the charge distribution.

includeTermini, sumTermini

Logical values, both TRUE by default. These determine how the calculation handles the N- and C-termini. includeTermini determines whether the calculation uses the charges of the amine and carboxyl groups at the ends of the peptide (when TRUE). These charges are ignored when includeTermini = FALSE. sumTermini determines whether the termini charges are added to the charges of the first residue (likely Met, therefore uncharged) and the final residue (varies), or whether the N- and C-termini are returned as separate residues. When sumTermini = TRUE, the charges are summed. When sumTermini = FALSE, the N- and C-termini are each added as a separate row in the data frame; this affects averages because it increases the sequence length by 2. sumTermini is ignored if includeTermini = FALSE.
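
The effect of sumTermini on averages can be seen with a small base R sketch. The charges below are made-up numbers for a three-residue peptide, not output from the package, and the variable names are likewise hypothetical.

```r
# Made-up side-chain charges for a 3-residue peptide
residueCharge <- c(0, -1, 0)
nTerm <- 0.99   # assumed N-terminal amine charge
cTerm <- -0.99  # assumed C-terminal carboxyl charge

# sumTermini = TRUE style: termini added to the first and last residue
summed <- residueCharge
summed[1] <- summed[1] + nTerm
summed[length(summed)] <- summed[length(summed)] + cTerm

# sumTermini = FALSE style: termini kept as two extra entries
separate <- c(nTerm, residueCharge, cTerm)

# Net charge is identical, but the mean is taken over n versus n + 2
c(netSummed = sum(summed), netSeparate = sum(separate))
c(meanSummed = mean(summed), meanSeparate = mean(separate))
```

In this toy case the net charge is the same either way, while the mean charge differs because the separate form divides by five entries instead of three.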

proteinName

Character string of length 1. Optional; when supplied, the name is included in the plot title.

printCitation

Logical value, FALSE by default. When printCitation = TRUE, the citation for the pKa set is printed so that the user can easily obtain the data set's reference. Nothing is printed when a custom data set is supplied.

...

Any additional parameters, especially those for plotting.

Value

If plotResults = FALSE, a data frame is returned with the position, residue, and charge (-1 to +1). If plotResults = TRUE, a graphical output is returned (ggplot) showing the charge distribution.

Plot Colors

For users who wish to keep a common aesthetic, the following colors are used when plotResults = TRUE.

See Also

pKaData for residue pKa values and hendersonHasselbalch for charge calculations.

Other charge functions: chargeCalculationLocal(), hendersonHasselbalch(), netCharge()

Examples

#Amino acid sequences can be character strings
aaString <- "ACDEFGHIKLMNPQRSTVWY"
#Amino acid sequences can also be character vectors
aaVector <- c("A", "C", "D", "E", "F",
              "G", "H", "I", "K", "L",
              "M", "N", "P", "Q", "R",
              "S", "T", "V", "W", "Y")
#Alternatively, .fasta files can also be used by providing
#a character string of the path to the file.
exampleDF <- chargeCalculationGlobal(aaString)
head(exampleDF)
exampleDF <- chargeCalculationGlobal(aaVector)
head(exampleDF)


#Changing pKa set or pH used for calculations
exampleDF_pH5 <- chargeCalculationGlobal(aaString,
                                         pH = 5)
head(exampleDF_pH5)
exampleDF_pH7 <- chargeCalculationGlobal(aaString,
                                         pH = 7)
head(exampleDF_pH7)
#pKaSet is spelled out in full rather than relying on partial matching
exampleDF_EMBOSS <- chargeCalculationGlobal(aaString,
                                            pH = 7,
                                            pKaSet = "EMBOSS")
head(exampleDF_EMBOSS)

#Termini charges can be excluded with includeTermini = FALSE
exampleDF_NoTermini <- chargeCalculationGlobal(aaString,
                                               includeTermini = FALSE)
head(exampleDF_NoTermini)

#sumTermini controls how the termini charges are handled
exampleDF_SumTermini <- chargeCalculationGlobal(aaString,
                                                sumTermini = TRUE)
head(exampleDF_SumTermini)
exampleDF_SepTermini <- chargeCalculationGlobal(aaString,
                                                sumTermini = FALSE)
head(exampleDF_SepTermini)

#plotResults = TRUE will output a ggplot as a line plot
chargeCalculationGlobal(aaString,
                        plotResults = TRUE)

#since it is a ggplot, you can change or annotate the plot
#(window is not an argument of this function, so it is omitted here)
gg <- chargeCalculationGlobal(aaVector,
                              plotResults = TRUE)
gg <- gg + ggplot2::ylab("Residue Charge")
gg <- gg + ggplot2::geom_text(data = exampleDF,
                              ggplot2::aes(label = AA,
                                           y = Charge + 0.1))
plot(gg)
#alternatively, you can pass the data frame to sequenceMap()
sequenceMap(sequence = exampleDF$AA,
            property = exampleDF$Charge)

Example output

  Position AA     Charge
1        1  A  0.9920106
2        2  C -0.2179020
3        3  D -0.9992558
4        4  E -0.9974244
5        5  F  0.0000000
6        6  G  0.0000000
  Position AA     Charge
1        1  A  0.9920106
2        2  C -0.2179020
3        3  D -0.9992558
4        4  E -0.9974244
5        5  F  0.0000000
6        6  G  0.0000000
  Position AA      Charge
1        1  A  0.99991947
2        2  C -0.00277838
3        3  D -0.93068864
4        4  E -0.79476977
5        5  F  0.00000000
6        6  G  0.00000000
  Position AA     Charge
1        1  A  0.9920106
2        2  C -0.2179020
3        3  D -0.9992558
4        4  E -0.9974244
5        5  F  0.0000000
6        6  G  0.0000000
  Position AA      Charge
1        1  A  0.97549663
2        2  C -0.03065343
3        3  D -0.99920630
4        4  E -0.99874266
5        5  F  0.00000000
6        6  G  0.00000000
  Position AA     Charge
1        1  A  0.0000000
2        2  C -0.2179020
3        3  D -0.9992558
4        4  E -0.9974244
5        5  F  0.0000000
6        6  G  0.0000000
  Position AA     Charge
1        1  A  0.9920106
2        2  C -0.2179020
3        3  D -0.9992558
4        4  E -0.9974244
5        5  F  0.0000000
6        6  G  0.0000000
  Position  AA     Charge
1        0 NH3  0.9920106
2        1   A  0.0000000
3        2   C -0.2179020
4        3   D -0.9992558
5        4   E -0.9974244
6        5   F  0.0000000

idpr documentation built on Dec. 26, 2020, 6 p.m.