chargeHydropathyPlot: Charge-Hydropathy Plot

View source: R/chargeHydropathyPlot.R

Description

This function calculates the average net charge <R> and the average scaled hydropathy <H> and visualizes the data. There are known boundaries on the C-H plot that separate extended and collapsed proteins.
This was originally described in Uversky et al. (2000)
https://doi.org/10.1002/1097-0134(20001115)41:3<415::AID-PROT130>3.0.CO;2-7 .
The plot returned is based on the charge-hydropathy plot from Uversky (2016) https://doi.org/10.1080/21690707.2015.1135015.
See Uversky (2019) https://doi.org/10.3389/fphy.2019.00010 for additional information and a recent review on the topic. This plot has also been referred to as a "Uversky Plot".
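
As a rough illustration of the two plot axes, the base R sketch below computes a mean scaled hydropathy <H> and evaluates the boundary line commonly cited from Uversky et al. (2000), <R> = 2.785<H> - 1.151. It assumes the Kyte-Doolittle scale rescaled to the range 0 to 1. The helpers meanScaledKD and chBoundary are hypothetical names used only for this example and are not part of idpr; the boundaries actually drawn by chargeHydropathyPlot may differ.

```r
# Kyte-Doolittle hydropathy values (Kyte & Doolittle, 1982)
kd <- c(A = 1.8, R = -4.5, N = -3.5, D = -3.5, C = 2.5,
        Q = -3.5, E = -3.5, G = -0.4, H = -3.2, I = 4.5,
        L = 3.8, K = -3.9, M = 1.9, F = 2.8, P = -1.6,
        S = -0.8, T = -0.7, W = -0.9, Y = -1.3, V = 4.2)

# Hypothetical helper (not an idpr function):
# rescale each residue from [-4.5, 4.5] to [0, 1], then average
meanScaledKD <- function(sequence) {
  residues <- strsplit(sequence, "")[[1]]
  scaled <- (kd[residues] + 4.5) / 9
  mean(scaled)
}

# Hypothetical helper (not an idpr function):
# net charge on the boundary line for a given <H>,
# assuming the form <R> = 2.785<H> - 1.151
chBoundary <- function(H) {
  2.785 * H - 1.151
}

H <- meanScaledKD("ACDEFGHIKLMNPQRSTVWY")
boundaryAtH <- chBoundary(H)
```

In this sketch, a protein whose absolute mean net charge lies above boundaryAtH would fall on the extended side of the line, and one below it on the collapsed side.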

Usage

chargeHydropathyPlot(
  sequence,
  displayInsolubility = TRUE,
  insolubleValue = 0.7,
  proteinName = NA,
  customPlotTitle = NA,
  pH = 7,
  pKaSet = "IPC_protein",
  plotResults = TRUE,
  ...
)

Arguments

sequence

amino acid sequence (or path to a .fasta file) as a character string. Multiple sequences or files are supported as a character vector of strings. A single protein may also be given as a character vector of single characters. Multiple proteins are not supported in that single-character form.

displayInsolubility

logical value, TRUE by default. When TRUE, the vertical line separating collapsed proteins from insoluble proteins is drawn; when FALSE, it is omitted.

insolubleValue

numerical value, 0.7 by default. Ignored when displayInsolubility = FALSE. Plots the vertical line <H> = insolubleValue.

proteinName, customPlotTitle

optional character string. NA by default. Used to either add the name of the protein to the plot title when there is only one protein, or to create a custom plot title for the output.

pH

numeric value, 7.0 by default. The environmental pH is used to calculate residue charge.

pKaSet

pKa set used for charge calculations, "IPC_protein" by default. See netCharge for additional details.

plotResults

logical value, TRUE by default. This determines what is returned. If plotResults = FALSE, a data frame is returned with the Sequence(s), Average Scaled Hydropathy, and Average Net Charge. If plotResults = TRUE, a graphical output is returned (ggplot) showing the Charge Hydropathy Plot (recommended).

...

additional arguments to be passed to idpr::netCharge(), idpr::meanScaledHydropathy() or ggplot
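
The arguments above can be combined as in the following sketch. It assumes the idpr package is installed. The object names (seqExample, pShifted, pNoLine, pNamed) and the specific values 0.65, 5.5, and "Example peptide" are illustrative choices only, not recommended settings.

```r
library(idpr)

seqExample <- "ACDEFGHIKLMNPQRSTVWY"

# Move the insolubility line from the default <H> = 0.7 to <H> = 0.65
pShifted <- chargeHydropathyPlot(
  sequence = seqExample,
  insolubleValue = 0.65)

# Omit the insolubility line; insolubleValue is then ignored
pNoLine <- chargeHydropathyPlot(
  sequence = seqExample,
  displayInsolubility = FALSE)

# Add a protein name to the title and calculate charge at pH 5.5
pNamed <- chargeHydropathyPlot(
  sequence = seqExample,
  proteinName = "Example peptide",
  pH = 5.5)
```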

Value

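A ggplot object showing the Charge-Hydropathy Plot when plotResults = TRUE (the default). When plotResults = FALSE, a data frame with the sequence(s), average scaled hydropathy, and average net charge.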
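
To work with the numbers rather than the plot, the calculated values can be requested as a data frame. A minimal sketch, assuming idpr is installed; chValues is an illustrative name, and str() is used to inspect the result because the exact column names are not listed on this page.

```r
library(idpr)

multipleSeq <- c("ACDEFGHIKLMNPQRSTVWY",
                 "ACDEFGHIK",
                 "LMNPQRSTVW")

# Return the calculated values instead of the ggplot
chValues <- chargeHydropathyPlot(
  sequence = multipleSeq,
  plotResults = FALSE)

# Inspect the structure of the returned data frame
str(chValues)
```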

Plot Colors

For users who wish to keep a common aesthetic, the following colors are used when plotResults = TRUE.

References

Kozlowski, L. P. (2016). IPC – Isoelectric Point Calculator. Biology Direct, 11(1), 55. https://doi.org/10.1186/s13062-016-0159-9
Kyte, J., & Doolittle, R. F. (1982). A simple method for displaying the hydropathic character of a protein. Journal of Molecular Biology, 157(1), 105-132.
Uversky, V. N. (2019). Intrinsically Disordered Proteins and Their “Mysterious” (Meta)Physics. Frontiers in Physics, 7(10). https://doi.org/10.3389/fphy.2019.00010
Uversky, V. N. (2016). Paradoxes and wonders of intrinsic disorder: Complexity of simplicity. Intrinsically Disordered Proteins, 4(1), e1135015. https://doi.org/10.1080/21690707.2015.1135015
Uversky, V. N., Gillespie, J. R., & Fink, A. L. (2000). Why are “natively unfolded” proteins unstructured under physiologic conditions? Proteins: Structure, Function, and Bioinformatics, 41(3), 415-427. https://doi.org/10.1002/1097-0134(20001115)41:3<415::AID-PROT130>3.0.CO;2-7

See Also

netCharge and meanScaledHydropathy for functions used to calculate values.

Examples

#Amino acid sequences can be character strings
aaString <- "ACDEFGHIKLMNPQRSTVWY"
#Amino acid sequences can also be character vectors
aaVector <- c("A", "C", "D", "E", "F",
              "G", "H", "I", "K", "L",
              "M", "N", "P", "Q", "R",
              "S", "T", "V", "W", "Y")
#Alternatively, .fasta files can be used by providing
#the path to the file as a character string
chargeHydropathyPlot(sequence = aaString)
chargeHydropathyPlot(sequence = aaVector)

#This function also supports multiple sequences
#only as character strings or .fasta files
multipleSeq <- c("ACDEFGHIKLMNPQRSTVWY",
                 "ACDEFGHIK",
                 "LMNPQRSTVW")
chargeHydropathyPlot(sequence = multipleSeq)

#since it is a ggplot, we can add additional annotations or themes
chargeHydropathyPlot(
  sequence = multipleSeq)  +
  ggplot2::theme_void()

chargeHydropathyPlot(
  sequence = multipleSeq)  +
  ggplot2::geom_hline(yintercept = 0,
                     color = "red")

#choosing the pKa set used for calculations
chargeHydropathyPlot(
  sequence = multipleSeq,
  pKaSet = "EMBOSS")

idpr documentation built on Dec. 26, 2020, 6 p.m.