featurePhysChem: Extraction of the Physicochemical Features of RNA and Protein...

View source: R/PhysicochemicalProperty.R

featurePhysChem    R Documentation

Extraction of the Physicochemical Features of RNA and Protein Sequences

Description

A wrapper for the computePhysChem function. It extracts physicochemical features of RNA and protein sequences at the same time and formats the results as a dataset that can be used to build a classifier.

Usage

featurePhysChem(
  seqRNA,
  seqPro,
  label = NULL,
  parallel.cores = 2,
  cl = NULL,
  ...
)

Arguments

seqRNA

RNA sequences loaded by the function read.fasta from the seqinr package, or a list of RNA sequences. RNA sequences will be converted to lower-case letters. Each sequence should be a vector of single characters.

seqPro

protein sequences loaded by the function read.fasta from the seqinr package, or a list of protein sequences. Protein sequences will be converted to upper-case letters. Each sequence should be a vector of single characters.

label

optional. A string, a vector of strings, or NULL. Indicates the class of the samples, such as "Interact" or "Non.Interact". Default: NULL.

parallel.cores

an integer giving the number of cores to use for parallel computation. Default: 2. Set parallel.cores = -1 to run with all available cores. parallel.cores must be either -1 or >= 1.

cl

parallel cores to be passed to this function. Default: NULL.

...

arguments (Fourier.len, physchemRNA and physchemPro) to be passed to computePhysChem. See computePhysChem and examples below.
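
As a minimal sketch of the input format described for seqRNA and seqPro, the following builds two small named lists by hand instead of calling read.fasta. The sequences and their names (rna1, pro1, and so on) are made up for illustration, and the case conversion is shown only to mirror what the function is documented to do with its input.

```r
# Hypothetical toy sequences; real input would usually come from seqinr::read.fasta.
# Whether the RNA alphabet should use "u" or "t" is not stated on this page.
rnaStrings <- c(rna1 = "AUGGCUAACG", rna2 = "GGCAUUCGAU")
proStrings <- c(pro1 = "mkvlaagiv", pro2 = "MSTNPKPQR")

# Split each string into a vector of single characters (names are preserved).
seqsRNA <- lapply(rnaStrings, function(s) strsplit(s, "")[[1]])
seqsPro <- lapply(proStrings, function(s) strsplit(s, "")[[1]])

# Mirror the documented case conventions: RNA lower case, protein upper case.
seqsRNA <- lapply(seqsRNA, tolower)
seqsPro <- lapply(seqsPro, toupper)

str(seqsRNA)
```

Lists shaped like this match the "list of sequences, each a vector of single characters" form that the two arguments describe.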

Details

See computePhysChem.

Value

This function returns a data frame.
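
Because the return value is a data frame and label marks the class of the samples, results for different classes can presumably be stacked with rbind before a classifier is trained. The sketch below uses two small hand-made data frames as stand-ins for the function's output; the column names (label, feat1, feat2) and all values are hypothetical and not taken from the package.

```r
# Hypothetical stand-ins for two featurePhysChem() results, one per class.
posData <- data.frame(label = "Interact",     feat1 = c(0.12, 0.40), feat2 = c(1.5, 0.9))
negData <- data.frame(label = "Non.Interact", feat1 = c(0.33, 0.08), feat2 = c(0.7, 1.1))

# Stack the two classes and store the label as a factor for classification.
trainSet <- rbind(posData, negData)
trainSet$label <- factor(trainSet$label, levels = c("Interact", "Non.Interact"))

table(trainSet$label)
```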

References

[1] Han S, Yang X, Sun H, et al. LION: an integrated R package for effective prediction of ncRNA–protein interaction. Briefings in Bioinformatics. 2022; 23(6):bbac420

[2] Morozova N, Allers J, Myers J, et al. Protein-RNA interactions: exploring binding patterns with a three-dimensional superposition analysis of high resolution structures. Bioinformatics 2006; 22:2746-52

[3] Grantham R. Amino acid difference formula to help explain protein evolution. Science 1974; 185:862-4

[4] Zimmerman JM, Eliezer N, Simha R. The characterization of amino acid sequences in proteins by statistical methods. J. Theor. Biol. 1968; 21:170-201

[5] Bull HB, Breese K. Surface tension of amino acid solutions: a hydrophobicity scale of the amino acid residues. Arch. Biochem. Biophys. 1974; 161:665-670

[6] Kyte J, Doolittle RF. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 1982; 157:105-132

[7] Eisenberg D, Schwarz E, Komaromy M, et al. Analysis of membrane and surface protein sequences with the hydrophobic moment plot. J. Mol. Biol. 1984; 179:125-42

[8] Hopp TP, Woods KR. Prediction of protein antigenic determinants from amino acid sequences. Proc. Natl. Acad. Sci. U. S. A. 1981; 78:3824-8

[9] Kawashima S, Kanehisa M. AAindex: amino acid index database. Nucleic Acids Res. 2000; 28:374

[10] Bellucci M, Agostini F, Masin M, et al. Predicting protein associations with long noncoding RNAs. Nat. Methods 2011; 8:444-445

[11] Lu Q, Ren S, Lu M, et al. Computational prediction of associations between long non-coding RNAs and proteins. BMC Genomics 2013; 14:651

See Also

computePhysChem

Examples

data(demoPositiveSeq)
seqsRNA <- demoPositiveSeq$RNA.positive
seqsPro <- demoPositiveSeq$Pro.positive

# Pass "Fourier.len", "physchemRNA" and "physchemPro" using "..." argument:

dataset1 <- featurePhysChem(seqRNA = seqsRNA, seqPro = seqsPro,
                            label = "Interact", Fourier.len = 10,
                            physchemRNA = c("hydrogenBonding", "vanderWaal"),
                            physchemPro = c("polarity.Grantham", "polarity.Zimmerman",
                                            "hphob.BullBreese", "hphob.KyteDoolittle",
                                            "hphob.Eisenberg", "hphob.HoppWoods"))

# Using the default setting:

dataset2 <- featurePhysChem(seqRNA = seqsRNA, seqPro = seqsPro)
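
The documented constraint on parallel.cores (-1 for all cores, otherwise at least 1) can be sketched with the base parallel package. resolveCores is a hypothetical helper written for this illustration, not a function of the package, and whether cl accepts a cluster created by parallel::makeCluster is an assumption that this page does not confirm.

```r
library(parallel)

# Hypothetical helper: turn a parallel.cores value into a usable core count.
resolveCores <- function(parallel.cores = 2) {
  stopifnot(length(parallel.cores) == 1,
            parallel.cores == -1 || parallel.cores >= 1)
  if (parallel.cores == -1) {
    n <- parallel::detectCores()
    if (is.na(n)) n <- 1L   # detectCores() can return NA on some platforms
    return(as.integer(n))
  }
  as.integer(parallel.cores)
}

resolveCores(2)
resolveCores(-1)

# Assumed usage (not run): create the cluster once and reuse it across calls.
# cl <- parallel::makeCluster(resolveCores(2))
# dataset3 <- featurePhysChem(seqRNA = seqsRNA, seqPro = seqsPro, cl = cl)
# parallel::stopCluster(cl)
```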


HAN-Siyu/ncProR documentation built on Nov. 3, 2023, 12:08 a.m.