calculate_features: Calculate a set of numerical features from protein sequences


View source: R/calculate_features.R

Description

This function calculates a set of physicochemical and compositional features from protein sequences in preparation for supervised model learning.

Usage

calculate_features(df, min_len = 10)

Arguments

df

A data frame containing protein sequence names in the first column and amino acid sequences in the second column.

min_len

Minimum sequence length for which features can be calculated. Providing a sequence shorter than this is an error.
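
As a rough illustration of the input layout described above, the following base R sketch builds a two-column data frame by hand and drops sequences shorter than the threshold before they would be passed to calculate_features. The sequence names and sequences are invented for illustration, and filter_min_len is a hypothetical helper that is not part of ampir.

```r
# Hypothetical input: names in column 1, amino acid sequences in column 2
# (names and sequences are invented for illustration)
my_df <- data.frame(
  seq_name = c("pep_short", "pep_long"),
  seq_aa   = c("MKV", "MKVLAAGIVGLLLAQPSFA"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of ampir): keep only rows whose sequence
# in the second column has at least min_len residues
filter_min_len <- function(df, min_len = 10) {
  df[nchar(df[[2]]) >= min_len, , drop = FALSE]
}

# "pep_short" has 3 residues and is dropped; "pep_long" is kept
ok_df <- filter_min_len(my_df, min_len = 10)
```

Under these assumptions, ok_df keeps the same two-column layout as the input, so it could then be passed as the df argument with a matching min_len.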

Value

A data frame containing the numerical feature values calculated for each input protein.

Note

This function depends on the Peptides package.

References

Osorio, D., Rondon-Villarreal, P. & Torres, R. Peptides: A package for data mining of antimicrobial peptides. The R Journal. 7(1), 4–14 (2015).

Examples

library(ampir)

# Read the example FASTA file shipped with ampir into a data frame
my_protein_df <- read_faa(system.file("extdata/bat_protein.fasta", package = "ampir"))

calculate_features(my_protein_df)
## Output (showing the first six output columns)
#     seq_name      Amphiphilicity  Hydrophobicity  pI        Mw        Charge   ....
# [1] G1P6H5_MYOLU  0.4145847       0.4373494       8.501312  9013.757  4.53015  ....

ampir documentation built on June 29, 2021, 9:09 a.m.