readFASTA: Read Protein Sequences in FASTA Format



Read Protein Sequences in FASTA Format


readFASTA(file = system.file("protseq/P00750.fasta", package = "protr"),
  legacy.mode = TRUE, seqonly = FALSE)



file: The name of the file from which the sequences in FASTA format are read. If it does not contain an absolute or relative path, the file name is relative to the current working directory, getwd(). The default is the P00750.fasta file in the protseq directory of the protr package.


legacy.mode: If set to TRUE, lines starting with a semicolon ';' are ignored. Default value is TRUE.


seqonly: If set to TRUE, only the sequences are returned, without any attempt to modify them or to get their names and annotations (execution time is reduced by approximately a factor of 3). Default value is FALSE.
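To make the effect of legacy.mode and seqonly concrete, the following is a minimal base-R sketch of a FASTA reader that honors both options. The function name read_fasta_sketch and the toy records are hypothetical and exist only for illustration; this is not the protr implementation, and the real function may differ in details such as the return type.

```r
# Hypothetical helper for illustration only; not the protr implementation.
read_fasta_sketch <- function(path, legacy.mode = TRUE, seqonly = FALSE) {
  lines <- readLines(path)
  if (legacy.mode) {
    # Drop comment lines that start with a semicolon
    lines <- lines[!grepl("^;", lines)]
  }
  header_idx <- which(startsWith(lines, ">"))
  if (length(header_idx) == 0) stop("no header line starting with '>' found")
  # Each record spans from the line after its header
  # to the line before the next header (or the end of the file)
  first <- header_idx + 1
  last <- c(header_idx[-1] - 1, length(lines))
  seqs <- vapply(seq_along(header_idx), function(i) {
    if (first[i] > last[i]) return("")  # header with no sequence lines
    paste(lines[first[i]:last[i]], collapse = "")
  }, character(1))
  if (seqonly) return(seqs)
  # Name each sequence by the first word of its header, without the '>'
  names(seqs) <- sub("^>(\\S*).*$", "\\1", lines[header_idx])
  seqs
}

# Toy input written to a temporary file (made-up records)
tmp <- tempfile(fileext = ".fasta")
writeLines(c("; a legacy comment line",
             ">seq1 first toy record",
             "MKT", "AYI",
             ">seq2 second toy record",
             "GAVL"), tmp)

named <- read_fasta_sketch(tmp)                  # names taken from headers
bare  <- read_fasta_sketch(tmp, seqonly = TRUE)  # sequences only, no names
print(named)
print(bare)
```

Skipping the name extraction when seqonly is TRUE is what saves time on large files in this sketch, since the header lines are never parsed.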


This function reads protein sequences in FASTA format.


The resulting character vector of protein sequences.

The three returned arguments are just different forms of the same output. If one is interested in a Mahalanobis metric over the original data space, the first argument is all she/he needs. If a transformation into another space (where one can use the Euclidean metric) is preferred, the second returned argument is sufficient. Using A and B is equivalent in the following sense.


Note that any different sets of instances (chunklets), e.g. 1, 3, 7 and 4, 6, might belong to the same class and might belong to different classes.


Nan Xiao


Pearson, W.R. and Lipman, D.J. (1988) Improved tools for biological sequence comparison. Proceedings of the National Academy of Sciences of the United States of America, 85: 2444-2448.

See Also

See getUniProt for retrieving protein sequences from


P00750 = readFASTA(system.file('protseq/P00750.fasta', package = 'protr'))
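As a further sketch, the call below contrasts the default behaviour with seqonly = TRUE on the bundled example file. It assumes the protr package is installed (the guard skips the code otherwise) and uses only the arguments listed in the usage above; the variable names are illustrative.

```r
# Runs only if protr is available; variable names are illustrative.
if (requireNamespace("protr", quietly = TRUE)) {
  fasta_path <- system.file("protseq/P00750.fasta", package = "protr")

  # Default: sequences together with the names parsed from the header lines
  named_seqs <- protr::readFASTA(fasta_path)

  # seqonly = TRUE: skip name and annotation handling
  raw_seqs <- protr::readFASTA(fasta_path, seqonly = TRUE)

  print(names(named_seqs))        # identifiers taken from the headers
  print(nchar(named_seqs[[1]]))   # length of the first sequence
  print(length(raw_seqs))         # number of sequences read
}
```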
