read.fasta: Read FASTA Formatted Sequences

Read aligned or un-aligned sequences from a FASTA format file.

read.fasta(file, rm.dup = TRUE, to.upper = FALSE, to.dash = TRUE)

`file`	input sequence file.
`rm.dup`	logical, if TRUE duplicate sequences (with the same names/ids) will be removed.
`to.upper`	logical, if TRUE residues are forced to uppercase.
`to.dash`	logical, if TRUE ‘.’ gap characters are converted to ‘-’ gap characters.
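As an illustration of the arguments above, the following sketch writes a tiny alignment to a temporary file and reads it back. The file contents and sequence names are invented for the example, and the call is guarded so that it only runs when BioPhysConnectoR is installed.

```r
# Write a small two-sequence alignment to a temporary FASTA file.
# The sequence names and residues are made up for illustration.
fasta_path <- tempfile(fileext = ".fasta")
writeLines(c(">seq1", "acd.efg", ">seq2", "ac-kefg"), fasta_path)

# Only call the package function if BioPhysConnectoR is available.
if (requireNamespace("BioPhysConnectoR", quietly = TRUE)) {
  aln <- BioPhysConnectoR::read.fasta(fasta_path,
                                      rm.dup = TRUE,    # drop repeated ids
                                      to.upper = TRUE,  # force uppercase residues
                                      to.dash = TRUE)   # convert '.' gaps to '-'
  print(aln$ids)
  print(aln$ali)
}
```

With `to.upper = TRUE` and `to.dash = TRUE`, the lowercase residues and the ‘.’ gap in the first sequence should come back as uppercase letters and ‘-’, according to the argument descriptions above.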

A list with two components:

`ali`	an alignment character matrix with a row per sequence and a column per equivalent amino acid/nucleotide.
`ids`	sequence names as identifiers.
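To make the shape of the returned list concrete, here is a minimal base R sketch that builds a list with the same two components from FASTA lines. The helper `parse_fasta_sketch` is hypothetical and is not the package's implementation. It assumes every header is followed by at least one sequence line and that all sequences have the same length.

```r
# Hypothetical helper: NOT the package's code, only a sketch of the
# documented return structure (a list with `ali` and `ids`).
parse_fasta_sketch <- function(lines) {
  is_header <- grepl("^>", lines)
  ids <- sub("^>", "", lines[is_header])
  # Assign each sequence line to the most recent header above it.
  group <- cumsum(is_header)[!is_header]
  seqs <- tapply(lines[!is_header], group, paste, collapse = "")
  # Split each sequence into single characters: one row per sequence,
  # one column per alignment position.
  ali <- do.call(rbind, strsplit(as.character(seqs), ""))
  list(ali = ali, ids = ids)
}

aln <- parse_fasta_sketch(c(">seq1", "ACD-EFG", ">seq2", "AC-KEFG"))
dim(aln$ali)  # rows = sequences, columns = alignment positions
aln$ids
```

Here `aln$ali[i, j]` is the residue (or gap) of sequence `i` at alignment column `j`, and `aln$ids[i]` is the name of that sequence.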

For a description of FASTA format see: http://www.ebi.ac.uk/help/formats_frame.html. When reading alignment files, the dash ‘-’ is interpreted as the gap character.
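Because gaps are stored as ‘-’ entries in the alignment matrix, they can be located with ordinary matrix comparisons. The sketch below uses a small hand-made matrix rather than a file, and it mimics the documented `to.dash` conversion in base R instead of calling the package.

```r
# A hand-made 2 x 4 alignment matrix (illustrative data only).
ali <- rbind(c("A", "C", ".", "E"),
             c("A", "-", "-", "E"))

# Mimic the documented to.dash behaviour: treat '.' as a gap and store '-'.
ali[ali == "."] <- "-"

# Fraction of sequences that have a gap at each alignment column.
gap_fraction <- colMeans(ali == "-")
gap_fraction
```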

Barry Grant

Grant, B.J. et al. (2006) Bioinformatics 22, 2695–2696.

BioPhysConnectoR documentation built on May 30, 2017, 6:46 a.m.