read.fasta: Read FASTA Formatted Sequences


Description

Read aligned or un-aligned sequences from a FASTA format file.

Usage

read.fasta(file, rm.dup = TRUE, to.upper = FALSE, to.dash = TRUE)

Arguments

file

input sequence file.

rm.dup

logical, if TRUE duplicate sequences (with the same names/ids) will be removed.

to.upper

logical, if TRUE residues are forced to uppercase.

to.dash

logical, if TRUE ‘.’ gap characters are converted to ‘-’ gap characters.
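
The effect of the three logical options can be pictured with a few lines of base R. This is only a sketch of the documented behaviour applied to an in-memory alignment, not the package's implementation. The objects ids and ali are made-up illustration data, and keeping the first of any duplicated entries is an assumption.

```r
# Hypothetical in-memory alignment: one row per sequence, one column per position.
ids <- c("seq1", "seq2", "seq1")
ali <- rbind(c("a", "c", ".", "g"),
             c("a", "-", "t", "g"),
             c("a", "c", ".", "g"))

# rm.dup = TRUE: drop entries whose name/id has already been seen
# (assumption: the first occurrence is the one kept).
keep <- !duplicated(ids)
ids <- ids[keep]
ali <- ali[keep, , drop = FALSE]

# to.upper = TRUE: force residues to uppercase.
# Assigning into ali_upper[] keeps the matrix shape.
ali_upper <- ali
ali_upper[] <- toupper(ali)

# to.dash = TRUE: convert '.' gap characters to '-'.
ali_dash <- ali_upper
ali_dash[ali_dash == "."] <- "-"
```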

Value

A list with two components:

ali

an alignment character matrix with one row per sequence and one column per equivalent amino acid/nucleotide position.

ids

sequence names used as identifiers.
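
A minimal usage sketch, assuming BioPhysConnectoR is installed and exports read.fasta as shown under Usage. The file contents and the names fasta_path and aln are invented for illustration. No output is shown, since the exact printed form is not given on this page.

```r
# Write a small, made-up aligned FASTA file to a temporary location.
fasta_path <- tempfile(fileext = ".fasta")
writeLines(c(">seq1", "ACD.EF", ">seq2", "AC-GEF"), fasta_path)

# Only call read.fasta() if the package is available
# (assumption: the function is exported under this name).
if (requireNamespace("BioPhysConnectoR", quietly = TRUE)) {
  aln <- BioPhysConnectoR::read.fasta(fasta_path)
  print(dim(aln$ali))  # rows = sequences, columns = alignment positions
  print(aln$ids)       # sequence names
}
```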

Note

For a description of FASTA format see: http://www.ebi.ac.uk/help/formats_frame.html. When reading alignment files, the dash ‘-’ is interpreted as the gap character.

Author(s)

Barry Grant

References

Grant, B.J. et al. (2006) Bioinformatics 22, 2695–2696.


BioPhysConnectoR documentation built on May 30, 2017, 6:46 a.m.