read.fasta: Read FASTA file


Description

Reads a FASTA file and converts it to a data frame.

Usage

read.fasta(file = NULL, clean_name = FALSE)

Arguments

file

character string representing the name of the fasta file.

clean_name

logical, indicating whether the sequence names should be cleaned. If TRUE, punctuation characters and white space are replaced by "_". See regex for more details.
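To illustrate what cleaning the names means, the sketch below applies POSIX character classes with base R's gsub(). The exact pattern used inside read.fasta is not shown on this page, so both the pattern and the helper name clean_names_sketch are assumptions made for illustration.

```r
# Hypothetical helper, not part of phylotools: replaces every punctuation
# or white-space character in a name with "_".
clean_names_sketch <- function(x) {
  gsub("[[:punct:][:space:]]", "_", x)
}

cleaned <- clean_names_sketch(c("seq 2|gene-A", "Homo sapiens"))
print(cleaned)
# [1] "seq_2_gene_A" "Homo_sapiens"
```

Cleaned names of this kind are often easier to use as file names or as labels in downstream tools that do not accept spaces.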

Details

Sequence names are identified by a leading ">". All lines between one ">" line and the next are concatenated into a single sequence.
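The parsing rule described above can be sketched in base R as follows. This is an illustrative re-implementation under stated assumptions, not the phylotools source: the function name parse_fasta_sketch is hypothetical, it takes a character vector of lines rather than a file name, and it skips empty lines.

```r
# Hypothetical re-implementation for illustration; not the phylotools source.
parse_fasta_sketch <- function(lines) {
  lines <- lines[nzchar(lines)]            # drop empty lines
  is_header <- startsWith(lines, ">")      # header lines start with ">"
  record_id <- cumsum(is_header)           # record number for each line
  seq_name <- sub("^>", "", lines[is_header])
  # Concatenate all non-header lines that belong to the same record
  seq_text <- vapply(
    seq_along(seq_name),
    function(i) paste(lines[!is_header & record_id == i], collapse = ""),
    character(1)
  )
  data.frame(seq.name = seq_name, seq.text = seq_text,
             stringsAsFactors = FALSE)
}

demo <- parse_fasta_sketch(c(">seq_a", "ACGT", "TTGA", ">seq_b", "GGCC"))
print(demo$seq.text)
# [1] "ACGTTTGA" "GGCC"
```

Here the two lines following ">seq_a" are joined into one sequence, which is the behaviour the Details section describes for multi-line FASTA records.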

Value

A data frame with two columns: (1) seq.name, the names of the sequences; (2) seq.text, the raw sequence data.
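As a rough illustration of working with a result of this shape, the sketch below uses a hand-built stand-in data frame with the same two column names. The values are made up, and whether read.fasta returns character or factor columns is not stated here, so the sketch converts with as.character() to be safe.

```r
# Hand-built stand-in for a read.fasta result (values are made up).
res_sketch <- data.frame(
  seq.name = c("seq_a", "seq_b"),
  seq.text = c("ACGT--TTGA", "GGCC"),
  stringsAsFactors = FALSE
)

# Sequence lengths, counting gap characters ("-") as positions
seq_lengths <- nchar(as.character(res_sketch$seq.text))

# Look up one sequence by its name
seq_b <- as.character(res_sketch$seq.text[res_sketch$seq.name == "seq_b"])
```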

Note

When clean_name = TRUE, punctuation characters and white space in the names of the sequences are replaced by "_".

Author(s)

Jinlong Zhang <jinlongzhang01@gmail.com>

References

http://www.genomatix.de/online_help/help/sequence_formats.html

See Also

read.phylip, dat2fasta, dat2phylip, split_dat

Examples

library(phylotools)

# Write a small example FASTA file to the working directory
cat(
">seq_2", "GTCTTATAAGAAAGAATAAGAAAG--AAATACAAA-------AAAAAAGA",
">seq_3", "GTCTTATAAGAAAGAAATAGAAAAGTAAAAAAAAA-------AAAAAAAG",
">seq_5", "GACATAAGACATAAAATAGAATACTCAATCAGAAACCAACCCATAAAAAC",
">seq_8", "ATTCCAAAATAAAATACAAAAAGAAAAAACTAGAAAGTTTTTTTTCTTTG",
">seq_9", "ATTCTTTGTTCTTTTTTTTCTTTAATCTTTAAATAAACCTTTTTTTTTTA",
file = "trn1.fasta", sep = "\n")

# Read the file into a data frame, then delete the example file
res <- read.fasta("trn1.fasta")
unlink("trn1.fasta")

phylotools documentation built on May 2, 2019, 3:25 a.m.