readFasta | R Documentation |
Reads and writes biological sequences (DNA, RNA, protein) in the FASTA format.
readFasta(in.file)
writeFasta(fdta, out.file, width = 0)
in.file | URL/directory/name of (gzipped) FASTA file to read. |
fdta | A tibble with sequence data in the columns ‘Header’ and ‘Sequence’, see the details below. |
out.file | Name of (gzipped) FASTA file to create. |
width | Number of characters per line, or 0 for no linebreaks. |
These functions handle input/output of sequences in the commonly used FASTA format.
For every sequence it is presumed there is one header line starting with a ‘>’. If
the filenames (in.file or out.file) have the extension .gz, the files are automatically
uncompressed when read and compressed when written.
The sequences are stored in a tibble, opening up all the possibilities in R for
fast and easy manipulations. The content of the file is stored as two columns, ‘Header’
and ‘Sequence’. If other columns are added, these will be ignored by writeFasta.
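As a minimal sketch of the point about extra columns, the example below adds a helper column, filters on it, and writes the result. The toy sequences, the column name Length, and the temporary file are assumptions made for illustration; only the ‘Header’ and ‘Sequence’ column names come from the text above.

```r
library(microseq)
library(dplyr)
library(stringr)

# Hypothetical toy data; only the column names Header and Sequence
# are taken from the documentation.
fdta <- tibble::tibble(
  Header   = c("seq1", "seq2", "seq3"),
  Sequence = c("ATGGCTTGA", "ATGCCCGGGAAATAA", "ATGTAG")
)

# Add an illustrative helper column and keep sequences of at least 9 bases.
long.seqs <- fdta %>%
  mutate(Length = str_length(Sequence)) %>%
  filter(Length >= 9)

# Write the filtered table to a temporary file. The extra Length column
# is expected to be left out of the file.
out.file <- tempfile(fileext = ".fasta")
writeFasta(long.seqs, out.file = out.file)

# Read the file back to inspect what was actually stored.
back <- readFasta(out.file)
colnames(back)
```

Keeping helper columns in the tibble while filtering avoids a separate clean-up step before writing.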
The default width = 0 in writeFasta results in no linebreaks in the sequences
(one sequence per line).
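To see the effect of width, the following sketch writes the same toy data with a line width of 10 and inspects the raw lines of the file. The data and the temporary file are assumptions made for illustration, not part of the package.

```r
library(microseq)

# Hypothetical toy data with the two columns writeFasta expects.
fdta <- tibble::tibble(
  Header   = c("seq1 toy sequence", "seq2 toy sequence"),
  Sequence = c("ATGGCTAGCTAGGCTTGA", "ATGCCCTAA")
)

# Write with at most 10 characters per sequence line.
tmp <- tempfile(fileext = ".fasta")
writeFasta(fdta, out.file = tmp, width = 10)

# Inspect the raw file: header lines start with '>', sequence lines are wrapped.
lines <- readLines(tmp)
seq.lines <- lines[!grepl("^>", lines)]
print(lines)

# Reading the file back should join the wrapped lines into one sequence each.
back <- readFasta(tmp)
```

A positive width only changes the layout of the file; the sequences read back by readFasta should be the same as those written.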
readFasta returns a tibble with the contents of the (gzipped) FASTA
file stored in two columns of text. The first, named ‘Header’, contains
the header lines and the second, named ‘Sequence’, contains the sequences.
writeFasta produces a (gzipped) FASTA file.
Lars Snipen and Kristian Hovde Liland.
readFastq.
## Not run:
# We need a FASTA-file to read, here is one example file:
fa.file <- file.path(path.package("microseq"), "extdata", "small.ffn")
# Read and write
fdta <- readFasta(fa.file)
ok <- writeFasta(fdta[4:5,], out.file = "delete_me.fasta")
# Make use of dplyr and stringr to copy parts of the file to another file
library(dplyr)
library(stringr)
readFasta(fa.file) %>%
  filter(str_detect(Sequence, "TGA$")) %>%
  writeFasta(out.file = "TGAstop.fasta", width = 80) -> ok
## End(Not run)