read_fasta: Load a fasta file.
In castor: Efficient Phylogenetics on Large Trees

read_fasta

R Documentation

Load a fasta file.

Description

Efficiently load headers and sequences from a fasta file.

Usage

read_fasta(file,
           include_headers     = TRUE,
           include_sequences   = TRUE,
           truncate_headers_at = NULL)

Arguments

`file`	A character, path to the input fasta file. This file may be gzipped with extension ".gz".
`include_headers`	Logical, whether to load the headers. If you don't need the headers you can set this to `FALSE` for efficiency.
`include_sequences`	Logical, whether to load the sequences. If you don't need the sequences you can set this to `FALSE` for efficiency.
`truncate_headers_at`	Optional character, a search string ("needle") at which to truncate headers. Everything from the first occurrence of the needle onward, including the needle itself, is removed from each header. If `NULL`, headers are not truncated.
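
The truncation rule described for `truncate_headers_at` can be illustrated in base R. This is only a sketch of the documented behavior, not castor's implementation; `truncate_at` is a hypothetical helper, and the example headers are made up.

```r
# Illustration only: mimic the documented truncation rule in base R.
# truncate_at() is a hypothetical helper, not part of castor.
truncate_at = function(headers, needle){
  # position of the first literal match in each header, -1 if there is none
  pos = as.integer(regexpr(needle, headers, fixed=TRUE))
  # keep everything before the needle; leave headers without a match unchanged
  ifelse(pos > 0, substr(headers, 1, pos - 1), headers)
}

headers   = c("seq1 Escherichia coli", "seq2|partial", "seq3")
truncated = truncate_at(headers, " ")
print(truncated)
```

With `read_fasta` itself, the corresponding call would presumably be `read_fasta(file="myfasta.fasta.gz", truncate_headers_at=" ")`, which keeps only the part of each header before the first space.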

Details

This function is a fast and simple fasta loader. Note that all sequences and headers are loaded into memory at once.
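
Because everything requested is held in memory, it can help to skip the part you do not need. The following is a minimal sketch, assuming castor is installed; the temporary file and its two records are made up for illustration.

```r
# Write a tiny two-record fasta file to a temporary location (made-up data).
tmp = tempfile(fileext=".fasta")
writeLines(c(">seqA first record", "ACGTACGT",
             ">seqB second record", "TTGACA"), tmp)

# Load only the headers; sequences are skipped to reduce memory use.
# The call is guarded so that the sketch also runs where castor is missing.
if(requireNamespace("castor", quietly=TRUE)){
  fasta = castor::read_fasta(file=tmp,
                             include_headers=TRUE,
                             include_sequences=FALSE)
  if(fasta$success) print(fasta$headers)
}
```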

Value

A named list with the following elements:

`success`	Logical, indicating whether the file was loaded successfully. If `FALSE`, an error message is provided in the element `error`, and all other elements may be undefined.
`headers`	Character vector, listing the loaded headers in the order encountered. Only included if `include_headers` was `TRUE`.
`sequences`	Character vector, listing the loaded sequences in the order encountered. Only included if `include_sequences` was `TRUE`.
`Nlines`	Integer, number of lines encountered.
`Nsequences`	Integer, number of sequences encountered.
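
Since a failed load is reported through `success` and `error`, it is usually worth checking `success` before using the other elements. The sketch below uses a hypothetical helper, `require_fasta`, and mock lists that follow the structure documented above; the error text is invented for illustration.

```r
# Hypothetical helper (not part of castor): stop early if loading failed.
require_fasta = function(fasta){
  if(!isTRUE(fasta$success)){
    stop("Could not load fasta file: ", fasta$error)
  }
  return(fasta)
}

# Mock return values with the documented structure, for illustration only.
ok_result  = list(success=TRUE, headers=c("seqA"), sequences=c("ACGT"),
                  Nlines=2L, Nsequences=1L)
bad_result = list(success=FALSE, error="hypothetical error message")

checked = require_fasta(ok_result)
failure = tryCatch(require_fasta(bad_result),
                   error=function(e) conditionMessage(e))
```

In practice the helper would wrap the actual call, for example `fasta = require_fasta(read_fasta(file="myfasta.fasta.gz"))`.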

Author(s)

Stilianos Louca

Examples

## Not run: 
# load a gzipped fasta file
fasta = read_fasta(file="myfasta.fasta.gz")

# print the first sequence
cat(fasta$sequences[1])

## End(Not run)

castor documentation built on Aug. 25, 2025, 1:10 a.m.