fasdirdf: Read in all FASTA files in a directory as a data.frame.

View source: R/fastodf.R

fasdirdf    R Documentation

Read in all FASTA files in a directory as a data.frame.

Description

Wrapper around fastodf that reads all FASTA files in a directory into a single data.frame. It also uses list_files internally. At present, fasdirdf requires that all files being read are of the same type (DNA or amino acid).

Usage

fasdirdf(path = NULL, pat = NULL, seqtype = c("DNA", "AA"), incl_filepath = TRUE)

Arguments

path

(character string, mandatory) the path to the directory containing the input FASTA files.

pat

(character string, optional) a regex string as used by list.files to specify which file/directory names should be returned.
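
Because pat is described as a regex in the sense of list.files, a shell-style wildcard such as "*.fasta" may not select exactly the files you expect. The following is a minimal base R sketch of the difference, not part of seqvisr itself; the directory and file names are made up for illustration, and it assumes pat is handed to list.files unchanged.

```r
# Throwaway directory with three empty files (hypothetical names, illustration only).
tmp <- file.path(tempdir(), "fasdirdf_pat_demo")
dir.create(tmp, showWarnings = FALSE)
invisible(file.create(file.path(
  tmp, c("a_testdata.fasta", "b_testdata.fasta.bak", "notes.txt")
)))

# A regular expression anchored at the end of the file name:
# the dot is escaped and "$" excludes the ".bak" file.
regex_hits <- list.files(tmp, pattern = "_testdata\\.fasta$")

# glob2rx() (utils package) converts a shell-style wildcard into an equivalent regex.
glob_regex <- glob2rx("*_testdata.fasta")
glob_hits <- list.files(tmp, pattern = glob_regex)

print(regex_hits)
print(glob_regex)
print(glob_hits)
```

Either form could then be supplied as pat; wrapping a wildcard in glob2rx() keeps the intent of the wildcard while giving list.files the regex it expects.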

seqtype

(character string, optional) the type of sequence being read in; DNA ("DNA") or amino acid ("AA"). Defaults to "DNA".

incl_filepath

(logical, optional) whether the path to the file being read should be included in the data.frame. Defaults to TRUE. Regardless of whether this is set to TRUE or FALSE, a filename column is always included in the output data.frame so that the number of columns stays consistent.

Value

A data.frame with the following columns: seqname, seq, and filename. If incl_filepath is TRUE, filename contains the full path to the input file; otherwise it is set to NA.

Examples

## Not run: 
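# Input data: the directory that holds the package's bundled test FASTA files
inpath <- dirname(system.file("extdata", "cdsearchr_testdata.fasta",
                              package = "seqvisr", mustWork = TRUE))
# Read in some sample amino acid sequences.
# pat is a regex (as used by list.files), so the dot is escaped and the
# pattern is anchored at the end of the file name.
fasdirdf(path = inpath, seqtype = "AA", pat = "_testdata\\.fasta$")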

## End(Not run)


vragh/seqvisr documentation built on April 20, 2024, 10:06 a.m.