fasdirdf: Read in all FASTA files in a directory as a data.frame.
In vragh/seqvisr: Biological Sequence Visualization and Auxiliary Functions in R

Description

Wrapper around fastodf that reads all FASTA files in a directory into a single data.frame. It also uses list_files internally. At present, fasdirdf requires that all files being read are of the same type (DNA or amino acid).
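The general pattern described above (list the files, read each one, bind the results) can be sketched in base R. This is only an illustration under stated assumptions, not seqvisr's implementation: `read_fasta_df` is a hypothetical stand-in for fastodf, base `list.files` stands in for list_files, and the input files are created on the fly in a temporary directory.

```r
# Hypothetical minimal FASTA reader (NOT seqvisr's fastodf): one row per record.
read_fasta_df <- function(file) {
  lines <- readLines(file, warn = FALSE)
  lines <- lines[nzchar(lines)]                  # drop empty lines
  is_header <- startsWith(lines, ">")
  n_records <- sum(is_header)
  # Record index for every sequence line; fixed levels keep empty records.
  record_id <- factor(cumsum(is_header)[!is_header], levels = seq_len(n_records))
  seqs <- vapply(split(lines[!is_header], record_id),
                 paste, character(1), collapse = "")
  data.frame(seqname = sub("^>", "", lines[is_header]),
             seq = unname(seqs),
             filename = rep(file, n_records),
             stringsAsFactors = FALSE)
}

# Build a small demo directory with two FASTA files and one unrelated file.
tmpdir <- tempfile("fasta_demo_")
dir.create(tmpdir)
writeLines(c(">a1", "ACGT", "TTGA", ">a2", "GGCC"), file.path(tmpdir, "a.fasta"))
writeLines(c(">b1", "ATAT"), file.path(tmpdir, "b.fasta"))
writeLines("not a fasta file", file.path(tmpdir, "notes.txt"))

# List matching files, read each into a data.frame, and stack the results.
files <- list.files(tmpdir, pattern = "\\.fasta$", full.names = TRUE)
combined <- do.call(rbind, lapply(files, read_fasta_df))
print(combined[, c("seqname", "seq")])
```

Multi-line sequences are concatenated per record, and the unrelated `notes.txt` file is skipped because it does not match the pattern.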

Usage

fasdirdf(path = NULL, pat = NULL, seqtype = c("DNA", "AA"), incl_filepath = TRUE)

Arguments

`path`	(character string, mandatory) the path to the directory containing the input FASTA files.
`pat`	(character string, optional) a regex string as used by list.files to specify which file/directory names should be returned.
`seqtype`	(character string, optional) the type of sequence being read in; DNA ("DNA") or amino acid ("AA"). Defaults to "DNA".
`incl_filepath`	(logical, optional) should the path to the file being read be included in the `data.frame`? Irrespective of whether this is set to TRUE or FALSE, a `filename` column is included in the output `data.frame` to keep the number of columns consistent.
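Because `pat` is described as a regex in the sense of list.files, glob-style wildcards are not interpreted the way a shell would interpret them. The sketch below uses only base R (`list.files` and `utils::glob2rx`) on hypothetical file names in a temporary directory to show two ways of writing such a pattern; it does not call fasdirdf itself.

```r
# Demo directory with hypothetical file names.
tmpdir <- tempfile("pat_demo_")
dir.create(tmpdir)
invisible(file.create(file.path(
  tmpdir, c("x_testdata.fasta", "x_testdata_fasta.bak", "other.fasta")
)))

# Option 1: an explicit regex, with the dot escaped and the end anchored.
matches_regex <- list.files(tmpdir, pattern = "_testdata\\.fasta$")

# Option 2: write a shell-style glob and convert it to a regex with glob2rx().
matches_glob <- list.files(tmpdir, pattern = glob2rx("*_testdata.fasta"))

print(matches_regex)
print(matches_glob)
```

Both calls select only `x_testdata.fasta`; an unescaped, unanchored pattern such as `"_testdata.fasta"` would also match `x_testdata_fasta.bak`, since `.` matches any character.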

Value

A data.frame with the following columns: seqname, seq, and filename. If incl_filepath is TRUE, filename contains the full path to the input file; otherwise it is set to NA.
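As a rough illustration of working with a result of this shape, the sketch below builds a mock data.frame with the three documented columns (the values and paths are made up, not real fasdirdf output) and derives per-sequence lengths and per-file record counts with base R.

```r
# Mock result with the documented columns; values are invented for illustration.
res <- data.frame(
  seqname  = c("s1", "s2", "s3"),
  seq      = c("MKV", "MA", "MKVLA"),
  filename = c("/data/a.fasta", "/data/a.fasta", "/data/b.fasta"),
  stringsAsFactors = FALSE
)

# Sequence length per record.
res$seq_length <- nchar(res$seq)

# Number of records per input file (meaningful only when filename is not NA).
records_per_file <- table(basename(res$filename))

print(res[, c("seqname", "seq_length")])
print(records_per_file)
```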

Examples

## Not run: 
#Input data
inpath <- dirname(system.file("extdata", "cdsearchr_testdata.fasta",
                              package = "seqvisr", mustWork = TRUE))
#Reading in some sample amino acid sequences
fasdirdf(path = inpath, seqtype = "AA", pat = "_testdata\\.fasta$")

## End(Not run)

vragh/seqvisr documentation built on April 20, 2024, 10:06 a.m.