Read large DNA alignments into R
fasta2DNAbin reads alignments in FASTA
format (extensions ".fasta", ".fas", or ".fa") and outputs a
DNAbin object (the efficient DNA representation from the
ape package). The output contains either the full alignment or only
the SNPs. This implementation is designed for memory efficiency
and can read larger datasets than ape's own import function.
The function reads data in chunks of a few genomes (minimum 1, no
maximum) at a time, which makes it possible to read massive datasets
with very low RAM requirements, at the cost of longer computation
time. The argument chunkSize sets the number of genomes read at a
time. Increasing this value shortens the time needed to read the data
but increases memory requirements.
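The chunked strategy described above can be sketched in base R. This is only an illustration of the idea, not adegenet's actual code: the helper name read_fasta_chunked and its arguments are hypothetical, and it assumes an alignment in which every record uses the same number of lines and contains no blank lines.

```r
# Hypothetical helper (not part of adegenet): read an aligned FASTA file
# chunk_size genomes at a time and return a character matrix with one row
# per genome and one column per site.
read_fasta_chunked <- function(path, chunk_size = 10L) {
  stopifnot(chunk_size >= 1)

  # Pass 1: count the lines of one record (header + sequence lines) by
  # reading line by line until the second header or the end of the file.
  con <- file(path, open = "r")
  readLines(con, n = 1L)  # skip the first header
  lines_per_record <- 1L
  while (length(line <- readLines(con, n = 1L)) == 1L && !startsWith(line, ">")) {
    lines_per_record <- lines_per_record + 1L
  }
  close(con)

  # Pass 2: read chunk_size records per iteration, so only one chunk of raw
  # text is held in memory at a time.
  con <- file(path, open = "r")
  on.exit(close(con))
  rows <- list()
  repeat {
    txt <- readLines(con, n = chunk_size * lines_per_record)
    if (length(txt) == 0L) break
    is_header <- startsWith(txt, ">")
    record_id <- cumsum(is_header)
    # Join the sequence lines that belong to the same record.
    seqs <- vapply(split(txt[!is_header], record_id[!is_header]),
                   paste, character(1), collapse = "")
    names(seqs) <- sub("^>", "", txt[is_header])
    rows <- c(rows, strsplit(tolower(seqs), ""))
  }
  aln <- do.call(rbind, unname(rows))
  rownames(aln) <- names(rows)
  aln
}

# Toy alignment: 3 genomes, 8 sites, 2 sequence lines per genome.
fasta_path <- tempfile(fileext = ".fasta")
writeLines(c(">g1", "acgt", "acgt",
             ">g2", "acgt", "aagt",
             ">g3", "acgt", "acga"), fasta_path)

aln <- read_fasta_chunked(fasta_path, chunk_size = 2L)

# Keep only polymorphic sites (a simple stand-in for "SNPs only").
is_snp <- apply(aln, 2, function(site) length(unique(site)) > 1L)
snps <- aln[, is_snp, drop = FALSE]
```

A larger chunk_size means fewer read operations but more text held in memory per iteration, which is the trade-off the chunkSize argument controls.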
a character string giving the path to the file to convert, with the extension ".fa", ".fas", or ".fasta".
a logical stating whether conversion messages should be printed (FALSE, default) or not (TRUE).
an integer indicating the number of genomes to be read at a time; larger values require more RAM but decrease the time needed to read the data.
a logical indicating whether only SNPs should be returned.
an object of the class DNAbin.
Thibaut Jombart firstname.lastname@example.org
?DNAbin for a description of the class
read.snp: read SNPs in adegenet's '.snp' format.
read.PLINK: read SNPs in PLINK's '.raw' format.
df2genind: convert any multiallelic markers into a genind object.
import2genind: read multiallelic markers from various
software into adegenet.
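A minimal usage sketch follows. It assumes that adegenet is installed and that the arguments are named quiet, chunkSize and snpOnly, matching the descriptions above. Only chunkSize is named on this page, so check the other two with args(fasta2DNAbin). The toy FASTA file is made up for illustration.

```r
# Write a small toy alignment to a temporary ".fasta" file.
fasta_file <- tempfile(fileext = ".fasta")
writeLines(c(">g1", "acgtacgt",
             ">g2", "acgtaagt",
             ">g3", "acgtacga"), fasta_file)

# Run only if adegenet is available; argument names other than chunkSize
# are assumptions to verify against the installed version.
if (requireNamespace("adegenet", quietly = TRUE)) {
  # Full alignment, read 2 genomes at a time.
  full <- adegenet::fasta2DNAbin(fasta_file, quiet = TRUE, chunkSize = 2)
  print(full)

  # SNPs only.
  snps_only <- adegenet::fasta2DNAbin(fasta_file, quiet = TRUE,
                                      chunkSize = 2, snpOnly = TRUE)
  print(snps_only)
}
```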