gl.read.fasta: Reads FASTA files and converts them to a genlight object

gl.read.fasta    R Documentation

Reads FASTA files and converts them to a genlight object

Description

The following IUPAC Ambiguity Codes are taken as heterozygotes:

  • M is heterozygote for AC and CA

  • R is heterozygote for AG and GA

  • W is heterozygote for AT and TA

  • S is heterozygote for CG and GC

  • Y is heterozygote for CT and TC

  • K is heterozygote for GT and TG

The following IUPAC Ambiguity Codes are taken as missing data:

  • V

  • H

  • D

  • B

  • N
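
The two lists above amount to a lookup from each FASTA character to a pair of alleles. The following is a minimal sketch of that mapping, not dartR's internal code: the names iupac_het, iupac_missing and expand_base are hypothetical, and treating any other unrecognised character (for example a gap) as missing is an assumption made here for illustration.

```r
# Illustrative lookup table; not taken from the dartR source.
iupac_het <- list(
  M = c("A", "C"),
  R = c("A", "G"),
  W = c("A", "T"),
  S = c("C", "G"),
  Y = c("C", "T"),
  K = c("G", "T")
)
iupac_missing <- c("V", "H", "D", "B", "N")

# Hypothetical helper: expand one FASTA character into its two alleles.
# Returns c(NA, NA) for codes treated as missing data.
expand_base <- function(base) {
  base <- toupper(base)
  if (base %in% c("A", "C", "G", "T")) {
    return(c(base, base))            # unambiguous base: homozygote
  }
  if (base %in% names(iupac_het)) {
    return(iupac_het[[base]])        # two-base ambiguity code: heterozygote
  }
  # V, H, D, B, N, and (by assumption) any other character
  c(NA_character_, NA_character_)
}

expand_base("R")   # heterozygote A/G
expand_base("N")   # missing
```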

The function can handle individuals that are missing from some files, e.g. when the FASTA files contain different numbers of individuals because of missing data.

The allele with the highest frequency is taken as the reference allele.

SNPs with more than two alleles are skipped.
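
The two rules above can be sketched for a single alignment column as follows. This is an illustration under stated assumptions, not the function's actual implementation: code_site is a hypothetical helper, genotypes are coded as counts of the non-reference allele (0, 1 or 2), monomorphic sites are skipped along with sites that have more than two alleles, and ties in allele frequency are broken alphabetically.

```r
# Hypothetical helper: code one alignment column as counts of the
# alternate allele, or return NULL if the site is not biallelic.
# genotypes: list of length-2 character vectors, c(NA, NA) for missing data.
code_site <- function(genotypes) {
  alleles <- unlist(genotypes)
  counts <- table(alleles)           # table() drops NA alleles by default
  if (length(counts) != 2) {
    return(NULL)                     # monomorphic, or more than two alleles
  }
  ref <- names(counts)[which.max(counts)]   # most frequent allele
  vapply(genotypes, function(g) {
    if (anyNA(g)) NA_integer_ else sum(g != ref)
  }, integer(1))
}

site <- list(c("A", "A"), c("A", "G"), c("G", "G"), c("A", "A"), c(NA, NA))
code_site(site)
```

Here A is the reference allele because it occurs five times against three for G, so the individuals are coded 0, 1, 2, 0 and NA.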

Usage

gl.read.fasta(fasta_files, parallel = FALSE, n_cores = NULL, verbose = NULL)

Arguments

fasta_files

Fasta files to read [required].

parallel

A logical indicating whether multiple cores, if available, should be used for the computations (TRUE) or not (FALSE); requires the package parallel to be installed [default FALSE].

n_cores

If parallel is TRUE, the number of cores to be used in the computations; if NULL, then the maximum number of cores available on the computer is used [default NULL].

verbose

Verbosity: 0, silent or fatal errors; 1, begin and end; 2, progress log; 3, progress and results summary; 5, full report [default 2, unless specified using gl.set.verbosity].

Details

Ambiguity characters are often used to code heterozygotes. However, coding heterozygotes as ambiguity characters may bias many estimates. For more information, see: https://evodify.com/heterozygotes-ambiguity-characters/

Value

A genlight object.

Author(s)

Custodian: Luis Mijangos – Post to https://groups.google.com/d/forum/dartr

Examples

 # Folder where the FASTA files are located.
 folder_samples <- system.file('extdata', package = 'dartR')
 # List the FASTA files, including their path. Files have an extension
 # that contains "fas". Note that 'pattern' is a regular expression,
 # not a shell glob.
 file_names <- list.files(path = folder_samples, pattern = "\\.fas",
                          full.names = TRUE)
 # Read the FASTA files
 obj <- gl.read.fasta(file_names)

green-striped-gecko/dartR documentation built on Sept. 7, 2024, 4:15 a.m.