gl.read.silicodart: Imports presence/absence data from SilicoDArT to genlight...

View source: R/gl.read.silicodart.r

gl.read.silicodart    R Documentation

Imports presence/absence data from SilicoDArT to genlight {adegenet} format (ploidy=1)

Description

DArT provides the data as a matrix of entities (individual animals) across the top and attributes (P/A of sequenced fragment) down the side, in a format that is unique to DArT. This program reads the data into adegenet format for consistency with other programming activity. The script may require modification, as DArT modifies its data formats from time to time.

Usage

gl.read.silicodart(
  filename,
  ind.metafile = NULL,
  nas = "-",
  topskip = NULL,
  lastmetric = "Reproducibility",
  probar = TRUE,
  verbose = NULL
)

Arguments

filename

Name of csv file containing the SilicoDArT data [required].

ind.metafile

Name of csv file containing metadata assigned to each entity (individual) [default NULL].

nas

Missing data character [default '-'].

topskip

Number of rows to skip before the header row (containing the specimen identities) [optional].

lastmetric

Specifies the last non-genetic column, either by name or by column number. Check that this is correct for your file; otherwise the number of individuals will not match [default 'Reproducibility'].

probar

Show progress bar [default TRUE].

verbose

Verbosity: 0, silent or fatal errors; 1, begin and end; 2, progress log; 3, progress and results summary; 5, full report [default 2, or as set by gl.set.verbose()].

Details

gl.read.silicodart() opens the data file (comma-delimited csv) and skips the first n = topskip lines. The script assumes that the next line contains the entity labels (specimen ids), followed immediately by the presence/absence data for the first locus.
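
As a rough illustration of the layout described above, the base-R sketch below builds a tiny mock file and locates the header row and the last metric column. The mock file contents, the use of '*' to mark the filler rows above the header, and all variable names are assumptions made for illustration; they are not taken from the dartR source.

```r
# Write a tiny mock SilicoDArT-style file (contents invented for illustration).
mock <- c(
  "*,*,*,ord1,ord1",
  "*,*,*,plateA,plateA",
  "CloneID,PIC,Reproducibility,ind1,ind2",
  "1001,0.42,1,1,0",
  "1002,0.10,0.98,-,1"
)
tmp <- tempfile(fileext = ".csv")
writeLines(mock, tmp)

# Assumption: filler rows above the header start with "*" in the first column.
peek <- read.csv(tmp, header = FALSE, nrows = 20, stringsAsFactors = FALSE)
topskip <- sum(peek[, 1] == "*")

# Read again, skipping the filler rows so the next line becomes the header.
raw <- read.csv(tmp, skip = topskip, na.strings = "-", check.names = FALSE)

# Columns up to and including lastmetric are locus metadata;
# the remaining columns are individuals.
lastmetric <- "Reproducibility"
lmet <- which(names(raw) == lastmetric)
ind.names <- names(raw)[(lmet + 1):ncol(raw)]
```

If the individual names recovered this way do not match what you expect, the value passed to lastmetric is the first thing to check.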

It reads the presence/absence data into a matrix of 1s and 0s, and inputs the locus metadata and specimen metadata. The locus metadata comprises a series of columns of values for each locus, including the essential column CloneID and the desirable variables Reproducibility and PIC. Refer to the documentation provided by DArT for an explanation of these columns.
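
The split between locus metadata and the 1/0 matrix can be sketched in base R as follows. This is a minimal illustration on a mock table, not the function's actual implementation; the data values and the names loc.metrics, calls and pa are hypothetical.

```r
# Mock table: locus metadata columns followed by one column per individual.
raw <- data.frame(
  CloneID = c(1001, 1002, 1003),
  Reproducibility = c(1, 0.98, 1),
  ind1 = c("1", "-", "0"),
  ind2 = c("0", "1", "1"),
  stringsAsFactors = FALSE
)
lmet <- which(names(raw) == "Reproducibility")

# Split locus metadata from the presence/absence columns.
loc.metrics <- raw[, seq_len(lmet), drop = FALSE]
calls <- as.matrix(raw[, (lmet + 1):ncol(raw), drop = FALSE])

# Recode the missing-data character as NA, then convert to integers.
calls[calls == "-"] <- NA
pa <- matrix(
  as.integer(calls),
  nrow = nrow(calls),
  dimnames = list(as.character(raw$CloneID), colnames(calls))
)

# Transpose so individuals are in rows and loci in columns,
# which is the orientation a genlight object uses.
pa <- t(pa)
```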

The specimen metadata provides the opportunity to reassign specimens to populations and to add other data relevant to each specimen. The key variables are id (specimen identity; each id must be unique, and the ids must match those in the SilicoDArT file and appear in the same order), pop (population assignment), lat (latitude, optional) and lon (longitude, optional). id, pop, lat and lon are the column headers in the csv file. Other optional columns can be added.
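
A metadata file with these headers could be prepared along the following lines. The specimen names, coordinates and the extra sex column are invented for illustration, and the checks shown are only a suggestion for catching duplicate or mis-ordered ids before reading the data.

```r
# Individual names in the order they appear in the SilicoDArT header (mock values).
ind.names <- c("ind1", "ind2", "ind3")

# Mock metadata with the four recognised headers plus one optional extra column.
meta <- data.frame(
  id  = ind.names,
  pop = c("north", "north", "south"),
  lat = c(-35.28, -35.30, -36.10),
  lon = c(149.13, 149.10, 148.90),
  sex = c("F", "M", "F"),  # hypothetical optional column
  stringsAsFactors = FALSE
)

# Basic checks: ids are unique and in the same order as the data file.
stopifnot(!anyDuplicated(meta$id), identical(meta$id, ind.names))

# Write the file; its path can then be passed as ind.metafile.
metafile <- tempfile(fileext = ".csv")
write.csv(meta, metafile, row.names = FALSE)
```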

The data matrix, locus names (forced to be unique), locus metadata, specimen names and specimen metadata are combined into a genlight object. Refer to the documentation for {adegenet} for further details.

Value

An object of class genlight with ploidy set to 1, containing the presence/absence data, and locus and individual metadata.

Author(s)

Custodian: Bernd Gruber – Post to https://groups.google.com/d/forum/dartr

See Also

gl.read.dart

Examples

silicodartfile <- system.file('extdata','testset_SilicoDArT.csv', package='dartR')
metadata <- system.file('extdata', 'testset_metadata_silicodart.csv', package='dartR')
testset.gs <- gl.read.silicodart(filename = silicodartfile, ind.metafile = metadata)

green-striped-gecko/dartR documentation built on Sept. 7, 2024, 4:15 a.m.