read_geno_csv: Data Input in CSV format

View source: R/read_mappoly_csv.R

read_geno_csvR Documentation

Data Input in CSV format

Description

Reads an external comma-separated values (CSV) data file. The format of the file is described in the Details section. This function creates an object of class mappoly.data.

Usage

read_geno_csv(
  file.in,
  ploidy,
  filter.non.conforming = TRUE,
  elim.redundant = TRUE,
  verbose = TRUE
)

Arguments

file.in

a character string with the name of (or full path to) the input file containing the data to be read

ploidy

the ploidy level

filter.non.conforming

if TRUE (default) converts data points with unexpected genotypes (i.e. no double reduction) to 'NA'. See function segreg_poly for information on expected classes and their respective frequencies.

elim.redundant

logical. If TRUE (default), removes redundant markers during map construction, keeping them annotated to export to the final map.

verbose

if TRUE (default), the current progress is shown; if FALSE, no output is produced

Details

This is an alternative and a somewhat more straightforward version of the function read_geno. The input is a standard CSV file where the rows represent the markers, except for the first row which is used as a header. The first five columns contain the marker names, the dosage in parents 1 and 2, the chromosome information (i.e. chromosome, scaffold, contig, etc) and the position of the marker within the sequence. The remaining columns contain the dosage of the full-sib population. A tetraploid example of such file can be found in the Examples section.

Value

An object of class mappoly.data which contains a list with the following components:

ploidy

ploidy level

n.ind

number individuals

n.mrk

total number of markers

ind.names

the names of the individuals

mrk.names

the names of the markers

dosage.p1

a vector containing the dosage in parent P for all n.mrk markers

dosage.p2

a vector containing the dosage in parent Q for all n.mrk markers

chrom

a vector indicating which sequence each marker belongs. Zero indicates that the marker was not assigned to any sequence

genome.pos

Physical position of the markers into the sequence

seq.ref

NULL (unused in this type of data)

seq.alt

NULL (unused in this type of data)

all.mrk.depth

NULL (unused in this type of data)

geno.dose

a matrix containing the dosage for each markers (rows) for each individual (columns). Missing data are represented by ploidy_level + 1

n.phen

number of phenotypic traits

phen

a matrix containing the phenotypic data. The rows correspond to the traits and the columns correspond to the individuals

kept

if elim.redundant = TRUE, holds all non-redundant markers

elim.correspondence

if elim.redundant = TRUE, holds all non-redundant markers and its equivalence to the redundant ones

Author(s)

Marcelo Mollinari, mmollin@ncsu.edu, with minor changes by Gabriel Gesteira, gdesiqu@ncsu.edu

References

Mollinari M., Olukolu B. A., Pereira G. da S., Khan A., Gemenet D., Yencho G. C., Zeng Z-B. (2020), Unraveling the Hexaploid Sweetpotato Inheritance Using Ultra-Dense Multilocus Mapping, _G3: Genes, Genomes, Genetics_. doi: 10.1534/g3.119.400620

Mollinari, M., and Garcia, A. A. F. (2019) Linkage analysis and haplotype phasing in experimental autopolyploid populations with high ploidy level using hidden Markov models, _G3: Genes, Genomes, Genetics_. doi: 10.1534/g3.119.400378

Examples


#### Tetraploid Example
ft = "https://raw.githubusercontent.com/mmollina/MAPpoly_vignettes/master/data/tetra_solcap.csv"
tempfl <- tempfile()
download.file(ft, destfile = tempfl)
SolCAP.dose <- read_geno_csv(file.in  = tempfl, ploidy = 4)
print(SolCAP.dose, detailed = TRUE)
plot(SolCAP.dose)


mappoly documentation built on Jan. 6, 2023, 1:16 a.m.