DT2RADdata: Convert a long-format data table into polyRAD's RADdata...

View source: R/DT2RADdata.R

DT2RADdata    R Documentation

Convert a long-format data table into polyRAD's RADdata object class.

Description

Generates a RADdata class object, as used by the polyRAD package (Clark et al. 2019), from a long-format data table of read counts.

Usage

DT2RADdata(
  dat,
  chromCol = "CHROM",
  posCol = "POS",
  locusCol = "LOCUS",
  sampCol = "SAMPLE",
  refCol = "REF",
  roCol = "RO",
  altCol = "ALT",
  aoCol = "AO",
  possPloidy = list(2L),
  sampPloidy = 2L,
  contamRate = 0.001
)

Arguments

dat

Data.table: A data table of read counts for genotyped loci. Expects the columns:

  1. The chromosome (contig) ID (see param chromCol).

  2. The position information (see param posCol).

  3. The locus ID (see param locusCol).

  4. The sample ID (see param sampCol).

  5. The reference allele nucleotides (see param refCol).

  6. The reference allele read counts (see param roCol).

  7. The alternate allele nucleotides (see param altCol).

  8. The alternate allele read counts (see param aoCol).
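
The eight-column layout listed above can be illustrated with a small toy table. This is only a sketch: the contig names, positions, sample IDs and read counts are invented for illustration, and the column names simply follow the function's defaults. The conversion to a data.table is wrapped in a check because it assumes the data.table package is installed.

```r
# Toy long-format table: one row per sample-by-locus combination.
# All values below are invented for illustration only.
toy <- data.frame(
  CHROM  = rep(c("Contig1", "Contig2"), each = 3),
  POS    = rep(c(101L, 250L), each = 3),
  LOCUS  = rep(c("Contig1_101", "Contig2_250"), each = 3),
  SAMPLE = rep(c("Ind1", "Ind2", "Ind3"), times = 2),
  REF    = rep(c("A", "G"), each = 3),
  RO     = c(12L, 0L, 7L, 5L, 9L, 0L),
  ALT    = rep(c("T", "C"), each = 3),
  AO     = c(0L, 15L, 6L, 4L, 0L, 11L),
  stringsAsFactors = FALSE
)

# Check that every column named by the default *Col arguments is present.
required_cols <- c("CHROM", "POS", "LOCUS", "SAMPLE", "REF", "RO", "ALT", "AO")
missing_cols <- setdiff(required_cols, names(toy))
stopifnot(length(missing_cols) == 0)

# dat is documented as a data.table, so convert if the package is available.
if (requireNamespace("data.table", quietly = TRUE)) {
  toy <- data.table::as.data.table(toy)
}
```

If the columns in a real dataset are named differently, the corresponding chromCol, posCol, locusCol, sampCol, refCol, roCol, altCol and aoCol arguments can point to them instead of renaming the table.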

chromCol

Character: The column name with the chromosome information. Default = 'CHROM'.

posCol

Character: The column name with the position information. Default = 'POS'.

locusCol

Character: The column name with the locus information. Default = 'LOCUS'.

sampCol

Character: The column name with the sampled individual information. Default = 'SAMPLE'.

refCol

Character: The column with the reference allele nucleotide information. Default = 'REF'.

roCol

Character: The column with the reference allele read count information. Default = 'RO'.

altCol

Character: The column with the alternate allele nucleotide information. Default = 'ALT'.

aoCol

Character: The column with the alternate allele read count information. Default = 'AO'.

possPloidy

List: A list of integers or numerics that represent the unique ploidy values in the dataset. Default is list(2L), i.e., all samples have a ploidy of 2. For example, list(2, 4) would represent possible ploidies of 2 and 4.

sampPloidy

Integer/Numeric: Either a single value or a named vector of values for each sample. Default is a single value, 2, i.e., all samples have a ploidy of 2. A vector c('Ind1'=2, 'Ind2'=4, 'Ind3'=2), for example, is a vector of ploidies for three individuals, with ploidy values of 2, 4, and 2 for individuals 1, 2 and 3, respectively.
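
As a rough sketch of how the two ploidy arguments relate, the named per-sample vector can be built first and the list of unique ploidies derived from it. The sample names and the object names sample_ids, samp_ploidy and poss_ploidy are hypothetical, and deriving possPloidy this way is an assumption based on the argument descriptions above, not a documented requirement of the function.

```r
# Hypothetical sample IDs and their ploidies (invented for illustration).
sample_ids  <- c("Ind1", "Ind2", "Ind3")
samp_ploidy <- c(2L, 4L, 2L)
names(samp_ploidy) <- sample_ids

# The unique ploidy values, as a list, in the form described for possPloidy.
poss_ploidy <- as.list(sort(unique(unname(samp_ploidy))))

# Every sample's ploidy should appear among the possible ploidies.
stopifnot(all(samp_ploidy %in% unlist(poss_ploidy)))
```

These objects would then be passed as sampPloidy = samp_ploidy and possPloidy = poss_ploidy.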

contamRate

Numeric: The contamination rate. Default is 0.001.

References

Clark et al. (2019). G3. DOI: 10.1534/g3.118.200913

Examples

library(genomalicious)
data(data_Genos)

# Using a single ploidy
RD1 <- DT2RADdata(data_Genos, sampPloidy=2)

# Using a vector of ploidies.
samps_uniq <- unique(data_Genos$SAMPLE)
samps_ploid <- rep(2, length(samps_uniq))
names(samps_ploid) <- samps_uniq

samps_ploid

RD2 <- DT2RADdata(data_Genos, sampPloidy=samps_ploid)

# Specifying multiple ploidies.
samps_ploid[20:40] <- 4

table(samps_ploid)

RD3 <- DT2RADdata(data_Genos, possPloidy=list(2, 4), sampPloidy=samps_ploid)


j-a-thia/genomalicious documentation built on Oct. 19, 2024, 7:51 p.m.