R/data.R

#' Synthetic x Opata genetic map, 92 DH lines
#' 
#' A data frame adapted from Supplementary Dataset S7 of Gutierrez-Gonzalez et
#' al. (2019), containing map data on SNPs generated by a PstI-MspI double-digest
#' genotyping-by-sequencing (GBS) protocol, with physical positions from 
#' the International Wheat Genome Sequencing Consortium (IWGSC) v1.0 
#' Chinese Spring wheat Reference Sequence and genetic positions derived from a
#' population of 92 doubled-haploid (DH) lines descended from the Synthetic x 
#' Opata cross. Gutierrez-Gonzalez et al. performed imputation using the R
#' package LaByRInth (https://github.com/Dordt-Statistics-Research/LaByRInth),
#' which is an adaptation of the LB-Impute algorithm. Several outlier SNPs were
#' removed from this dataset, leaving 92,272 markers.
#' 
#' The columns are in the order required for PLINK .map files. Note, however,
#' that many programs that read PLINK files require the header row to be
#' removed, and may require genetic distances to be converted from centiMorgans
#' to Morgans.
#' 
#' @format A data frame with 7,742 rows and 4 variables:
#' \describe{
#'    \item{chr}{chromosome name}
#'    \item{id}{SNP identifier}
#'    \item{dist}{genetic position, in centiMorgans (cM)}
#'    \item{pos}{physical position, in base pairs (bp)}
#' }
#' @source \url{https://doi.org/10.1038/s41598-018-38111-3}
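#' @examples
#' # A minimal sketch, not part of the original documentation: export the map
#' # as a header-less PLINK .map file with genetic distances in Morgans.
#' # Assumptions: the data frame has the chr, id, dist and pos columns
#' # described above, and the output path (a temporary file) is hypothetical.
#' map <- synop_dh92[, c("chr", "id", "dist", "pos")]
#'
#' # Convert genetic positions from centiMorgans to Morgans
#' map$dist <- map$dist / 100
#'
#' # Keep large base-pair positions out of scientific notation (e.g. 1e+08)
#' map$pos <- format(map$pos, scientific = FALSE, trim = TRUE)
#'
#' # Write tab-delimited text without the header row or row names
#' out <- file.path(tempdir(), "synop_dh92.map")
#' write.table(map, file = out, sep = "\t", quote = FALSE,
#'             row.names = FALSE, col.names = FALSE)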
"synop_dh92"

#' Synthetic x Opata genetic map, 906 RILs
#' 
#' A data frame adapted from Supplementary Dataset S6 of Gutierrez-Gonzalez et
#' al. (2019), containing map data on SNPs generated by a PstI-MspI double-digest
#' genotyping-by-sequencing (GBS) protocol, with physical positions from 
#' the International Wheat Genome Sequencing Consortium (IWGSC) v1.0 
#' Chinese Spring wheat Reference Sequence and genetic positions derived from a
#' population of 906 recombinant inbred lines (RILs) descended from the Synthetic 
#' x Opata cross. Gutierrez-Gonzalez et al. performed imputation using the R
#' package LaByRInth (https://github.com/Dordt-Statistics-Research/LaByRInth),
#' which is an adaptation of the LB-Impute algorithm. Several outlier SNPs were
#' removed from this dataset, leaving 46,600 markers.
#' 
#' The columns are in the order required for PLINK .map files. Note, however,
#' that many programs that read PLINK files require the header row to be
#' removed, and may require genetic distances to be converted from centiMorgans
#' to Morgans.
#' 
#' @format A data frame with 4,027 rows and 4 variables:
#' \describe{
#'    \item{chr}{chromosome name}
#'    \item{id}{SNP identifier}
#'    \item{dist}{genetic position, in centiMorgans (cM)}
#'    \item{pos}{physical position, in base pairs (bp)}
#' }
#' @source \url{https://doi.org/10.1038/s41598-018-38111-3}
"synop_ril906"
etnite/bwardr documentation built on Jan. 6, 2023, 7:12 a.m.