Working with the MapFuser package"

Introduction mapfuser package

The R package and the corresponding Shiny application aim to integrate the following common procedures in quantitative genetics and/or plant breeding:

Mapfuser is implemented as shiny app available at shinyapps.io

Getting started

Loading data

Input data can be supplied as delimited file (e.g. csv or tsv) with columns three columns: Marker (marker names), LG (linkage group names), and Position (position in centi Morgan). Alternatively, input data can be supplied in JoinMap format Optionally, a reference map can be loaded to allign linkage groups and compare a consensus map to the reference.

library(mapfuser)

## Load example data
fpath <- system.file("extdata", package="mapfuser")
AT_maps <- list.files(fpath, 
                      pattern = "Col", 
                      full.names = T)

MF.obj <- read_maps(mapfiles = AT_maps, 
                    type = "delim", 
                    sep = ",",
                    mapweights = rep(1,7))

## Load a reference map
ref_file <- list.files(fpath, 
                       pattern = "reference", 
                       full.names = T)

MF.obj <- read_ref(MF.obj = MF.obj, 
                   ref_file = ref_file, 
                   sep = ",", 
                   header = T,
                   na.string = NA,
                   type = "delim")

Quality control

Especially historical genetic maps may have inverted linkage groups. Genetic maps are alligned internally by finding a minimal path through each map weighted by the number of common markers between maps using a minimum spanning tree. Historical maps or small mapping project may also have limited number of markers. However, genetically mapped traits linked to these maps are still valuable to combine with current genetic maps. MapFuser implements a number of steps to ensure data is of sufficient quality to perform map integration and removes input data if necessary.

## Linkage groups with insufficient markers are removed and each linkage group is alligned to the reference. 
MF.obj <- map_qc(MF.obj, anchors = 3)
### Plot input data as network
plot(MF.obj, which = "mapnetwork", chr = 1)
### Plot qc passed data
plot(MF.obj, which = "mst", chr = 1)
##See ?plot.mapfuser for further options

Map Integration

A parallel version of LPmerge is implemented with the option to select the consensus map with lowest overall Root Mean Square Error (RMSE) automatically. The map_export function provides functionality to export a consensus map in JoinMap/mapchart format. The consensus map in simple delimited format is available in the MapFuser object

MF.obj <- LPmerge_par(MF.obj, n.cores = 2, max.interval = 1, max.int_sel = "auto")
plot(MF.obj, which = "single_map", maps = "consensus")

Modeling genetic vs. genomic distance

A thin-plate accurately captures the relationship between genetic and genomic distance. Sampling errors in single data points are smoothed out using the penalized approach, resulting in an accurate fit. The spline fit is possible for both a consensus map created using MapFuser and individual input genetic maps. The z argument can be used to remove severe outliers, which may negatively effect although the penalized spline in general is robust for outliers.

# Fit model
MF.obj <-  genphys_fit(MF.obj, type = "consensus", z = 5)
plot(MF.obj, which = "mareymap", maps = "consensus")

Predicting and interpolating/extrapolating

Often, genomic positions are available for genetic markers. However, genetic maps are difficult to combine when there are no common markers. The predict function can be used after a thin-plate spline is fitted.

# Read a table with positions to interpolate and/or extrapolate
predict_file <- list.files(fpath, 
                       pattern = "BaySha_physical", 
                       full.names = T)
to_predict <- read.table(predict_file, sep = ",", header = T)
# Predict, accessible under MF.obj$predictions
MF.obj <- predict(MF.obj, to_predict)

Further suggestions

For consensus map construction, round the genetic map positions of input data to one digit behind the decimal point to speed up the process. Various genetic map construction tools produce genetic position in centi Morgan up to four digits after the decimal point, which makes no sense unless a population consists of thousands of individuals combined with a flawless genotyping method. Given the relatively large sampling error (typically ±1 cM) in a mapping population (100-300 individiuals), cM positions become erroneous in terms of absolute positions. The LPmerge algorithm treats genetic positions as absolute constraints (marker1 > marker2), thus a small relative distance with a large sampling error leads to conflicting constraints. The conflicts need to be removed in iterative process, increasing the time it takes to create a consensus map.

MapFuser Shiny projects can be saved and loaded as rds objects for easy sharing of results or storage in database after conversion to JSON. Function parameters are saved in the MapFuser object for reproducibility as well as the sessionInfo when a MapFuser object is created using read_maps().



Try the mapfuser package in your browser

Any scripts or data that you put into this service are public.

mapfuser documentation built on Oct. 10, 2017, 5:07 p.m.