est_rf_hmm    R Documentation

Multipoint analysis using Hidden Markov Models in autopolyploids

Description

Performs the multipoint analysis proposed by Mollinari and Garcia (2019) on a sequence of markers.

Usage

est_rf_hmm(
  input.seq,
  input.ph = NULL,
  thres = 0.5,
  twopt = NULL,
  verbose = FALSE,
  tol = 1e-04,
  est.given.0.rf = FALSE,
  reestimate.single.ph.configuration = TRUE,
  high.prec = TRUE
)

## S3 method for class 'mappoly.map'
print(x, detailed = FALSE, ...)

## S3 method for class 'mappoly.map'
plot(
  x,
  left.lim = 0,
  right.lim = Inf,
  phase = TRUE,
  mrk.names = FALSE,
  cex = 1,
  config = "best",
  P = "Parent 1",
  Q = "Parent 2",
  xlim = NULL,
  ...
)

Arguments

input.seq

an object of class mappoly.sequence

input.ph

an object of class two.pts.linkage.phases. If not available (default = NULL), it will be computed

thres

LOD score threshold used to determine whether the linkage phases compared via two-point analysis should be considered. Smaller values result in a smaller number of linkage phase configurations to be evaluated by the multipoint algorithm.

twopt

an object of class mappoly.twopt containing two-point information

verbose

if TRUE, current progress is shown; if FALSE (default), no output is produced

tol

the desired accuracy (default = 1e-04)

est.given.0.rf

logical. If TRUE, returns a map with all recombination fractions forced to be equal to 0 (1e-5). For internal use only (default = FALSE)

reestimate.single.ph.configuration

logical. If FALSE, returns a map without re-estimating the map parameters in cases where there is only one possible linkage phase configuration (default = TRUE). This argument is intended to be used in sequential map construction

high.prec

logical. If TRUE (default) uses high precision long double numbers in the HMM procedure

x

an object of the class mappoly.map

detailed

logical. If TRUE, prints the linkage phase configuration and the marker positions for all maps. If FALSE (default), prints a map summary

...

currently ignored

left.lim

the left limit of the plot (in cM, default = 0).

right.lim

the right limit of the plot (in cM, default = Inf, i.e., the entire map is plotted)

phase

logical. If TRUE (default) plots the phase configuration for both parents

mrk.names

if TRUE, marker names are displayed (default = FALSE)

cex

The magnification to be used for marker names

config

should be 'best' or the position of the configuration to be plotted. If 'best', plots the configuration with the highest likelihood

P

a string containing the name of parent P

Q

a string containing the name of parent Q

xlim

range of the x-axis. If xlim = NULL (default) it uses the map range.

Details

This function first enumerates a set of linkage phase configurations based on two-point recombination fraction information, using a threshold provided by the user (argument thres). Then, for each configuration, it reconstructs the genetic map using the HMM approach described in Mollinari and Garcia (2019). As a result, it returns the multipoint likelihood of each configuration in the form of a LOD score comparing that configuration to the most likely one. It is recommended to use a small number of markers (e.g., 50 markers for hexaploids), since the number of possible linkage phase combinations, bounded only by the two-point information, can be huge. The result can also be quite sensitive to small changes in thres. For a large number of markers, see est_rf_hmm_sequential.
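
The LOD comparison described above can be sketched in base R. This is a minimal illustration, not MAPpoly's internal code: the log-likelihood values are made up, the helper name lod_vs_best is hypothetical, and the values are assumed to be on the log10 scale (natural-log values would first need to be divided by log(10)).

```r
# Hypothetical multipoint log-likelihoods (assumed log10 scale),
# one per linkage phase configuration
loglike <- c(config1 = -152.3, config2 = -150.1, config3 = -157.8)

# LOD of each configuration relative to the most likely one:
# the best configuration gets 0, all others are negative
lod_vs_best <- function(ll) ll - max(ll)

# Sort so that the most likely configuration comes first
lod <- sort(lod_vs_best(loglike), decreasing = TRUE)
print(round(lod, 1))

# Name of the most likely configuration
best <- names(lod)[1]
print(best)
```

Sorting the LOD scores in decreasing order puts the most likely configuration first, which mirrors the idea of ranking configurations by likelihood before choosing which one to plot.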

Value

A list of class mappoly.map with two elements:

i) info: a list containing information about the map, regardless of the linkage phase configuration:

ploidy

the ploidy level

n.mrk

number of markers

seq.num

a vector containing the (ordered) indices of markers in the map, according to the input file

mrk.names

the names of markers in the map

seq.dose.p1

a vector containing the dosage in parent 1 for all markers in the map

seq.dose.p2

a vector containing the dosage in parent 2 for all markers in the map

chrom

a vector indicating the sequence (usually chromosome) each marker belongs to, as given in the input file. If not available, chrom = NULL

genome.pos

physical position (usually in megabases) of the markers in the sequence

seq.ref

reference base used for each marker (i.e. A, T, C, G). If not available, seq.ref = NULL

seq.alt

alternative base used for each marker (i.e. A, T, C, G). If not available, seq.alt = NULL

chisq.pval

a vector containing p-values of the chi-squared test of Mendelian segregation for all markers in the map

data.name

name of the dataset of class mappoly.data

ph.thres

the LOD threshold used to define the linkage phase configurations to test

ii) maps: a list of maps, one for each possible linkage phase configuration. Each map in the list is also a list containing:

seq.num

a vector containing the (ordered) indices of markers in the map, according to the input file

seq.rf

a vector of size (n.mrk - 1) containing the recombination fractions between adjacent markers in the map

seq.ph

linkage phase configuration for all markers in both parents

loglike

the HMM-based multipoint log-likelihood
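
To illustrate how the seq.rf vector relates to map positions, adjacent recombination fractions can be converted to distances in cM and accumulated. The sketch below assumes Haldane's mapping function, d = -50 ln(1 - 2r). The helper name haldane_cm and the recombination fractions are hypothetical, and the mapping function actually applied to a given map should be checked against the package.

```r
# Hypothetical recombination fractions between adjacent markers
# (length n.mrk - 1, here for a map with 5 markers)
seq.rf <- c(0.010, 0.025, 0.005, 0.040)

# Haldane's inverse mapping function: recombination fraction -> cM
# (assumption: Haldane is chosen here only for illustration)
haldane_cm <- function(r) -50 * log(1 - 2 * r)

# Distances between adjacent markers and cumulative positions,
# with the first marker placed at 0 cM
dist.cm <- haldane_cm(seq.rf)
pos.cm <- cumsum(c(0, dist.cm))
print(round(pos.cm, 2))
```

The resulting vector has one position per marker (n.mrk values), one more than the number of recombination fractions.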

Author(s)

Marcelo Mollinari, mmollin@ncsu.edu

References

Mollinari, M., and Garcia, A. A. F. (2019). Linkage analysis and haplotype phasing in experimental autopolyploid populations with high ploidy level using hidden Markov models. G3: Genes, Genomes, Genetics. https://doi.org/10.1534/g3.119.400378

Examples

    mrk.subset <- make_seq_mappoly(hexafake, 1:10)
    red.mrk <- elim_redundant(mrk.subset)
    unique.mrks <- make_seq_mappoly(red.mrk)
    subset.pairs <- est_pairwise_rf(input.seq = unique.mrks,
                                  ncpus = 1,
                                  verbose = TRUE)

    ## Estimating the subset map with a loose tolerance (tol = 0.1)
    ## for the EM procedure, for CRAN testing purposes
    subset.map <- est_rf_hmm(input.seq = unique.mrks,
                             thres = 2,
                             twopt = subset.pairs,
                             verbose = TRUE,
                             tol = 0.1,
                             est.given.0.rf = FALSE)
    subset.map
    ## linkage phase configuration with highest likelihood
    plot(subset.map, mrk.names = TRUE, config = "best")
    ## the second one
    plot(subset.map, mrk.names = TRUE, config = 2)


mmollina/MAPpoly documentation built on March 9, 2024, 2:52 a.m.