est_pairwise_rf: Pairwise two-point analysis

View source: R/pairwise_rf.R

est_pairwise_rfR Documentation

Pairwise two-point analysis

Description

Performs the two-point pairwise analysis between all markers in a sequence. For each pair, the function estimates the recombination fraction for all possible linkage phase configurations and associated LOD Scores.

Usage

est_pairwise_rf(
  input.seq,
  count.cache = NULL,
  count.matrix = NULL,
  ncpus = 1L,
  mrk.pairs = NULL,
  n.batches = 1L,
  est.type = c("disc", "prob"),
  verbose = TRUE,
  memory.warning = TRUE,
  parallelization.type = c("PSOCK", "FORK"),
  tol = .Machine$double.eps^0.25,
  ll = FALSE
)

Arguments

input.seq

an object of class mappoly.sequence

count.cache

an object of class cache.info containing pre-computed genotype frequencies, obtained with cache_counts_twopt. If NULL (default), genotype frequencies are internally loaded.

count.matrix

similar to count.cache, but in matrix format. Mostly for internal use.

ncpus

Number of parallel processes (cores) to spawn (default = 1)

mrk.pairs

a matrix of dimensions 2*N, containing N pairs of markers to be analyzed. If NULL (default), all pairs are considered

n.batches

deprecated. Not available on MAPpoly 0.3.0 or higher

est.type

Indicates whether to use the discrete ("disc") or the probabilistic ("prob") dosage scoring when estimating the two-point recombination fractions.

verbose

If TRUE (default), current progress is shown; if FALSE, no output is produced

memory.warning

if TRUE, prints a memory warning if the number of markers is greater than 10000 for ploidy levels up to 4, and 3000 for ploidy levels > 4.

parallelization.type

one of the supported cluster types. This should be either PSOCK (default) or FORK.

tol

the desired accuracy. See optimize() for details

ll

will return log-likelihood instead of LOD scores. (for internal use)

Value

An object of class mappoly.twopt which is a list containing the following components:

data.name

Name of the object of class mappoly.data containing the raw data.

n.mrk

Number of markers in the sequence.

seq.num

A vector containing the (ordered) indices of markers in the sequence, according to the input file.

pairwise

A list of size choose(length(input.seq$seq.num), 2), where each element is a matrix. The rows are named in the format x-y, where x and y indicate how many homologues share the same allelic variant in parents P and Q, respectively (see Mollinari and Garcia, 2019 for notation). The first column indicates the LOD Score for the most likely linkage phase configuration. The second column shows the estimated recombination fraction for each configuration, and the third column indicates the LOD Score for comparing the likelihood under no linkage (r = 0.5) with the estimated recombination fraction (evidence of linkage).

chisq.pval.thres

Threshold used to perform the segregation tests.

chisq.pval

P-values associated with the performed segregation tests.

Author(s)

Marcelo Mollinari, mmollin@ncsu.edu

References

Mollinari, M., and Garcia, A. A. F. (2019) Linkage analysis and haplotype phasing in experimental autopolyploid populations with high ploidy level using hidden Markov models, _G3: Genes, Genomes, Genetics_. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1534/g3.119.400378")}

Examples

  ## Tetraploid example (first 50 markers) 
  all.mrk <- make_seq_mappoly(tetra.solcap, 1:50)
  red.mrk <- elim_redundant(all.mrk)
  unique.mrks <- make_seq_mappoly(red.mrk)
  all.pairs <- est_pairwise_rf(input.seq = unique.mrks,
                               ncpus = 1, 
                               verbose = TRUE)
   all.pairs
   plot(all.pairs, 20, 21)
   mat <- rf_list_to_matrix(all.pairs)
   plot(mat)

mmollina/MAPpoly documentation built on March 9, 2024, 2:52 a.m.