find_blocks: Allocate markers into linkage blocks

View source: R/find_blocks.R

find_blocks R Documentation

Allocate markers into linkage blocks

Description

Allocates markers into linkage blocks. This is an EXPERIMENTAL function and should be used with caution.

Usage

find_blocks(
  input.seq,
  clustering.type = c("rf", "genome"),
  rf.limit = 1e-04,
  genome.block.threshold = 10000,
  rf.mat = NULL,
  ncpus = 1,
  ph.thres = 3,
  phase.number.limit = 10,
  error = 0.05,
  verbose = TRUE,
  tol = 0.01,
  tol.err = 0.001
)

Arguments

input.seq

an object of class mappoly.sequence.

clustering.type

if 'rf', it uses UPGMA clustering based on the recombination fraction matrix to assemble blocks. Linkage blocks are assembled by cutting the clustering tree at rf.limit. If 'genome', it splits the marker sequence wherever neighboring markers are more than 'genome.block.threshold' apart.

rf.limit

the maximum recombination fraction at which markers are considered linked. Used when clustering.type = 'rf'.
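
The 'rf' strategy can be pictured with base R alone. The sketch below is not MAPPoly's internal code: it builds a small, made-up recombination fraction matrix (marker names and values are hypothetical), runs average-linkage (UPGMA) clustering with stats::hclust, and cuts the tree at a height that plays the role of rf.limit. Treating single-marker clusters as unassigned is an assumption of this sketch.

```r
# Hypothetical symmetric recombination fraction matrix for five markers.
rf <- matrix(0.30, nrow = 5, ncol = 5,
             dimnames = list(paste0("M", 1:5), paste0("M", 1:5)))
diag(rf) <- 0
rf["M1", "M2"] <- rf["M2", "M1"] <- 0.00001
rf["M4", "M5"] <- rf["M5", "M4"] <- 0.00005

rf.limit <- 1e-04  # same value as the find_blocks() default

# UPGMA corresponds to method = "average" in hclust().
tree <- hclust(as.dist(rf), method = "average")

# Cut the tree at height rf.limit to obtain cluster memberships.
groups <- cutree(tree, h = rf.limit)

# Clusters with more than one marker are candidate blocks;
# singletons are reported separately (an assumption of this sketch).
blocks <- Filter(function(x) length(x) > 1, split(names(groups), groups))
unlinked <- names(groups)[!as.character(groups) %in% names(blocks)]
print(blocks)
print(unlinked)
```

With these toy values, M1 and M2 fall into one group and M4 and M5 into another, while M3 stays unassigned. A larger rf.limit would merge more markers into each block.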

genome.block.threshold

the maximum distance between neighboring markers for them to be assumed to be in the same linkage block. Used when allocating markers into blocks with clustering.type = 'genome'.
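
The 'genome' strategy amounts to breaking the ordered marker sequence wherever the gap between neighbors exceeds the threshold. A minimal base R sketch follows, using made-up positions rather than MAPPoly objects. Two points are assumptions of the sketch, not documented behavior: a gap exactly equal to the threshold does not start a new group, and single-marker groups are reported as unassigned.

```r
# Hypothetical genome positions, already sorted along the chromosome.
pos <- c(M1 = 1000, M2 = 4000, M3 = 9000, M4 = 50000, M5 = 52000, M6 = 90000)
genome.block.threshold <- 10000  # same value as the find_blocks() default

# A new group starts whenever the gap to the previous marker
# exceeds the threshold (strict "greater than" is an assumption).
gap <- diff(pos)
group.id <- cumsum(c(TRUE, gap > genome.block.threshold))

groups <- split(names(pos), group.id)

# Multi-marker groups are candidate blocks; singletons are set aside.
blocks <- Filter(function(x) length(x) > 1, groups)
unlinked <- unlist(Filter(function(x) length(x) == 1, groups), use.names = FALSE)
print(blocks)
print(unlinked)
```

Here the gaps M3 to M4 and M5 to M6 exceed 10000, so the sequence splits into (M1, M2, M3), (M4, M5), and the lone marker M6.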

rf.mat

an object of class mappoly.rf.matrix.

ncpus

Number of parallel processes to spawn

ph.thres

the threshold used to sequentially phase markers. It is used for both thres.twopt and thres.hmm. See est_rf_hmm_sequential for details.

phase.number.limit

the maximum number of linkage phases of the sub-maps. The default is 10. See est_rf_hmm_sequential for details.

error

the assumed global genotyping error rate. The default is 0.05. If NULL, no genotyping error is included in the block estimation.

verbose

if TRUE (default), the current progress is shown; if FALSE, no output is produced.

tol

tolerance for the C routine, i.e., the value used to evaluate convergence.

tol.err

tolerance for the C routine, i.e., the value used to evaluate convergence when the global genotyping error is included in the model.

Value

a list containing: (1) a list of blocks in the form of mappoly.map objects; (2) a vector of the markers that were not included in any block.

Author(s)

Marcelo Mollinari, mmollin@ncsu.edu

Examples

  ## Not run: 
  ## Selecting 50 markers in chromosome 5
  s5 <- make_seq_mappoly(tetra.solcap, "seq5")
  s5 <- make_seq_mappoly(tetra.solcap, s5$seq.mrk.names[1:50])
  tpt5 <- est_pairwise_rf(s5)
  m5 <- rf_list_to_matrix(tpt5, thresh.LOD.ph = 3, thresh.LOD.rf = 3)
  fb.rf <- find_blocks(s5, rf.mat = m5, verbose = FALSE, ncpus = 2)
  bl.rf <- fb.rf$blocks
  plot_map_list(bl.rf)
  
  ## Merging resulting maps
  map.merge <- merge_maps(bl.rf, tpt5)
  plot(map.merge, mrk.names = TRUE)
  
  ## Comparing linkage phases with pre assembled map
  id <- na.omit(match(map.merge$info$mrk.names, solcap.err.map[[5]]$info$mrk.names))
  map.orig <- get_submap(solcap.err.map[[5]], mrk.pos = id)
  p1.m <- map.merge$maps[[1]]$seq.ph$P
  p2.m <- map.merge$maps[[1]]$seq.ph$Q
  names(p1.m) <- names(p2.m) <- map.merge$info$mrk.names
  p1.o <- map.orig$maps[[1]]$seq.ph$P
  p2.o <- map.orig$maps[[1]]$seq.ph$Q
  names(p1.o) <- names(p2.o) <- map.orig$info$mrk.names
  n <- intersect(names(p1.m), names(p1.o))
  plot_compare_haplotypes(4, p1.o[n], p2.o[n], p1.m[n], p2.m[n])
  
  ### Using genome
  fb.geno <- find_blocks(s5, clustering.type = "genome", genome.block.threshold = 10^4)
  plot_map_list(fb.geno$blocks)
  splt <- lapply(fb.geno$blocks, split_mappoly, 1)
  plot_map_list(splt)

## End(Not run)

mmollina/MAPPoly documentation built on March 8, 2024, 2:04 a.m.