calc_genoprob_fst: Calculate conditional genotype probabilities and write to fst...

calc_genoprob_fst    R Documentation

Calculate conditional genotype probabilities and write to fst database

Description

Uses a hidden Markov model to calculate the probabilities of the true underlying genotypes given the observed multipoint marker data, with possible allowance for genotyping errors.
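To make the hidden Markov model idea concrete, here is a minimal base-R sketch of a forward-backward pass for one backcross individual on one chromosome. It is only an illustration under stated assumptions, not the code used by qtl2 or qtl2fst: the function name toy_genoprob, the two-genotype coding (1 = AA, 2 = AB), and the simple symmetric error model are all hypothetical choices made for this example.

```r
# Toy forward-backward pass (hypothetical helper, not part of qtl2/qtl2fst).
# obs:        observed genotypes at each marker (1 = AA, 2 = AB, NA = missing)
# rec_frac:   recombination fractions between adjacent markers
# error_prob: assumed genotyping error probability
toy_genoprob <- function(obs, rec_frac, error_prob = 1e-4) {
  n_mar <- length(obs)
  stopifnot(length(rec_frac) == n_mar - 1)
  n_gen <- 2

  # Emission probability: P(observed genotype | true genotype)
  emit <- function(o, g) {
    if (is.na(o)) return(1)  # missing data carry no information
    if (o == g) 1 - error_prob else error_prob
  }
  # Transition matrix between adjacent markers for recombination fraction r
  trans <- function(r) matrix(c(1 - r, r, r, 1 - r), nrow = 2)

  # n_gen x n_mar matrix of emission probabilities
  e <- sapply(obs, function(o) sapply(seq_len(n_gen), function(g) emit(o, g)))
  e <- matrix(e, nrow = n_gen)

  alpha <- beta <- matrix(0, n_gen, n_mar)
  alpha[, 1] <- 0.5 * e[, 1]  # backcross: AA and AB equally likely a priori
  for (j in seq_len(n_mar - 1)) {
    alpha[, j + 1] <- (t(trans(rec_frac[j])) %*% alpha[, j]) * e[, j + 1]
  }
  beta[, n_mar] <- 1
  for (j in rev(seq_len(n_mar - 1))) {
    beta[, j] <- trans(rec_frac[j]) %*% (e[, j + 1] * beta[, j + 1])
  }

  # Posterior genotype probabilities: rows are markers, columns are genotypes
  post <- alpha * beta
  out <- t(post) / colSums(post)
  colnames(out) <- c("AA", "AB")
  out
}

# Four markers, the second one missing, 10% recombination between neighbours
pr <- toy_genoprob(c(1, NA, 1, 2), rec_frac = c(0.1, 0.1, 0.1), error_prob = 0.002)
print(round(pr, 4))
```

Each row of the result sums to 1, and the missing marker borrows information from its neighbours. A production implementation would also rescale or work on the log scale to avoid underflow on long chromosomes; this sketch omits that for brevity.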

Usage

calc_genoprob_fst(
  cross,
  fbase,
  fdir = ".",
  map = NULL,
  error_prob = 0.0001,
  map_function = c("haldane", "kosambi", "c-f", "morgan"),
  lowmem = FALSE,
  quiet = TRUE,
  cores = 1,
  compress = 0,
  overwrite = FALSE
)

Arguments

cross

Object of class "cross2". For details, see the R/qtl2 developer guide.

fbase

Base of filename for fst database.

fdir

Directory for fst database.

map

Genetic map of markers. May include pseudomarker locations (that is, locations that are not within the marker genotype data). If NULL, the genetic map in cross is used.

error_prob

Assumed genotyping error probability.

map_function

Character string indicating the map function to use to convert genetic distances to recombination fractions.

lowmem

If FALSE, split individuals into groups with common sex and crossinfo and then precalculate the transition matrices for a chromosome; this is potentially much faster but uses more memory.

quiet

If FALSE, print progress messages.

cores

Number of CPU cores to use for parallel calculations. If 0, the number of cores is taken from parallel::detectCores(). Alternatively, this can be a set of cluster sockets, as produced by parallel::makeCluster().

compress

Amount of compression to use, as a value in the range 0-100; lower values mean larger file sizes.

overwrite

If FALSE (the default), refuse to overwrite any files that already exist.
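As an illustration of what the map_function argument controls, the following base-R sketch converts a genetic distance to a recombination fraction under each of the four named map functions. These are the standard textbook formulas, written with the assumption that distances are in centiMorgans; the helper names (mf_haldane and so on) are made up for this example and are not functions exported by qtl2 or qtl2fst.

```r
# Hypothetical helpers: genetic distance d (in cM) -> recombination fraction

# Haldane: no crossover interference
mf_haldane <- function(d) 0.5 * (1 - exp(-d / 50))

# Kosambi: moderate interference
mf_kosambi <- function(d) 0.5 * tanh(d / 50)

# Morgan: complete interference, capped at 0.5
mf_morgan <- function(d) pmin(d / 100, 0.5)

# Carter-Falconer is defined in the other direction (r -> d),
# so the d -> r conversion is obtained by numerical inversion.
imf_cf <- function(r) {
  12.5 * (log(1 + 2 * r) - log(1 - 2 * r)) + 25 * atan(2 * r)
}
mf_cf <- function(d) {
  upper <- 0.5 - 1e-12
  sapply(d, function(di) {
    if (di <= 0) return(0)
    if (di >= imf_cf(upper)) return(0.5)
    uniroot(function(r) imf_cf(r) - di,
            lower = 0, upper = upper, tol = 1e-10)$root
  })
}

d <- c(1, 10, 50)
print(data.frame(
  cM      = d,
  haldane = mf_haldane(d),
  kosambi = mf_kosambi(d),
  c_f     = mf_cf(d),
  morgan  = mf_morgan(d)
))
```

For short distances the four functions nearly agree; they diverge as the distance grows, which is why the choice matters most for sparse maps.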

Details

This is like calling qtl2::calc_genoprob() and then fst_genoprob(), but it is intended to use less memory by processing one chromosome at a time.
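For comparison, the two-step route mentioned above might look like the following sketch. It assumes that fst_genoprob() takes the probabilities followed by the same fbase and fdir arguments as calc_genoprob_fst(); the directory name grav2_two_step is made up for the example, and the code is skipped if qtl2 or qtl2fst is not installed.

```r
# Sketch of the two-step alternative: compute all probabilities in memory,
# then write them to an fst database. Argument order for fst_genoprob()
# is an assumption; check its help page before relying on it.
if (requireNamespace("qtl2", quietly = TRUE) &&
    requireNamespace("qtl2fst", quietly = TRUE)) {
  library(qtl2)
  library(qtl2fst)

  grav2 <- read_cross2(system.file("extdata", "grav2.zip", package = "qtl2"))

  # Step 1: probabilities for every chromosome are held in memory at once
  probs <- calc_genoprob(grav2, error_prob = 0.002)

  # Step 2: write them out to the fst database
  fst_dir <- file.path(tempdir(), "grav2_two_step")
  dir.create(fst_dir)
  probs_fst <- fst_genoprob(probs, "grav2", fst_dir)

  # Record the files that were written, then clean up
  files <- fst_files(probs_fst)
  unlink(files)
}
```

The difference from calc_genoprob_fst() is in step 1: here the full set of probabilities exists in memory before anything is written.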

Value

A list containing the attributes of genoprob and the address for the created fst database. Components are:

  • dim - List of all dimensions of 3-D arrays.

  • dimnames - List of all dimension names of 3-D arrays.

  • is_x_chr - Vector of all is_x_chr attributes.

  • chr - Vector of (subset of) chromosome names for this object.

  • ind - Vector of (subset of) individual names for this object.

  • mar - Vector of (subset of) marker names for this object.

  • fst - Path and base of file names for the fst database.
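A quick way to look at these components is to strip the class from the returned object and inspect the underlying list. This is a sketch that assumes the components are stored under the names listed above; the directory name grav2_inspect is made up for the example, and the code is skipped if qtl2 or qtl2fst is not installed.

```r
# Sketch: inspect the list behind the returned object.
if (requireNamespace("qtl2", quietly = TRUE) &&
    requireNamespace("qtl2fst", quietly = TRUE)) {
  library(qtl2)
  library(qtl2fst)

  grav2 <- read_cross2(system.file("extdata", "grav2.zip", package = "qtl2"))
  fst_dir <- file.path(tempdir(), "grav2_inspect")
  dir.create(fst_dir)
  probs_fst <- calc_genoprob_fst(grav2, "grav2", fst_dir, error_prob = 0.002)

  # unclass() drops the class so the plain list is shown,
  # bypassing any methods defined for the object
  parts <- unclass(probs_fst)
  str(parts, max.level = 1)

  # Component names as documented above (assumed to match the stored names)
  print(parts$chr)
  print(parts$fst)

  # Clean up: remove the files we created
  unlink(fst_files(probs_fst))
}
```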

See Also

qtl2::calc_genoprob(), fst_genoprob()

Examples

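library(qtl2)
library(qtl2fst)

# read the example cross and insert pseudomarkers at 1 cM steps
grav2 <- read_cross2(system.file("extdata", "grav2.zip", package="qtl2"))
gmap_w_pmar <- insert_pseudomarkers(grav2$gmap, step=1)

# create a temporary directory to hold the fst database
fst_dir <- file.path(tempdir(), "grav2_genoprob")
dir.create(fst_dir)

# calculate genotype probabilities, writing them to the fst database
probs_fst <- calc_genoprob_fst(grav2, "grav2", fst_dir, gmap_w_pmar, error_prob=0.002)

# clean up: remove all the files we created
unlink(fst_files(probs_fst))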

qtl2fst documentation built on Sept. 11, 2024, 5:31 p.m.