fill.geno: Fill holes in genotype data


fill.geno    R Documentation

Fill holes in genotype data

Description

Replace the genotype data for a cross with a version imputed by simulation with sim.geno, by the Viterbi algorithm with argmax.geno, by filling in genotypes between markers that have matching genotypes, or by taking the genotype with maximal marginal probability at each marker.

Usage

fill.geno(cross, method=c("imp","argmax", "no_dbl_XO", "maxmarginal"),
          error.prob=0.0001,
          map.function=c("haldane","kosambi","c-f","morgan"),
          min.prob=0.95)

Arguments

cross

An object of class cross. See read.cross for details.

method

Indicates whether to impute using a single simulation replicate from sim.geno ("imp"), using the Viterbi algorithm as implemented in argmax.geno ("argmax"), by simply filling in missing genotypes between markers with matching genotypes ("no_dbl_XO"), or by choosing, at each marker, the genotype with maximal marginal probability ("maxmarginal").

error.prob

Assumed genotyping error rate used in the calculation of the penetrance Pr(observed genotype | true genotype).

map.function

Indicates whether to use the Haldane, Kosambi, Carter-Falconer, or Morgan map function when converting genetic distances into recombination fractions.

min.prob

For method="maxmarginal", genotypes with probability greater than this value will be imputed; those less than this value will be made missing.
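To make the map.function argument more concrete, the following is a minimal sketch of how a map function turns a genetic distance (in cM) into a recombination fraction. The helper names haldane_rf, kosambi_rf and morgan_rf are hypothetical and written here only for illustration; they are not the functions that fill.geno calls internally, and the Carter-Falconer function is omitted because it has no closed-form expression in this direction.

```r
# Hypothetical helpers (not part of R/qtl's interface), using the
# standard textbook formulas with distance d given in centiMorgans.
haldane_rf <- function(d_cM) 0.5 * (1 - exp(-d_cM / 50))  # no interference
kosambi_rf <- function(d_cM) 0.5 * tanh(d_cM / 50)        # moderate interference
morgan_rf  <- function(d_cM) pmin(d_cM / 100, 0.5)        # complete interference, capped at 1/2

# Compare the three at a few distances
d <- c(0, 10, 50, 200)
rf <- data.frame(cM      = d,
                 haldane = haldane_rf(d),
                 kosambi = kosambi_rf(d),
                 morgan  = morgan_rf(d))
print(rf)
```

For short distances the three functions nearly agree; they diverge as the distance grows, which is why the choice can matter for sparse marker maps.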

Details

This function is written so that one may perform rough genome scans by marker regression without having to drop individuals with missing genotype data. We must caution the user that little trust should be placed in the results.

With method="imp", a single random imputation is performed, using sim.geno.

With method="argmax", for each individual the most probable sequence of genotypes, given the observed data (via argmax.geno), is used.

With method="no_dbl_XO", non-recombinant intervals are filled in; recombinant intervals are left missing. For example, a sequence of genotypes like A---A---H---H---A (with A and H corresponding to genotypes AA and AB, respectively, and with - being a missing value) will be filled in as AAAAA---HHHHH---A.
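The rule just described can be sketched in a few lines of base R. This is an illustrative re-implementation on a single character vector, not the code used by fill.geno; the helper name fill_no_dbl_xo is hypothetical, and it assumes that missing values before the first or after the last observed genotype are left as they are.

```r
# Hypothetical helper: fill runs of missing values that are flanked
# by identical observed genotypes; leave all other gaps untouched.
fill_no_dbl_xo <- function(g) {
  obs <- which(!is.na(g))              # positions of observed genotypes
  if (length(obs) < 2) return(g)       # nothing to bridge
  for (i in seq_len(length(obs) - 1)) {
    left  <- obs[i]
    right <- obs[i + 1]
    # Fill only if there is a gap and the flanking genotypes match
    if (right - left > 1 && g[left] == g[right]) {
      g[(left + 1):(right - 1)] <- g[left]
    }
  }
  g
}

# The sequence from the text, with "-" standing for a missing genotype
g <- strsplit("A---A---H---H---A", "")[[1]]
g[g == "-"] <- NA
filled <- fill_no_dbl_xo(g)
filled[is.na(filled)] <- "-"
result <- paste(filled, collapse = "")
print(result)   # "AAAAA---HHHHH---A"
```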

With method="maxmarginal", the conditional genotype probabilities are calculated with calc.genoprob, and then at each marker, the most probable genotype is determined. This is taken as the imputed genotype if it has probability greater than min.prob; otherwise it is made missing.
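The thresholding step can be illustrated on a small, made-up probability matrix. This is a sketch only: the helper impute_maxmarginal and the example probabilities are hypothetical, and the treatment of a probability exactly equal to min.prob (made missing here) is an assumption, since the text specifies only the "greater than" and "less than" cases.

```r
# Hypothetical helper: prob has one row per marker and one column per
# genotype; returns the index of the most probable genotype, or NA.
impute_maxmarginal <- function(prob, min.prob = 0.95) {
  best   <- apply(prob, 1, which.max)  # most probable genotype at each marker
  best_p <- apply(prob, 1, max)        # its probability
  best[best_p <= min.prob] <- NA       # assumption: ties with min.prob become missing
  best
}

# Made-up conditional genotype probabilities for three markers
prob <- rbind(m1 = c(0.99, 0.01),
              m2 = c(0.60, 0.40),
              m3 = c(0.02, 0.98))
res <- impute_maxmarginal(prob)
print(res)
```

Marker m2 is left missing because neither genotype reaches the threshold, which is how an observed but poorly supported genotype can end up missing under this method.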

With method="no_dbl_XO" and method="maxmarginal", some missing genotypes likely remain. With method="maxmarginal", some observed genotypes may be made missing.
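One way to see how much missing data remains is to count NA entries in the genotype matrix before and after imputation. The sketch below assumes the qtl package is installed and uses its hyper example data together with pull.geno; the helper count_missing is hypothetical.

```r
library(qtl)
data(hyper)

# Hypothetical helper: number of missing genotypes across all markers
count_missing <- function(cross) sum(is.na(pull.geno(cross)))

n_before <- count_missing(hyper)
n_no_dbl <- count_missing(fill.geno(hyper, method = "no_dbl_XO"))
n_argmax <- count_missing(fill.geno(hyper, method = "argmax"))

# Compare the counts; "no_dbl_XO" is expected to leave some gaps
print(c(before = n_before, no_dbl_XO = n_no_dbl, argmax = n_argmax))
```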

Value

The input cross object with the genotype data replaced by an imputed version. Any intermediate calculations (such as those produced by calc.genoprob, argmax.geno and sim.geno) are removed.

Author(s)

Karl W Broman, broman@wisc.edu

See Also

sim.geno, argmax.geno

Examples

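library(qtl)
data(hyper)

# Impute with the Viterbi algorithm, then run a two-dimensional scan by marker regression
out.mr <- scantwo(fill.geno(hyper, method="argmax"), method="mr")
plot(out.mr)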

qtl documentation built on Sept. 11, 2024, 5:43 p.m.