impute: Impute missing data for bi-allelic markers

View source: R/impute.R

impute R Documentation

Description

Imputes missing genotype data for bi-allelic markers in a VCF file.

Usage

impute(
  in.file,
  out.file,
  ploidy,
  method,
  geno,
  min.DP = 1,
  max.missing,
  params = NULL,
  n.core = 1
)
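
A call might look like the sketch below. The file names and the values chosen for ploidy, max.missing and params are placeholders rather than recommendations, and the guard simply skips the call when the polyBreedR package or the input file is not available.

```r
# Placeholder arguments; file names and values are illustrative assumptions.
args <- list(
  in.file = "input.vcf",
  out.file = "imputed.vcf",
  ploidy = 4,
  method = "RF",
  geno = "DS",
  min.DP = 1,
  max.missing = 0.5,
  params = list(ntree = 100, nflank = 100, tol = 0.02),
  n.core = 1
)

# Run only when the package is installed and the input file exists.
if (requireNamespace("polyBreedR", quietly = TRUE) && file.exists(args$in.file)) {
  do.call(polyBreedR::impute, args)
}
```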

Arguments

in.file

VCF input file

out.file

VCF output file

ploidy

ploidy level of the population

method

One of the following: "pop","EM","RF"

geno

One of the following: "GT","DS"

min.DP

genotypes below this depth are set to missing (default=1)

max.missing

remove markers whose proportion of missing genotypes in the population exceeds this threshold

params

list of method-specific parameters

n.core

number of cores for multicore processing (default=1)

Details

Assumes input file is sorted by position. Markers with no genetic variance are removed.
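
The marker filtering implied by min.DP, max.missing and the zero-variance rule can be illustrated on a toy matrix. This is a base R sketch, not the package's code: the matrices, the order of the steps, and the strict comparison against max.missing are assumptions.

```r
# Toy data: 4 markers (rows) x 5 samples (columns); all values are made up.
geno <- matrix(c(0, 1, 2, 4, 3,
                 2, 2, 2, 2, 2,
                 1, NA, 0, 2, 1,
                 4, 3, NA, NA, 0),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("m", 1:4), NULL))
depth <- matrix(c(5, 0, 8, 3, 2,
                  4, 6, 7, 5, 9,
                  3, 0, 6, 2, 4,
                  7, 5, 0, 0, 1),
                nrow = 4, byrow = TRUE)
min.DP <- 1
max.missing <- 0.3

# Genotypes with read depth below min.DP are set to missing.
geno[depth < min.DP] <- NA

# Drop markers whose proportion of missing genotypes exceeds max.missing.
miss <- rowMeans(is.na(geno))
geno <- geno[miss <= max.missing, , drop = FALSE]

# Drop markers with no variance among the observed genotypes.
v <- apply(geno, 1, var, na.rm = TRUE)
geno <- geno[!is.na(v) & v > 0, , drop = FALSE]

rownames(geno)  # "m1" "m3"
```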

method="pop" imputes with the population mean for geno="DS" and population mode for geno="GT".
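
As a rough illustration of the "pop" rule, the sketch below fills missing values marker by marker with the mean or the mode of the observed values. The helper names pop_mode and impute_pop are hypothetical, genotypes are assumed to be coded as allele dosages, and ties in the mode are resolved toward the smallest value, which may differ from the package's behavior.

```r
# Mode of the observed values; table() ignores NA by default.
# Ties go to the smallest value (an assumption of this sketch).
pop_mode <- function(x) {
  tab <- table(x)
  as.numeric(names(tab)[which.max(tab)])
}

# Fill NA in each marker (row) with the population mean ("DS") or mode ("GT").
impute_pop <- function(geno, type = c("DS", "GT")) {
  type <- match.arg(type)
  fill <- if (type == "DS") function(x) mean(x, na.rm = TRUE) else pop_mode
  t(apply(geno, 1, function(x) {
    x[is.na(x)] <- fill(x)
    x
  }))
}

# Toy data: 2 markers x 4 samples.
geno <- matrix(c(0, 1, 1, NA,
                 4, NA, 2, 4),
               nrow = 2, byrow = TRUE)
ds <- impute_pop(geno, "DS")  # missing values become the marker means
gt <- impute_pop(geno, "GT")  # missing values become the marker modes
```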

method="EM" uses parameter "tol" (default is 0.02, see rrBLUP A.mat documentation). Imputed values are truncated if needed to fall in the interval [0,ploidy].
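
The truncation step can be pictured with a few made-up values on the dosage scale. The vector below is hypothetical and does not come from an actual EM run.

```r
ploidy <- 4

# Hypothetical imputed dosages, some outside the valid range.
imputed <- c(-0.15, 0.8, 2.4, 4.3)

# Clamp to the interval [0, ploidy].
truncated <- pmin(pmax(imputed, 0), ploidy)
```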

method="RF" uses parameters "ntree" (default 100) for number of trees and "nflank" (default 100) for the number of flanking markers (on each side) to use as predictors. Because RF first uses EM to generate a complete dataset, parameter "tol" is also recognized.
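
To make the role of "nflank" concrete, the sketch below lists the predictor markers for a target marker in a position-sorted set. The helper flank_index is hypothetical, and clipping the window at the ends of the marker set is an assumption; how the package handles the ends of the marker set or chromosome boundaries is not stated here.

```r
# Indices of up to nflank markers on each side of marker i, out of m markers.
# The window is clipped at the first and last marker (an assumption).
flank_index <- function(i, m, nflank = 100) {
  lo <- max(1, i - nflank)
  hi <- min(m, i + nflank)
  setdiff(lo:hi, i)
}

first  <- flank_index(1, 10, nflank = 3)   # only right-hand neighbors exist
middle <- flank_index(5, 10, nflank = 3)   # three neighbors on each side
last   <- flank_index(10, 10, nflank = 3)  # only left-hand neighbors exist
```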


jendelman/polyBreedR documentation built on Jan. 5, 2025, 12:13 a.m.