impute_L2H | R Documentation |
Impute from low to high density markers by Random Forest
impute_L2H(
  high.file,
  low.file,
  out.file = NULL,
  params = list(),
  exclude = NULL,
  n.core = 1
)
high.file | name of high density file |
low.file | name of low density file |
out.file | name of CSV output file for imputed data |
params | list of parameters (see Details) |
exclude | optional, vector of high density samples to exclude |
n.core | multicore processing |
Argument params is a list with the following options: format, model, n.tree, n.mark. Option format can have values "GT" (integer dosage) or "DS" (real numbers between 0 and ploidy). Option model can be "class" for classification or "regress" for regression when "GT" is used; for "DS" format, only regression is permitted. Option n.tree is the number of trees (default = 100). Option n.mark is the number of markers to use as predictors (default = 100), chosen based on minimum distance to the target.
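As an illustration of the options described above, here is a minimal sketch of how a params list might be assembled and checked before it is passed to impute_L2H. The helper make_params is hypothetical and not part of the package; it only encodes the stated rule that "DS" format permits regression only. The defaults chosen here for format and model are assumptions for the sketch; only the defaults for n.tree and n.mark are documented.

```r
# Hypothetical helper (not part of the package): build and check a params list.
make_params <- function(format = "GT", model = "class",
                        n.tree = 100, n.mark = 100) {
  format <- match.arg(format, c("GT", "DS"))
  model <- match.arg(model, c("class", "regress"))
  # Documented constraint: "DS" (real-valued dosage) allows regression only.
  if (format == "DS" && model == "class") {
    stop('for "DS" format, only model = "regress" is permitted')
  }
  list(format = format, model = model, n.tree = n.tree, n.mark = n.mark)
}

params <- make_params(format = "DS", model = "regress", n.tree = 200)
str(params)

# The list would then be passed on, e.g. (file names are placeholders):
# impute_L2H(high.file = "high.csv", low.file = "low.csv",
#            out.file = "imputed.csv", params = params, n.core = 2)
```

Validating the combination up front is a design choice of this sketch; the function itself may handle an invalid combination differently.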
The exclude argument is useful for cross-validation.
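One way the exclude argument could support cross-validation is sketched below: high density samples are split into folds, and each fold in turn is withheld from the high density data. The sample names, the number of folds, and the file names are all placeholders, and the workflow in the comments (comparing imputed values for withheld samples against their observed high density genotypes) is an assumption, not something this page specifies.

```r
set.seed(1)

# Placeholder sample names; in practice these would come from the high density file.
samples <- paste0("S", 1:20)
k <- 5

# Randomly assign each sample to one of k folds of equal size.
fold <- sample(rep(seq_len(k), length.out = length(samples)))
folds <- split(samples, fold)

for (i in seq_len(k)) {
  held.out <- folds[[i]]
  # Assumed usage: withhold this fold from the high density data, then compare
  # the imputed values for these samples with their observed genotypes.
  # impute_L2H(high.file = "high.csv", low.file = "low.csv",
  #            out.file = paste0("imputed_fold", i, ".csv"),
  #            exclude = held.out)
}
```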
Both VCF and CSV are allowable input file formats; they are recognized based on the file extension. For CSV, the first three columns should be marker, chrom, pos. The output file is CSV.
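The CSV layout described above can be sketched with base R. The marker names, positions, and dosage values are made up, and the assumption that each column after the first three holds one sample is not stated on this page.

```r
# Toy genotype table: marker, chrom, pos, then (assumed) one column per sample.
geno <- data.frame(
  marker = c("m1", "m2", "m3"),
  chrom = c("chr01", "chr01", "chr02"),
  pos = c(1050L, 20431L, 877L),
  S1 = c(0L, 2L, 4L),
  S2 = c(1L, NA, 3L),   # NA marks a missing genotype
  S3 = c(0L, 2L, 4L)
)

# The ".csv" extension matters because the input format is recognized from it.
csv.file <- tempfile(fileext = ".csv")
write.csv(geno, csv.file, row.names = FALSE)

# Read the file back and confirm the required leading columns.
check <- read.csv(csv.file)
stopifnot(identical(names(check)[1:3], c("marker", "chrom", "pos")))
```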
Any missing data are imputed separately for each input file at the outset, using the population mean (regress) or mode (class) for each marker.
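The per-marker mean and mode imputation described above can be illustrated with a small base R sketch. This is not the package's own code; the matrix orientation (markers in rows, samples in columns), the toy dosages, and the tie-breaking rule for the mode are assumptions.

```r
# Toy dosage matrix: markers in rows, samples in columns, NA = missing.
geno <- rbind(
  m1 = c(0, 1, NA, 1),
  m2 = c(4, NA, 4, 3),
  m3 = c(2, 2, 2, NA)
)

# Replace missing values with the marker mean (as for the regression model).
impute_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

# Replace missing values with the most frequent value (as for classification).
# table() ignores NA by default; ties resolve to the smallest dosage here.
impute_mode <- function(x) {
  counts <- table(x)
  x[is.na(x)] <- as.numeric(names(counts)[which.max(counts)])
  x
}

# apply() over rows returns results as columns, so transpose back.
by.mean <- t(apply(geno, 1, impute_mean))
by.mode <- t(apply(geno, 1, impute_mode))
```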
Matrix of out-of-bag (OOB) error with dimensions markers x trees. For the regression model, the error is the mean squared error (MSE).