QC.mppData | R Documentation |
mppData objects
Perform different operations of quality control (QC) on the marker data of an mppData object.
QC.mppData(
mppData,
mk.miss = 0.1,
gen.miss = 0.25,
n.lim = 15,
MAF.pop.lim = 0.05,
MAF.cr.lim = NULL,
MAF.cr.miss = TRUE,
MAF.cr.lim2 = NULL,
verbose = TRUE,
n.cores = 1
)
mppData |
An object of class |
mk.miss |
|
gen.miss |
|
n.lim |
|
MAF.pop.lim |
|
MAF.cr.lim |
|
MAF.cr.miss |
|
MAF.cr.lim2 |
|
verbose |
|
n.cores |
|
The different operations of the quality control are the following:
Remove markers with more than two alleles.
Remove markers that are monomorphic or fully missing in the parents.
Remove markers with a missing rate higher than mk.miss.
Remove genotypes with a missing marker rate higher than gen.miss.
Remove crosses with fewer than n.lim genotypes.
Keep only the most polymorphic marker when multiple markers map at the same position.
Check marker minor allele frequency (MAF). Different strategies can be used to control marker MAF:
A) A first possibility is to filter markers based on MAF at the whole population level using MAF.pop.lim, and/or on MAF within crosses using MAF.cr.lim.
The user can supply a vector of critical values for the within-cross MAF using MAF.cr.lim. By default, the within-cross MAF limits are defined by the following function of the cross size n.ci: MAF(n.ci) = 0.5 if n.ci ∈ [0, 10] and MAF(n.ci) = (4.5/n.ci) + 0.05 if n.ci > 10. This means that for crosses of up to 10 genotypes, the critical within-cross MAF is set to 50%. It then decreases as the number of genotypes increases, approaching 5%.
If the within-cross MAF is below the limit in at least one cross, then the marker scores of the problematic cross are either set to missing (MAF.cr.miss = TRUE) or the whole marker is discarded (MAF.cr.miss = FALSE). By default, MAF.cr.miss = TRUE, which allows a larger number of markers to be included and a wider genetic diversity to be covered.
B) An alternative is to select only markers that segregate in at least one cross at the MAF.cr.lim2 rate.
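The default within-cross MAF limit described in strategy A can be sketched in a few lines of base R. This is an illustration only: maf_cr_default, cross_sizes and maf_obs are hypothetical names and toy values, not part of the package, and the sketch does not reproduce how QC.mppData computes the limits internally.

```r
# Hypothetical helper (not part of the package): default within-cross MAF
# limit as a function of the cross size n.ci, following the formula above.
maf_cr_default <- function(n.ci) {
  # 0.5 for crosses of up to 10 genotypes, (4.5 / n.ci) + 0.05 otherwise
  ifelse(n.ci <= 10, 0.5, 4.5 / n.ci + 0.05)
}

# Toy cross sizes (assumed values, for illustration only)
cross_sizes <- c(10, 15, 45, 90)
limits <- maf_cr_default(cross_sizes)
print(limits)

# Toy within-cross MAF of one marker in each of the four crosses
maf_obs <- c(0.50, 0.20, 0.30, 0.08)

# TRUE where the marker falls below the limit of that cross
below_limit <- maf_obs < limits
print(below_limit)
```

With these toy values the limits are 0.50, 0.35, 0.15 and 0.10, so the marker falls below the limit in the second and fourth crosses. Under MAF.cr.miss = TRUE its scores would be set to missing in those crosses only; under MAF.cr.miss = FALSE the whole marker would be discarded.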
A filtered mppData object containing the same elements as create.mppData after filtering. It also contains the following new elements:
geno.id |
|
ped.mat |
Four columns |
geno.par.clu |
Parent marker matrix without monomorphic or completely missing markers. |
haplo.map |
Genetic map corresponding to the list of marker of the
|
parents |
List of parents. |
n.cr |
Number of crosses. |
n.par |
Number of parents. |
rem.mk |
Vector of markers that have been removed. |
rem.geno |
Vector of genotypes that have been removed. |
Vincent Garin
create.mppData
data(mppData_init)
mppData <- QC.mppData(mppData = mppData_init, n.lim = 15, MAF.pop.lim = 0.05,
MAF.cr.miss = TRUE, mk.miss = 0.1,
gen.miss = 0.25, verbose = TRUE)