make.main | R Documentation |
This function constructs a design matrix of main effects from the genotypic data of genetic markers. The genotypic data can include missing values. Different genetic models can be used to transform the three-level (or, for a backcross, two-level) genotypic data into main-effect predictors.
make.main(geno, model = c("Cockerham", "codominant", "additive", "dominant", "recessive", "overdominant"),
fill.missing = TRUE, ind.group = NULL, geno.order = TRUE,
loci.names = c("marker", "position"), imprint = TRUE, verbose = FALSE, ...)
geno |
For human association data, it is a matrix or data frame of genotypes with dimension |
model |
a genetic model to construct main-effect predictors. |
fill.missing |
logical. If |
ind.group |
a vector of length |
geno.order |
logical. If |
loci.names |
the way to name main-effect predictors; use marker names or chromosome positions. |
imprint |
logical. Indicates whether to consider imprinting effects. |
verbose |
logical. If |
This function provides different genetic models to code the main-effect predictors.
Denote the common homozygote (i.e., the homozygote with the higher frequency), the heterozygote, and the rare homozygote for each SNP by c, h, and r, respectively.
The Cockerham model defines two main effects for each SNP (with suffix 'a' and 'd'): an additive predictor equal to -1, 0, and 1 for c, h, and r, and a dominance predictor equal to -0.5 for c and r and 0.5 for h.
The codominant model also introduces two main effects for each SNP (with suffix 'r' and 'h'). The two main-effect predictors are indicator variables with the common homozygote c chosen as the reference group: 'r' and 'h' are indicators for the rare homozygote and the heterozygote, respectively.
The additive model defines a main-effect predictor for each SNP, equal to 0, 1, 2 for c, h, r, respectively.
The dominant model defines a main-effect predictor for each SNP, equal to 1 for r and h, and 0 for c.
The recessive model defines a main-effect predictor for each SNP, equal to 1 for r, and 0 for h and c.
The overdominant model defines a main-effect predictor for each SNP, equal to 1 for h, and 0 for c and r.
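The codings above can be sketched in base R as a lookup on the genotype labels. This is only an illustration of the definitions in this section, not the code used inside make.main: the helper name code_main_effects, the column name x, and the assumption that genotypes are already labelled "c", "h", and "r" with no missing values are all hypothetical.

```r
# Hypothetical helper (not part of BhGLM): codes genotype labels
# "c" (common homozygote), "h" (heterozygote), "r" (rare homozygote)
# into main-effect predictors under the six models described above.
# Assumes no missing genotypes.
code_main_effects <- function(g, model = c("Cockerham", "codominant", "additive",
                                           "dominant", "recessive", "overdominant")) {
  model <- match.arg(model)
  stopifnot(all(g %in% c("c", "h", "r")))
  # Lookup tables for the models that are not plain indicators
  additive_ck <- c(c = -1,   h = 0,   r = 1)     # Cockerham additive ('a')
  dominance   <- c(c = -0.5, h = 0.5, r = -0.5)  # Cockerham dominance ('d')
  additive    <- c(c = 0,    h = 1,   r = 2)     # additive model
  switch(model,
    Cockerham    = cbind(a = unname(additive_ck[g]), d = unname(dominance[g])),
    codominant   = cbind(r = as.numeric(g == "r"), h = as.numeric(g == "h")),
    additive     = cbind(x = unname(additive[g])),
    dominant     = cbind(x = as.numeric(g != "c")),
    recessive    = cbind(x = as.numeric(g == "r")),
    overdominant = cbind(x = as.numeric(g == "h"))
  )
}

g <- c("c", "h", "r", "h")
code_main_effects(g, "Cockerham")   # two columns: 'a' and 'd'
code_main_effects(g, "additive")    # one column
```

The two-column models return one column per suffix, which mirrors how the text describes two main effects per SNP for the Cockerham and codominant models.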
For missing genotypes, we first calculate the genotypic probabilities of missing genotypes conditioning on the observed marker data, and then use these conditional probabilities to construct the main-effect predictors as above.
For QTL mapping in experimental crosses, we use the multipoint method as implemented in R/qtl (see calc.genoprob) and R/qtlbim (see qb.genoprob).
For human association data, we simply replace missing genotypes by their expected values (i.e., dosages) based only on the observed genotypes for that marker.
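As a rough illustration of the expected-value replacement for association data, the sketch below fills missing entries of one additive-coded marker with the dosage implied by the observed genotype frequencies. The helper name fill_expected is hypothetical, and the sketch covers only a single numeric predictor column; how make.main handles the other models internally is not shown here.

```r
# Hypothetical helper (not part of BhGLM): replaces NA in one numeric
# predictor column by its expected value under the observed genotype
# frequencies of that marker.
fill_expected <- function(x) {
  obs <- x[!is.na(x)]
  if (length(obs) == 0) return(x)            # nothing observed, nothing to estimate
  freq <- table(obs) / length(obs)           # observed genotype frequencies
  # Expected value: sum over genotypes of (coded value * frequency);
  # this equals mean(obs)
  expected <- sum(as.numeric(names(freq)) * as.numeric(freq))
  x[is.na(x)] <- expected
  x
}

x <- c(0, 1, NA, 2, 1, NA, 0, 1)   # additive coding with two missing genotypes
fill_expected(x)
```

Because only the marker's own observed genotypes are used, every missing entry of a given marker receives the same value.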
This function removes markers with only one genotype or more than three genotypes, and for markers with only two genotypes, always uses genotype indicator variables.
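The marker filter described above can be approximated by counting the distinct observed genotypes per marker. The toy data frame and the variable names below are made up for illustration; this is a sketch of the rule as stated, not the function's actual implementation.

```r
# Toy genotype data (hypothetical): one column per marker, NA = missing
geno <- data.frame(
  m1 = c(0, 1, 2, 1, NA),  # three genotypes: kept
  m2 = c(1, 1, 1, 1, 1),   # one genotype: removed
  m3 = c(0, 1, 0, 1, 0),   # two genotypes: kept
  m4 = c(0, 1, 2, 3, 1)    # four genotypes: removed
)

# Number of distinct observed (non-missing) genotypes per marker
n_geno <- sapply(geno, function(g) length(unique(g[!is.na(g)])))

# Keep markers with two or three genotypes
keep <- n_geno >= 2 & n_geno <= 3
geno_kept <- geno[, keep, drop = FALSE]
names(geno_kept)
```

Using drop = FALSE keeps the result a data frame even when only one marker survives the filter.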
This function returns a data frame consisting of values of all main-effect predictors.
Nengjun Yi, nyi@uab.edu
read.cross, calc.genoprob, qb.genoprob
library(BhGLM)
x = sim.x(n=100, m=10, genotype=6:10)
geno = x[, 6:10] #get genotype data
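# main effects under the additive model (one predictor per marker)
x.g = make.main(geno=geno, model="additive", fill.missing=TRUE)
# main effects under the Cockerham model (two predictors per marker)
x.g = make.main(geno=geno, model="Cockerham", fill.missing=TRUE)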