read_geno | R Documentation |
Read marker genotype data
read_geno(
filename,
ploidy,
map,
min.minor.allele = 5,
w = 1e-05,
ped = NULL,
dominance = FALSE,
pop.file = NULL
)
filename | Name of CSV file with marker allele dosage |
ploidy | 2, 4, 6, etc. (even numbers); a named vector if populations differ in ploidy (see Details) |
map | TRUE/FALSE, whether the file includes chrom and position columns (see Details) |
min.minor.allele | threshold for marker filtering (see Details) |
w | blending parameter (see Details) |
ped | optional, pedigree data frame with 3 or 4 columns (see Details) |
dominance | TRUE/FALSE, whether to include dominance covariance (see Details) |
pop.file | optional, CSV file defining populations (see Details) |
When map=TRUE, the first three columns of the file are marker, chrom, position. When map=FALSE, the first column is marker. Subsequent columns contain the allele dosage for individuals/clones, coded 0,1,2,...,ploidy (fractional values are allowed). The input file for diploids can also be coded using -1,0,1 (fractional values allowed). Additive coefficients are computed by subtracting the population mean from each marker, and the additive (genomic) relationship matrix is computed as G = tcrossprod(coeff)/scale. The scale parameter ensures the mean of the diagonal elements of G equals 1 under panmictic equilibrium. Missing genotype data are replaced with the population mean.
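The computation of G described above can be sketched in base R. This is an illustration, not the package's internal code: the toy dosage matrix is invented, it is stored as individuals x markers (the transpose of the input file layout) so that tcrossprod gives an individual x individual matrix, and the scale ploidy * sum(p * (1 - p)) is an assumed VanRaden-style choice that the text above does not spell out.

```r
# Toy tetraploid dosage matrix (hypothetical data): rows = individuals, columns = markers
ploidy <- 4
geno <- matrix(c(0, 1, 2,
                 4, 2, NA,
                 2, 3, 1,
                 1, 0, 3),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("id", 1:4), paste0("m", 1:3)))

# Population mean dosage per marker, ignoring missing values
marker.mean <- colMeans(geno, na.rm = TRUE)

# Replace missing genotypes with the population mean of that marker
na.idx <- which(is.na(geno), arr.ind = TRUE)
geno[na.idx] <- marker.mean[na.idx[, "col"]]

# Additive coefficients: subtract the population mean from each marker
coeff <- sweep(geno, 2, marker.mean, "-")

# ASSUMPTION: scale = ploidy * sum(p * (1 - p)), with p the allele frequency
p <- marker.mean / ploidy
scale <- ploidy * sum(p * (1 - p))

# Additive (genomic) relationship matrix
G <- tcrossprod(coeff) / scale
```

Because each marker is centered, every row of G sums to zero, which is one reason G needs blending before it can be inverted.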
G can be blended with the pedigree relationship matrix (A) by providing a pedigree data frame in ped and blending parameter w. The blended relationship matrix is H = (1-w)G + wA. The first three columns of ped are id, parent1, parent2. Missing parents must be coded NA. An optional fourth column in binary (0/1) format can be used to indicate which ungenotyped individuals should be included in the H matrix, but this option cannot be combined with dominance. If there is no fourth column, only genotyped individuals are included. If a vector of w values is provided, the function returns a list of class_geno objects.
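As a small illustration of the blending formula H = (1-w)G + wA, the sketch below applies it to toy matrices for several values of w. The matrices and the helper name blend are hypothetical, and the result is a plain list of matrices, which only mimics the way a vector of w values yields a list of class_geno objects.

```r
ids <- c("a", "b", "c")

# Hypothetical genomic (G) and pedigree (A) relationship matrices
G <- matrix(c(1.05, 0.48, 0.02,
              0.48, 0.98, 0.10,
              0.02, 0.10, 0.97), 3, 3, dimnames = list(ids, ids))
A <- matrix(c(1.0, 0.5, 0.0,
              0.5, 1.0, 0.0,
              0.0, 0.0, 1.0), 3, 3, dimnames = list(ids, ids))

# Hypothetical helper: H = (1 - w) G + w A
blend <- function(G, A, w) (1 - w) * G + w * A

# One blended matrix per value of w
w <- c(0, 0.5, 1)
H.list <- lapply(w, blend, G = G, A = A)
names(H.list) <- paste0("w", w)
```

With w = 0 the result is G, with w = 1 it is A, and intermediate values interpolate linearly between the two.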
If the A matrix is not used, then G is blended with the identity matrix (times the mean diagonal of G) to improve numerical conditioning for matrix inversion. The default for w is 1e-5, which is somewhat arbitrary and based on tests with the vignette dataset. The D matrix is also blended with the identity matrix using 1e-5 for numerical conditioning.
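The effect of blending with the identity matrix can be seen on a deliberately singular toy matrix. This sketch assumes the same (1-w) weighting as the H formula, which the text does not state explicitly for this case; the matrix values are invented.

```r
# Two identical clones give a singular G (hypothetical example)
G <- matrix(1, 2, 2)
w <- 1e-5

# ASSUMPTION: same weighting as H, with A replaced by mean(diag(G)) * I
G.blend <- (1 - w) * G + w * mean(diag(G)) * diag(nrow(G))

# The unblended matrix cannot be inverted; the blended one can
G.fail <- try(solve(G), silent = TRUE)
G.inv <- solve(G.blend)
```

Under this weighting the mean diagonal of the blended matrix is unchanged, so the scaling of G is preserved.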
When dominance=FALSE, non-additive effects are captured using a residual genetic effect, with zero covariance. If dominance=TRUE, a (digenic) dominance covariance matrix is used instead.
The argument min.minor.allele specifies the minimum number of individuals that must contain the minor allele. Markers that do not meet this threshold are discarded.
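A minimal sketch of this kind of filter, using the input file layout (markers in rows, individuals in columns). The helper count.minor and the toy data are hypothetical, and the treatment of fractional dosages (any nonzero amount counts as containing the allele) is an assumption rather than documented behavior.

```r
ploidy <- 2
min.minor.allele <- 2

# Hypothetical diploid dosages: rows = markers, columns = individuals
geno <- matrix(c(0, 0, 0, 0, 0, 1,
                 0, 1, 2, 1, 0, 2,
                 2, 2, 2, 1, 2, 1),
               nrow = 3, byrow = TRUE,
               dimnames = list(paste0("m", 1:3), paste0("id", 1:6)))

# Number of individuals carrying at least some dosage of the minor allele
count.minor <- function(x, ploidy) {
  p <- mean(x, na.rm = TRUE) / ploidy   # frequency of the counted allele
  if (p <= 0.5) {
    sum(x > 0, na.rm = TRUE)            # counted allele is the minor allele
  } else {
    sum(x < ploidy, na.rm = TRUE)       # the other allele is the minor allele
  }
}

n.minor <- apply(geno, 1, count.minor, ploidy = ploidy)

# Keep markers that meet the threshold
geno.filtered <- geno[n.minor >= min.minor.allele, , drop = FALSE]
```

Here marker m1 is discarded because only one individual carries its minor allele.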
Optional argument pop.file gives the name of a CSV file with two columns: id,pop. If the populations have different ploidy, this is indicated using a named vector for ploidy.
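For illustration, the inputs for a mixed-ploidy analysis might be prepared as follows. The ids and population names are invented, and it is an assumption that the names of the ploidy vector should match the values in the pop column.

```r
# Hypothetical population file with the two required columns: id, pop
pop <- data.frame(id = c("cloneA", "cloneB", "cloneC", "cloneD"),
                  pop = c("dip", "dip", "tet", "tet"))
pop.file <- tempfile(fileext = ".csv")
write.csv(pop, pop.file, row.names = FALSE)

# ASSUMPTION: names of the ploidy vector are the population names
ploidy <- c(dip = 2, tet = 4)

# Check that every population in the file has a ploidy value
pop.check <- read.csv(pop.file)
all.defined <- all(pop.check$pop %in% names(ploidy))
```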
Variable of class class_geno.