marker_h2_means: Compute a marker-based estimate of heritability, given...

View source: R/marker_h2_means.r

marker_h2_meansR Documentation

Compute a marker-based estimate of heritability, given genotypic means.

Description

Given a genetic relatedness matrix and genotypic means, this function computes REML-estimates of the genetic and residual variance and their standard errors, using the AI-algorithm (Gilmour et al. 1995). Based on this, heritability estimates and confidence intervals are given (the estimator h_m^2 in Kruijer et al.).

Usage

marker_h2_means(data.vector, geno.vector, K, Dm=NULL, alpha = 0.05, eps = 1e-06,
       max.iter = 100, fix.h2 = FALSE, h2 = 0.5, grid.size=99)

Arguments

data.vector

A vector of phenotypic observations, typically genotypic means. Needs to be of type numeric. May contain missing values.

geno.vector

A vector of genotype labels, either a factor or character. This vector should correspond to data.vector, and hence needs to be of the same length.

K

A genetic relatedness or kinship matrix, typically marker-based. Must have row- and column-names corresponding to the levels of geno.vector

Dm

Covariance of the genotypic means contained in data.vector; see details. Should be of class matrix, with row- and column-names corresponding to the levels of geno.vector

alpha

Confidence level, for the 1-alpha confidence intervals.

eps

Numerical precision, used as convergence criterion in the AI-algorithm.

max.iter

Maximal number of iterations in the AI-algorithm.

fix.h2

Compute the log-likelihood and inverse AI-matrix for a fixed heritability value. Default is FALSE.

h2

When fix.h2 is TRUE, the value of the heritability. Must be of type numeric, between 0 and 1.

grid.size

If the AI-algorithm has not converged after max.iter iterations, the likelihood is computed on the grid of heritability values 1/(grid.size+1),...,grid.size/(grid.size+1); see details.

Details

  • Given phenotypic observations Y_{i} for genotypes i=1,...,n, the mixed model Y_{i} = \mu + G_i + E_{i} is assumed. Typically, the Y_{i} are genotypic means or BLUEs obtained from fitting a linear (mixed) model to the raw data, containing several plants or plots for each genotype. The vector of additive genetic effects (G_1,...,G_n)' follows a multivariate normal distribution with mean zero and covariance \sigma_A^2 K, where \sigma_A^2 is the additive genetic variance, and K is a genetic relatedness matrix derived from a dense set of markers. The vector of errors (E_1,...,E_n)' follows a multivariate normal distribution with mean zero and covariance \sigma_E^2 D_m, where D_m is the covariance of the means obtained from the initial analysis. In case of a completely randomized design with r_i replicates for genotypes i=1,...,n, D_m is diagonal with elements 1 / r_i. Under certain assumptions (see Speed et al. 2012) the marker- or chip-heritability h^2 = \sigma_A^2 / (\sigma_A^2 + \sigma_E^2) equals the narrow-sense heritability.

  • As in the marker_h2 function, it is assumed that the genetic relatedness matrix K is scaled such that trace(P K P) = n - 1, where P is the projection matrix I_n - 1_n 1_n' / n, for the identity matrix I_n and 1_n being a column vector of ones. If this is not the case, K is automatically scaled prior to fitting the mixed model.

  • No covariates can be included, as it is assumed that these are available at plant- or plot level, and accounted for in the genotypic means.

  • The resulting heritability estatimes are less accurate than those obtained from individual plant or plot data, and the likelihood can be monotone in h^2 = \sigma_A^2 / (\sigma_A^2 + \sigma_E^2). If the AI-algorithm has not converged after max.iter iterations, the likelihood is computed on the grid of heritability values 1/(grid.size+1),...,grid.size/(grid.size+1)

  • As in the marker_h2 function, confidence intervals for heritability are constructed using the delta-method and the inverse AI-matrix. The delta-method can be applied either directly to the function (\sigma_A^2,\sigma_E^2) -> \sigma_A^2 / (\sigma_A^2 + \sigma_E^2) or to the function (\sigma_A^2,\sigma_E^2) -> log(\sigma_A^2 / \sigma_E^2). In the latter case, a confidence interval for log(\sigma_A^2 / \sigma_E^2) is obtained, which is back-transformed to a confidence interval for heritability. This approach (proposed in Kruijer et al.) has the advantage that intervals are always contained in the unit interval.

Value

A list with the following components:

  • va: REML-estimate of the (additive) genetic variance.

  • ve: REML-estimate of the residual variance.

  • h2: Plug-in estimate of heritability: va / (va + ve).

  • conf.int1: 1-alpha confidence interval for heritability.

  • conf.int2: 1-alpha confidence interval for heritability, obtained by application of the delta method on a logarithmic scale.

  • inv.ai: The inverse of the average information (AI) matrix.

  • loglik: The log-likelihood.

  • loglik.vector: Empty numeric vector if the AI-algorthm converged within max.iter iterations. Otherwise it contains the log-likelihood on a grid.

Author(s)

Willem Kruijer.

References

  • Gilmour et al. Gilmour, A.R., R. Thompson and B.R. Cullis (1995) Average Information REML: An Efficient Algorithm for Variance Parameter Estimation in Linear Mixed Models. Biometrics, volume 51, number 4, 1440-1450.

  • Kruijer, W. et al. (2015) Marker-based estimation of heritability in immortal populations. Genetics, Vol. 199(2), p. 1-20.

  • Speed, D., G. Hemani, M. R. Johnson, and D.J. Balding (2012) Improved heritability estimation from genome-wide snps. the American journal of human genetics 91: 1011-1021.

See Also

For marker-based estimation of heritability using individual plant or plot data, see marker_h2.

Examples

data(means_LDV)
data(R_matrix_LDV)
data(K_atwell)
out <- marker_h2_means(data.vector=means_LDV$LDV,geno.vector=means_LDV$genotype,
                       K=K_atwell,Dm=R_matrix_LDV)
# Takes about a minute:
#data(means_LD)
#data(R_matrix_LD)
#out <- marker_h2_means(data.vector=means_LD$LD,geno.vector=means_LD$genotype,
#                       K=K_atwell,Dm=R_matrix_LD)
# The likelihood is monotone increasing:
#plot(x=(1:99)/100,y=out$loglik.vector,type="l",ylab="log-likelihood",lwd=2,
#     main='',xlab='h2',cex.lab=2,cex.axis=2.5)

heritability documentation built on Aug. 24, 2023, 9:08 a.m.