marker_h2: Compute a marker-based estimate of heritability, given...

View source: R/marker_h2.R

marker_h2R Documentation

Compute a marker-based estimate of heritability, given phenotypic observations at individual plant or plot level.

Description

Given a genetic relatedness matrix and phenotypic observations at individual plant or plot level, this function computes REML-estimates of the genetic and residual variance and their standard errors, using the AI-algorithm (Gilmour et al. 1995). Based on this, heritability estimates and confidence intervals are given (the estimator h_r^2 in Kruijer et al.).

Usage

marker_h2(data.vector, geno.vector, covariates = NULL, K, alpha = 0.05,
          eps = 1e-06, max.iter = 100, fix.h2 = FALSE, h2 = 0.5)

Arguments

data.vector

A vector of phenotypic observations. Needs to be of type numeric. May contain missing values.

geno.vector

A vector of genotype labels, either a factor or character. This vector should correspond to data.vector, and hence needs to be of the same length.

covariates

A data-frame or matrix with optional covariates, the rows corresponding to the phenotypic observations in data.vector and geno.vector. May contain missing values. Factors are not allowed, and need to be encoded by columns of type numeric or integer. The data-frame or matrix should not contain an intercept, which is included by default.

K

A genetic relatedness or kinship matrix, typically marker-based. Must have row- and column-names corresponding to the levels of geno.vector

alpha

Confidence level, for the 1-alpha confidence intervals.

eps

Numerical precision, used as convergence criterion in the AI-algorithm.

max.iter

Maximal number of iterations in the AI-algorithm.

fix.h2

Compute the log-likelihood and inverse AI-matrix for a fixed heritability value. Default is FALSE.

h2

When fix.h2 is TRUE, the value of the heritability. Must be of type numeric, between 0 and 1.

Details

  • Given phenotypic observations Y_{ij} for genotypes i=1,...,n and replicates j = 1,...,n_i, the mixed model Y_{ij} = \mu + G_i + E_{ij} is assumed. The vector of additive genetic effects (G_1,...,G_n)' follows a multivariate normal distribution with mean zero and covariance \sigma_A^2 K, where \sigma_A^2 is the additive genetic variance, and K is a genetic relatedness matrix derived from a dense set of markers. The errors E_{ij} are independent and normally distributed with variance \sigma_E^2. Under certain assumptions (see Speed et al. 2012) the marker- or chip-heritability h^2 = \sigma_A^2 / (\sigma_A^2 + \sigma_E^2) equals the narrow-sense heritability.

  • It is assumed that the genetic relatedness matrix K is scaled such that trace(P K P) = n - 1, where P is the projection matrix I_n - 1_n 1_n' / n, for the identity matrix I_n and 1_n being a column vector of ones. If this is not the case, K is automatically scaled prior to fitting the mixed model.

  • The model can optionally include a term X_{ij} \beta, where X_{ij} is the row vector with observations on k extra covariates and the vector \beta contains their effects. In this case the argument covariates should be the (N x k) matrix or data-frame with rows X_{ij} (N being the total number of observations). Observations where either Y_{ij} or any of the covariates is missing are discarded.

  • Confidence intervals for heritability are constructed using the delta-method and the inverse AI-matrix. The delta-method can be applied either directly to the function (\sigma_A^2,\sigma_E^2) -> \sigma_A^2 / (\sigma_A^2 + \sigma_E^2) or to the function (\sigma_A^2,\sigma_E^2) -> log(\sigma_A^2 / \sigma_E^2). In the latter case, a confidence interval for log(\sigma_A^2 / \sigma_E^2) is obtained, which is back-transformed to a confidence interval for heritability. This approach (proposed in Kruijer et al.) has the advantage that intervals are always contained in the unit interval.

  • The AI-algorithm is run for max.iter iterations. If by then there is no convergence a warning is printed and the current estimates are returned.

Value

A list with the following components:

  • va: REML-estimate of the (additive) genetic variance.

  • ve: REML-estimate of the residual variance.

  • h2: Plug-in estimate of heritability: va / (va + ve).

  • conf.int1: 1-alpha confidence interval for heritability.

  • conf.int2: 1-alpha confidence interval for heritability, obtained by application of the delta method on a logarithmic scale.

  • inv.ai: The inverse of the average information (AI) matrix.

  • loglik: The log-likelihood.

Author(s)

Willem Kruijer.

References

  • Gilmour et al. Gilmour, A.R., R. Thompson and B.R. Cullis (1995) Average Information REML: An Efficient Algorithm for Variance Parameter Estimation in Linear Mixed Models. Biometrics, volume 51, number 4, 1440-1450.

  • Kruijer, W. et al. (2015) Marker-based estimation of heritability in immortal populations. Genetics, Vol. 199(2), p. 1-20.

  • Speed, D., G. Hemani, M. R. Johnson, and D.J. Balding (2012) Improved heritability estimation from genome-wide snps. the American journal of human genetics 91: 1011-1021.

See Also

For marker-based estimation of heritability using genotypic means, see marker_h2_means.

Examples

data(LD)
data(K_atwell)
# Heritability estimation for all observations:
#out <- marker_h2(data.vector=LD$LD,geno.vector=LD$genotype,
#                 covariates=LD[,4:8],K=K_atwell)
# Heritability estimation for a randomly chosen subset of 20 accessions:
set.seed(123)
sub.set <- which(LD$genotype %in% sample(levels(LD$genotype),20))
out <- marker_h2(data.vector=LD$LD[sub.set],geno.vector=LD$genotype[sub.set],
                 covariates=LD[sub.set,4:8],K=K_atwell)

heritability documentation built on Aug. 24, 2023, 9:08 a.m.