kinship: Relatedness based on pedigree or marker data
In synbreed: Framework for the Analysis of Genomic Prediction Data using R


This function implements different measures of relatedness between individuals in an object of class gpData: (1) expected relatedness based on pedigree and (2) realized relatedness based on marker data. The first argument is an object of class gpData; the argument ret selects the type of relatedness coefficient. See 'Details'.


kin(gpData, ret=c("add","kin","dom","gam","realized","realizedAB",
                  "sm","sm-smin","gaussian"),
            DH=NULL, maf=NULL, selfing=NULL, lambda=1, P=NULL, cores=1)

`gpData`	object of class `gpData`
`ret`	`character`. The type of relationship matrix to be returned. See 'Details'.
`DH`	`logical` vector of length n. `TRUE` or 1 if the individual is a doubled-haploid (DH) line and `FALSE` or 0 otherwise. This option is only used if the `ret` argument is `"add"` or `"kin"`.
`maf`	`numeric` vector of length equal to the number of markers. Supplies the values of pi for each marker, which are used to correct the allele counts in `ret="realized"` and `ret="realizedAB"`. If not specified, pi equals the minor allele frequency of each locus.
`selfing`	`numeric` vector of length n giving the number of selfing generations of each recombinant inbred line. Be aware that this should only be used for lines derived by single-seed descent. This option is only used if the `ret` argument is `"add"` or `"kin"`.
`lambda`	`numeric` bandwidth parameter for the gaussian kernel. Only used for calculating the gaussian kernel.
`P`	`numeric` matrix of the same dimension as `geno` of the `gpData` object. This option can be used to supply user-defined allele frequencies for different groups in the genotypes.
`cores`	`numeric`. The number of cores to use.

Pedigree based relatedness (return arguments "add", "kin", "dom", and "gam")

Function kin provides different types of measures for pedigree based relatedness. An element pedigree must be available in the object of class gpData. In all cases, the first step is to build the gametic relationship. The gametic relationship is of order 2n as each individual has two alleles (e.g. individual A has alleles A1 and A2). The gametic relationship is defined as the matrix of probabilities that two alleles are identical by descent (IBD). Note that the diagonal elements of the gametic relationship matrix are 1. The off-diagonals of individuals with unknown or unrelated parents in the pedigree are 0. If ret="gam" is specified, the gametic relationship matrix constructed by pedigree is returned.

The gametic relationship matrix can be used to construct other types of relationship matrices. If ret="add", the additive numerator relationship matrix is returned. The additive relationship of individuals A (alleles A1,A2) and B (alleles B1,B2) is given by the entries of the gametic relationship matrix

0.5*[(A1,B1) + (A1,B2) + (A2,B1) + (A2,B2)],

where (A1,B1) denotes the element [A1,B1] in the gametic relationship matrix. If ret="kin", the kinship matrix is returned which is half of the additive relationship matrix.
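
The block formula above can be checked by hand on a tiny example. The following is a minimal base-R sketch, not the code used inside kin: the pedigree (two unrelated, non-inbred parents P1 and P2 with full sibs O1 and O2), the allele labels, and the helper `set_ibd` are hypothetical and only serve the illustration. The gametic matrix is filled in manually and then collapsed to the additive and kinship matrices.

```r
# Hypothetical pedigree: P1 x P2 -> full sibs O1, O2.
# Each individual has two alleles; for the offspring, allele "a" is inherited
# from P1 and allele "b" from P2 (an assumption made for this illustration).
ids <- c("P1", "P2", "O1", "O2")
n <- length(ids)
alleles <- paste0(rep(ids, each = 2), c("a", "b"))

# Gametic relationship matrix of order 2n: diagonal 1, unrelated alleles 0
G <- diag(2 * n)
dimnames(G) <- list(alleles, alleles)

# Hypothetical helper: set a symmetric pair of IBD probabilities
set_ibd <- function(G, x, y, p) {
  G[x, y] <- p
  G[y, x] <- p
  G
}

for (o in c("O1", "O2")) {
  # An offspring allele is IBD with each allele of the transmitting parent
  # with probability 0.5
  for (pa in c("P1a", "P1b")) G <- set_ibd(G, paste0(o, "a"), pa, 0.5)
  for (pa in c("P2a", "P2b")) G <- set_ibd(G, paste0(o, "b"), pa, 0.5)
}
# Alleles that two full sibs received from the same parent
G <- set_ibd(G, "O1a", "O2a", 0.5)
G <- set_ibd(G, "O1b", "O2b", 0.5)

# 0.5 * [(A1,B1) + (A1,B2) + (A2,B1) + (A2,B2)] for all pairs at once:
# S sums the two allele rows/columns that belong to each individual
S <- kronecker(diag(n), matrix(1, 2, 1))   # 2n x n indicator matrix
A <- 0.5 * t(S) %*% G %*% S
dimnames(A) <- list(ids, ids)

K <- A / 2   # kinship matrix is half of the additive relationship matrix
print(A)
print(K)
```

In this sketch the parent-offspring and full-sib entries of A are 0.5, unrelated parents have 0, and the diagonal is 1 because nobody is inbred.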

If ret="dom", the dominance relationship matrix is returned. The dominance relationship matrix between individuals A (A1,A2) and B (B1,B2) in case of no inbreeding is given by

[(A1,B1) * (A2,B2) + (A1,B2) * (A2,B1)],

where (A1,B1) denotes the element [A1,B1] in the gametic relationship matrix.
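
As an illustration of the dominance formula, here is a small base-R sketch for two non-inbred full sibs. It is an assumption-laden toy example rather than the package's implementation: the gametic matrix is written down by hand, with allele "a" of each sib taken to come from one parent and allele "b" from the other.

```r
# Gametic relationship matrix for two hypothetical full sibs O1 and O2.
# Alleles inherited from the same parent are IBD with probability 0.5.
ids <- c("O1", "O2")
n <- length(ids)
alleles <- paste0(rep(ids, each = 2), c("a", "b"))
G <- diag(2 * n)
dimnames(G) <- list(alleles, alleles)
G["O1a", "O2a"] <- G["O2a", "O1a"] <- 0.5
G["O1b", "O2b"] <- G["O2b", "O1b"] <- 0.5

# Dominance relationship: (A1,B1)*(A2,B2) + (A1,B2)*(A2,B1)
D <- matrix(0, n, n, dimnames = list(ids, ids))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    a <- 2 * i - c(1, 0)   # row indices of the two alleles of individual i
    b <- 2 * j - c(1, 0)   # row indices of the two alleles of individual j
    D[i, j] <- G[a[1], b[1]] * G[a[2], b[2]] +
               G[a[1], b[2]] * G[a[2], b[1]]
  }
}
print(D)
```

The off-diagonal entry is 0.25, the familiar dominance relationship of full sibs, and the diagonal is 1 in the absence of inbreeding.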

Marker based relatedness (return arguments "realized","realizedAB", "sm", and "sm-smin")

Function kin provides different types of measures for marker based relatedness. An element geno must be available in the object of class gpData. Furthermore, genotypes must be coded by the number of copies of the minor allele, i.e. function codeGeno must be applied in advance.

If ret="realized", the realized relatedness between individuals is computed according to the formulas in Habier et al. (2007) or VanRaden (2008)

ZZ'/(2∑ pi(1-pi))

where Z=W-P, W is the marker matrix, P contains the allele frequencies multiplied by 2, pi is the allele frequency of marker i, and the sum is over all loci.
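
The formula can be written out in a few lines of base R. This is only a sketch under stated assumptions: the marker matrix `W` is a made-up example (4 individuals, 5 markers, coded as minor-allele counts), and the allele frequencies are estimated from `W` itself instead of being supplied through maf or P.

```r
# Hypothetical marker matrix: rows = individuals, columns = markers,
# entries = number of copies of the minor allele (0, 1 or 2)
W <- matrix(c(0, 1, 2, 1,
              2, 0, 1, 1,
              0, 0, 1, 2,
              1, 0, 2, 0,
              0, 1, 0, 1), nrow = 4)

p <- colMeans(W) / 2              # allele frequency of each marker
Z <- sweep(W, 2, 2 * p)           # Z = W - P, with P holding 2 * p per column
U <- tcrossprod(Z) / (2 * sum(p * (1 - p)))   # ZZ' / (2 * sum(p_i (1 - p_i)))

print(round(U, 3))
```

Because every column of Z is centered, each row of U sums to zero, which is a quick sanity check for a hand-rolled version like this one.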

If ret="realizedAB", the realized relatedness between individuals is computed according to the formula in Astle and Balding (2009)

1/M sum((wi-2pi)(wi-2pi)'/(2pi(1-pi)))

where wi is the vector of marker genotypes at locus i, pi is the allele frequency at marker locus i, M is the number of marker loci, and the sum is over all loci.
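
A base-R sketch of this estimator is given below. It assumes a made-up marker matrix `W` without monomorphic markers (otherwise the per-locus divisor 2pi(1-pi) would be zero) and allele frequencies estimated from `W`. The sum over loci is computed once with an explicit loop and once in matrix form, so the two can be compared.

```r
# Hypothetical marker matrix: rows = individuals, columns = markers
W <- matrix(c(0, 1, 2, 1,
              2, 0, 1, 1,
              0, 0, 1, 2,
              1, 0, 2, 0,
              0, 1, 0, 1), nrow = 4)
M <- ncol(W)
p <- colMeans(W) / 2              # allele frequencies, all strictly in (0, 1)

# Explicit sum over loci: (w_i - 2 p_i)(w_i - 2 p_i)' / (2 p_i (1 - p_i))
U_loop <- matrix(0, nrow(W), nrow(W))
for (i in seq_len(M)) {
  z <- W[, i] - 2 * p[i]
  U_loop <- U_loop + tcrossprod(z) / (2 * p[i] * (1 - p[i]))
}
U_loop <- U_loop / M

# Equivalent matrix form: standardize each column, then take the cross-product
Zs <- sweep(sweep(W, 2, 2 * p), 2, sqrt(2 * p * (1 - p)), "/")
U_AB <- tcrossprod(Zs) / M

print(round(U_AB, 3))
```

Compared with ret="realized", each locus is weighted by its own variance here, so markers with rare alleles contribute more to the estimate.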

If ret="sm", the realized relatedness between individuals is computed according to the simple matching coefficient (Reif et al. 2005). The simple matching coefficient counts the number of shared alleles across loci. It can only be applied to homozygous inbred lines, i.e. only genotypes 0 and 2. To account for loci that are alike in state but not identical by descent (IBD), Hayes and Goddard (2008) correct the simple matching coefficient by the minimum of observed simple matching coefficients

(s-smin)/(1-smin)

where s is the matrix of simple matching coefficients. This formula is used with argument ret="sm-smin".
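
Both coefficients can be sketched in base R for a small set of homozygous lines. The marker matrix below is invented for the illustration, and the simple matching coefficient is taken here as the proportion of loci at which two lines carry the same genotype; this is one common reading of the definition, not necessarily the exact computation inside kin.

```r
# Hypothetical inbred lines: rows = lines, columns = markers, genotypes 0 or 2
W <- matrix(c(0, 2, 2, 0,
              0, 0, 2, 2,
              2, 2, 0, 0,
              0, 2, 0, 2,
              2, 2, 2, 0,
              0, 0, 0, 2), nrow = 4)
M <- ncol(W)

# Recode to -1/+1: the cross-product then counts matches minus mismatches
X <- W - 1
s <- (tcrossprod(X) / M + 1) / 2     # proportion of matching loci

# Correction by the minimum observed coefficient
smin <- min(s)
s_corr <- (s - smin) / (1 - smin)

print(round(s, 3))
print(round(s_corr, 3))
```

After the correction the least similar pair has coefficient 0 and the diagonal stays at 1; the correction is undefined if all lines are identical (smin = 1).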

If ret="gaussian", the Euclidean distances distEuk between all individuals are calculated. The values of distEuk are then used to calculate similarity coefficients between the individuals with exp(-distEuk^2/numMarker). Be aware that this relationship matrix theoretically scales between 0 and 1!
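
A Gaussian kernel of this kind can be sketched in base R as follows. Several things here are assumptions rather than documented behavior of kin: the exponent is taken to be negative (as required for values between 0 and 1), the bandwidth `lambda` is assumed to enter as a simple multiplier of the scaled squared distance, no marker standardization is applied, and the marker matrix `W` is a made-up example.

```r
# Hypothetical marker matrix: rows = individuals, columns = markers
W <- matrix(c(0, 1, 2, 1,
              2, 0, 1, 1,
              0, 0, 1, 2,
              1, 0, 2, 0,
              0, 1, 0, 1), nrow = 4)
numMarker <- ncol(W)
lambda <- 1   # bandwidth; its placement in the exponent is an assumption

# Pairwise Euclidean distances between individuals (rows of W)
distEuk <- as.matrix(dist(W, method = "euclidean"))

# Gaussian kernel: similarity decays with squared distance
K <- exp(-lambda * distEuk^2 / numMarker)

print(round(K, 3))
```

Identical individuals get similarity 1, and the similarity approaches 0 as the distance grows; a larger `lambda` makes the decay steeper in this sketch.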

An object of class "relationshipMatrix".

Valentin Wimmer and Theresa Albrecht, with contributions by Yvonne Badke

Habier D, Fernando R, Dekkers J (2007). The Impact of Genetic Relationship information on Genome-Assisted Breeding Values. Genetics, 177, 2389 – 2397.

VanRaden, P. (2008). Efficient methods to compute genomic predictions. Journal of Dairy Science, 91:4414 – 4423.

Astle, W., and D.J. Balding (2009). Population Structure and Cryptic Relatedness in Genetic Association Studies. Statistical Science, 24(4), 451 – 471.

Reif, J. C., A. E. Melchinger, and M. Frisch (2005). Genetical and mathematical properties of similarity and dissimilarity coefficients applied in plant breeding and seed bank management. Crop Science, 45(1), 1 – 7.

Rogers, J., 1972 Measures of genetic similarity and genetic distance. In Studies in genetics VII, volume 7213. Univ. of Texas, Austin

Hayes, B. J., and M. E. Goddard. 2008. Technical note: Prediction of breeding values using marker derived relationship matrices. J. Anim. Sci. 86

plot.relationshipMatrix

#=========================
# (1) pedigree based relatedness
#=========================
## Not run: 
library(synbreedData)
data(maize)
K <- kin(maize,ret="kin")
plot(K)

## End(Not run)

#=========================
# (2) marker based relatedness
#=========================
## Not run: 
data(maize)
U <- kin(codeGeno(maize),ret="realized")
plot(U)

## End(Not run)


### Example for Legarra et al. (2009), J. Dairy Sci. 92: p. 4660
id <- 1:17
par1 <- c(0,0,0,0,0,0,0,0,1,3,5,7,9,11,4,13,13)
par2 <- c(0,0,0,0,0,0,0,0,2,4,6,8,10,12,11,15,14)
ped <- create.pedigree(id,par1,par2)
gp <- create.gpData(pedigree=ped)

# additive relationship
A <- kin(gp,ret="add")
# dominance relationship
D <- kin(gp,ret="dom")