Compute pairwise linkage disequilibrium between genetic markers
g1 | genotype object or data frame containing genotype objects
g2 | genotype object (ignored if g1 is a data frame)
... | optional arguments (ignored)
Linkage disequilibrium (LD) is the non-random association of marker alleles and can arise from marker proximity or from selection bias.
LD.genotype estimates the extent of LD for a single pair of genotypes. LD.data.frame computes LD for all pairs of genotypes contained in a data frame. Before starting, LD.data.frame checks the class and number of alleles of each variable in the data frame. If the data frame contains non-genotype objects or genotypes with more or fewer than 2 alleles, these are omitted from the computation and a warning is generated.
Three estimators of LD are computed:
D raw difference between the observed frequency of AB pairs and the frequency expected under independence:
D = p(AB) - p(A)*p(B)
D' scaled D spanning the range [-1,1]
D' = D / Dmax
where, if D > 0:
Dmax = min( p(A)p(b), p(a)p(B) )
or if D < 0:
Dmax = max( -p(A)p(B), -p(a)p(b) )
r correlation coefficient between the markers
r = D / sqrt( p(A) * p(a) * p(B) * p(b) )
where
- p(A) is defined as the observed probability of allele 'A' for marker 1,
- p(a) = 1 - p(A) is defined as the observed probability of allele 'a' for marker 1,
- p(B) is defined as the observed probability of allele 'B' for marker 2,
- p(b) = 1 - p(B) is defined as the observed probability of allele 'b' for marker 2, and
- p(AB) is defined as the probability of the marker allele pair 'AB'.
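As a minimal sketch of the three definitions, the base R snippet below evaluates D, D' and r from known allele and haplotype frequencies. The helper name ld_from_freqs is hypothetical and not part of the package, and r is computed with the conventional sign, D / sqrt(...), which is an assumption about the intended definition.

```r
# Hypothetical helper (not part of the genetics package): evaluates the three
# LD measures from p(AB), p(A) and p(B) using the definitions given above.
ld_from_freqs <- function(pAB, pA, pB) {
  pa <- 1 - pA
  pb <- 1 - pB

  # Raw disequilibrium: observed minus expected frequency of the AB pair
  D <- pAB - pA * pB

  # Dmax depends on the sign of D
  if (D > 0) {
    Dmax <- min(pA * pb, pa * pB)
  } else {
    Dmax <- max(-pA * pB, -pa * pb)
  }
  Dprime <- if (D == 0) 0 else D / Dmax

  # Correlation coefficient, conventional sign (assumption)
  r <- D / sqrt(pA * pa * pB * pb)

  list(D = D, Dprime = Dprime, r = r)
}

# Illustrative frequencies only
est <- ld_from_freqs(pAB = 0.5, pA = 0.6, pB = 0.7)
print(est)
```

With these illustrative inputs D is positive, so the first form of Dmax applies.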
For genotype data, AB/ab cannot be distinguished from aB/Ab. Consequently, we estimate p(AB) using maximum likelihood and use this value in the computations.
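The maximum likelihood step can be sketched in base R as follows. This is not the package's implementation: it assumes random mating (Hardy-Weinberg proportions for pairs of haplotypes), takes a 3 x 3 table of two-locus genotype counts, fixes p(A) and p(B) at their observed values, and maximizes the likelihood over p(AB) with optimize(). The function name estimate_pAB and the example counts are hypothetical.

```r
# Hypothetical helper (not the genetics package's code).
# counts[i, j] = number of individuals carrying (i - 1) copies of allele 'A'
# at marker 1 and (j - 1) copies of allele 'B' at marker 2.
estimate_pAB <- function(counts) {
  n <- sum(counts)

  # Observed allele frequencies from the marginal genotype counts
  pA <- sum(rowSums(counts) * c(0, 1, 2)) / (2 * n)
  pB <- sum(colSums(counts) * c(0, 1, 2)) / (2 * n)

  # Log-likelihood of the genotype table as a function of p(AB),
  # assuming haplotypes pair at random
  loglik <- function(pAB) {
    pAb <- pA - pAB
    paB <- pB - pAB
    pab <- 1 - pA - pB + pAB
    probs <- matrix(c(
      pab^2,         2 * paB * pab,                 paB^2,
      2 * pAb * pab, 2 * pAB * pab + 2 * pAb * paB, 2 * pAB * paB,
      pAb^2,         2 * pAB * pAb,                 pAB^2
    ), nrow = 3, byrow = TRUE)
    keep <- counts > 0
    sum(counts[keep] * log(probs[keep]))
  }

  # p(AB) is bounded by the allele frequencies
  lower <- max(0, pA + pB - 1)
  upper <- min(pA, pB)
  fit <- optimize(loglik, interval = c(lower, upper), maximum = TRUE)

  list(pA = pA, pB = pB, pAB = fit$maximum)
}

# Illustrative table built from haplotype frequencies
# p(AB) = 0.5, p(Ab) = 0.1, p(aB) = 0.2, p(ab) = 0.2 for 100 individuals
counts <- matrix(c( 4,  8,  4,
                    4, 24, 20,
                    1, 10, 25), nrow = 3, byrow = TRUE)
fit <- estimate_pAB(counts)
print(fit)
```

The double heterozygote cell (the middle entry) is the only one whose probability mixes the AB/ab and Ab/aB phases, which is why p(AB) has to be estimated rather than counted. Because the example table was built from p(AB) = 0.5, the estimate should land near that value.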
LD.genotype returns a list with the following elements:
call | the matched call
D | Linkage disequilibrium estimate
Dprime | Scaled linkage disequilibrium estimate
corr | Correlation coefficient
nobs | Number of observations
chisq | Chi-square statistic for linkage equilibrium (i.e., D=D'=corr=0)
p.value | Chi-square p-value for marker independence
LD.data.frame returns a list with the same elements, but each element is a matrix where the upper off-diagonal elements contain the estimate for the corresponding pair of markers. The other matrix elements are NA.
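Because only the upper off-diagonal entries are filled, it can be convenient to flatten such a matrix into one row per marker pair. The base R sketch below does this for a hypothetical matrix shaped like one element of the result. The values are made up for illustration and are not output from LD.

```r
# Hypothetical 3 x 3 matrix shaped like one element of the LD.data.frame
# result: estimates above the diagonal, NA elsewhere.
markers <- c("g1", "g2", "g3")
Dprime <- matrix(NA_real_, 3, 3, dimnames = list(markers, markers))
Dprime["g1", "g2"] <- 0.25   # made-up values for illustration only
Dprime["g1", "g3"] <- 0.50
Dprime["g2", "g3"] <- 0.75

# Row/column positions of the upper off-diagonal cells
idx <- which(upper.tri(Dprime), arr.ind = TRUE)

# One row per marker pair
pair_table <- data.frame(
  marker1 = rownames(Dprime)[idx[, "row"]],
  marker2 = colnames(Dprime)[idx[, "col"]],
  Dprime  = Dprime[idx]
)
print(pair_table)
```

The same indexing could be applied to the other matrix elements of the list, assuming they share the same layout as described above.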
Gregory R. Warnes greg@warnes.net
library(genetics)
g1 <- genotype( c('T/A', NA, 'T/T', NA, 'T/A', NA, 'T/T', 'T/A',
'T/T', 'T/T', 'T/A', 'A/A', 'T/T', 'T/A', 'T/A', 'T/T',
NA, 'T/A', 'T/A', NA) )
g2 <- genotype( c('C/A', 'C/A', 'C/C', 'C/A', 'C/C', 'C/A', 'C/A', 'C/A',
'C/A', 'C/C', 'C/A', 'A/A', 'C/A', 'A/A', 'C/A', 'C/C',
'C/A', 'C/A', 'C/A', 'A/A') )
g3 <- genotype( c('T/A', 'T/A', 'T/T', 'T/A', 'T/T', 'T/A', 'T/A', 'T/A',
'T/A', 'T/T', 'T/A', 'T/T', 'T/A', 'T/A', 'T/A', 'T/T',
'T/A', 'T/A', 'T/A', 'T/T') )
# Compute LD on a single pair
LD(g1,g2)
# Compute LD table for all 3 genotypes
data <- makeGenotypes(data.frame(g1,g2,g3))
LD(data)