calc_pairwise_ld | R Documentation |
calc_pairwise_ld
Calculates LD between each pair of SNPs.
calc_pairwise_ld(
x,
facets = NULL,
subfacets = NULL,
ss = FALSE,
par = FALSE,
CLD = "only",
use.ME = FALSE,
sigma = 1e-04,
window_sigma = NULL,
window_step = window_sigma * 2,
window_gaussian = TRUE,
window_triple_sigma = TRUE,
verbose = FALSE,
.prox_only = FALSE
)
x |
snpRdata. Input SNP data. Note that a SNP metadata column named 'position', containing SNP positions in base pairs, is required. |
facets |
character. Categorical metadata variables by which to break up
analysis. See |
subfacets |
character, default NULL. Subsets the facet levels to run. Given as a named list: list(fam = "A") will run only fam A, list(fam = c("A", "B"), chr = 1) will run only fams A and B on chromosome 1. list(fam = "A", pop = "ASP") will run samples in either fam A or pop ASP, list(fam.pop = "A.ASP") will run only samples in fam A and pop ASP. |
ss |
numeric or FALSE, default FALSE. Number of SNPs to subsample. |
par |
numeric or FALSE, default FALSE. If numeric, the number of cores to use for parallel processing. |
CLD |
TRUE, FALSE, or "only", default "only". Specifies if the CLD method should be used either in addition to or instead of default methods. See details. |
use.ME |
logical, default FALSE. Specifies if the Minimization-Expectation haplotype estimation should be used. See details. |
sigma |
numeric, default 0.0001. If the ME method is used, specifies the minimum difference required between steps before haplotype frequencies are accepted. |
window_sigma |
numeric, default NULL. Size of windows in kb within which
to calculate LD values, if requested. Windows will be two times
|
window_step |
numeric or NULL, default two times |
window_gaussian |
logical, default TRUE. If TRUE, windows will be
gaussian-smoothed. Otherwise, raw averages will be returned. See
|
window_triple_sigma |
logical, default TRUE. If TRUE, |
verbose |
Logical, default FALSE. If TRUE, some progress updates will be reported. |
.prox_only |
Logical, default FALSE. Primarily for internal use. If TRUE,
returns ONLY a proximity table of LD values, not a
|
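The difference between separate and compound entries in subfacets can be illustrated with plain base R. This is a minimal sketch that follows the wording of the subfacets description above; the meta data.frame and its values are hypothetical, and the code does not call or reproduce snpR internals.

```r
# Hypothetical sample metadata (not from snpR).
meta <- data.frame(fam = c("A", "A", "B", "B"),
                   pop = c("ASP", "OPL", "ASP", "OPL"))

# list(fam = "A", pop = "ASP"): two separate facets, so a sample
# matching either condition is included.
either <- meta$fam == "A" | meta$pop == "ASP"

# list(fam.pop = "A.ASP"): one compound facet, so a sample must
# match both parts of the level name.
both <- paste(meta$fam, meta$pop, sep = ".") == "A.ASP"

meta[either, ]
meta[both, ]
```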
Calculates pairwise linkage disequilibrium between pairs of SNPs using several different methods. By default, it uses Burrows' Composite Linkage Disequilibrium method.
If CLD is not "only", haplotypes are estimated either via direct count after removing all "0101" double heterozygote haplotypes (if use.ME is FALSE) or via the Minimization-Expectation method described in Excoffier, L., and Slatkin, M. (1995). Note that while the latter method is likely more accurate, it can be very slow and often produces qualitatively equivalent results, so it is not preferred for casual or preliminary analysis. Either method will calculate D', r-squared, and the p-value for that r-squared.
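Once haplotype frequencies are available, D' and r-squared follow from the standard two-locus definitions. The sketch below works through them for a single SNP pair in base R. The haplotype counts are hypothetical, and the chi-squared p-value shown is one common choice that is assumed here for illustration; the package may use a different test or sign convention for D'.

```r
# Hypothetical haplotype counts for two biallelic SNPs (alleles A/a and B/b).
hap_counts <- c(AB = 40, Ab = 10, aB = 10, ab = 40)
hap_freq <- hap_counts / sum(hap_counts)

# Allele frequencies at each locus.
pA <- hap_freq[["AB"]] + hap_freq[["Ab"]]
pB <- hap_freq[["AB"]] + hap_freq[["aB"]]

# Raw disequilibrium coefficient.
D <- hap_freq[["AB"]] - pA * pB

# D': D scaled by the largest magnitude it could take given the
# allele frequencies (Lewontin 1964).
D_max <- if (D >= 0) {
  min(pA * (1 - pB), (1 - pA) * pB)
} else {
  min(pA * pB, (1 - pA) * (1 - pB))
}
D_prime <- D / D_max

# Squared correlation between alleles at the two loci.
r2 <- D^2 / (pA * (1 - pA) * pB * (1 - pB))

# Assumed test: n * r2 compared to a chi-squared distribution with
# 1 degree of freedom, where n is the number of haplotypes.
n <- sum(hap_counts)
p_value <- pchisq(n * r2, df = 1, lower.tail = FALSE)
```

With these counts, D is 0.15, D' is 0.6, and r-squared is 0.36.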
Since this process involves many pairwise comparisons, it can be very slow. As an alternative, average LD values can be calculated within sliding windows using the window_ family of arguments. This will be substantially faster, but individual snp/snp LD values will not be returned. See calc_smoothed_averages for details.
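The idea of a gaussian-smoothed sliding window can be sketched in base R as follows. This is an illustration only, not the calc_smoothed_averages implementation: the positions and values are hypothetical, the window half-width of three sigma is an assumption, and sigma_kb and step_kb are stand-ins for window_sigma and window_step.

```r
# Hypothetical per-SNP values: position in bp and a statistic to average.
pos  <- c(1000, 5000, 12000, 30000, 42000, 55000)
stat <- c(0.10, 0.30, 0.20, 0.80, 0.60, 0.40)

sigma_kb <- 10             # stand-in for window_sigma (kb)
step_kb  <- sigma_kb * 2   # mirrors the default window_step
sigma_bp <- sigma_kb * 1000

# Window centers spaced one step apart along the chromosome.
centers <- seq(0, max(pos), by = step_kb * 1000)

smoothed <- vapply(centers, function(ctr) {
  # Assumed window extent: SNPs within 3 * sigma of the center.
  keep <- abs(pos - ctr) <= 3 * sigma_bp
  if (!any(keep)) return(NA_real_)
  # Gaussian kernel weights based on distance from the window center.
  w <- exp(-(pos[keep] - ctr)^2 / (2 * sigma_bp^2))
  sum(w * stat[keep]) / sum(w)
}, numeric(1))

windows <- data.frame(center = centers, smoothed = smoothed)
```

Replacing the kernel weights with a constant would give the raw window average instead of a smoothed one.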
In contrast, Burrows' Composite Linkage Disequilibrium (CLD) can be calculated very quickly via the cor function from base R. calc_pairwise_ld will perform this method alongside the other methods if CLD = TRUE and by itself if CLD = "only". For most analyses, this will be sufficient and much faster than the other methods. This is the default behavior.
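The reason CLD is fast is that it reduces to a correlation between genotype counts, which cor computes for all SNP pairs at once. The following base R sketch shows the idea on a hypothetical genotype matrix; the 0/1/2 coding, the missing-data handling, and the object names are assumptions for illustration and may differ from what calc_pairwise_ld does internally.

```r
set.seed(1)

# Hypothetical genotype matrix: 50 individuals (rows) by 4 SNPs (columns),
# coded as the count of one allele (0, 1, or 2). NA marks a missing call.
geno <- matrix(sample(0:2, 200, replace = TRUE), nrow = 50,
               dimnames = list(NULL, paste0("snp", 1:4)))
geno[3, 2] <- NA

# Pairwise Pearson correlations between SNP columns, using only the
# individuals genotyped at both SNPs in each pair.
r <- cor(geno, use = "pairwise.complete.obs")

# Squared correlation as the composite LD measure.
cld_r2 <- r^2

# Keep only the upper triangle so that each pair appears once.
cld_r2[lower.tri(cld_r2, diag = TRUE)] <- NA
```

No haplotype phasing is needed for this calculation, which is why it avoids the cost of the haplotype-based methods.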
The data can be broken up categorically by SNP and/or sample metadata, as described in Facets_in_snpR.
Heatmaps of the resulting data can be easily plotted using plot_pairwise_ld_heatmap.
A snpRdata object with linkage results stored in the pairwise.LD slot. Specifically, this slot will contain a list holding any LD matrices in a nested list broken down by facet and then by facet level, and a data.frame containing all pairwise comparisons, their metadata, and calculated statistics in long format for easy plotting.
William Hemstrom
Keming Su
Dimitri Zaykin (2004). Genetic Epidemiology
Excoffier, L., and Slatkin, M. (1995). Molecular Biology and Evolution
Lewontin (1964). Genetics
## Not run:
# not run, slow
## CLD
x <- calc_pairwise_ld(stickSNPs, facets = "chr.pop")
get.snpR.stats(x, "chr.pop", "LD")
## standard haplotype frequency estimation
x <- calc_pairwise_ld(stickSNPs, facets = "chr.pop", CLD = FALSE)
get.snpR.stats(x, "chr.pop", "LD")
## End(Not run)
# subset for specific subfacets (ASP and OPL, chromosome IX)
x <- calc_pairwise_ld(stickSNPs, facets = "chr.pop",
                      subfacets = list(pop = c("ASP", "OPL"),
                                       chr = "groupIX"))
get.snpR.stats(x, "chr.pop", "LD")
## Not run:
## not run, really slow
# ME haplotype estimation
x <- calc_pairwise_ld(stickSNPs, facets = "chr.pop",
                      CLD = FALSE, use.ME = TRUE,
                      subfacets = list(pop = c("ASP", "OPL"),
                                       chr = "groupIX"))
get.snpR.stats(x, "chr.pop", "LD")
## End(Not run)