LDnaRaw | R Documentation |
Takes a lower diagonal matrix of pairwise LD values and produces data files for subsequent LD network analyses. This version works in conjunction with extractBranches.
LDnaRaw(
LDmat,
digits = 2,
method = "single",
mc.cores = NULL,
length_out = NULL,
fun = function(x) {
min(x, na.rm = TRUE)
}
)
LDmat |
Lower diagonal matrix of pairwise LD values, |
digits |
Number of digits for rounding r2 values. |
method |
Specifies clustering method (see |
mc.cores |
Specifies number of cores for |
length_out |
Specifies the maximum number of possible node depths. Overrides |
fun |
Only used internally |
In LDna, clusters represent loci (vertices) connected by LD values (edges) above given thresholds. From a lower diagonal matrix of LD values, a clustering algorithm (see hclust) is used to produce a tree that describes cluster merging with decreasing LD threshold. In these trees, nodes represent clusters/loci and node distance indicates the LD threshold values at which these merging events occur. This version (in contrast to LDnaRaw) does not estimate lambda, as it is intended to be used with extractBranches, where |E|min (minimum number of edges) is the only parameter that defines LD-clusters (essentially determining how "thick" branches are).
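As a rough illustration of the tree described above, the following sketch uses base R only. It assumes that LD is converted to a distance as 1 - LD before calling hclust; that conversion, the toy values and the helper name ld_tree are assumptions made for illustration and are not necessarily what LDnaRaw does internally.

```r
# Toy lower diagonal LD matrix for four loci (values are made up)
loci <- c("L1", "L2", "L3", "L4")
LDtoy <- matrix(NA_real_, 4, 4, dimnames = list(loci, loci))
LDtoy[lower.tri(LDtoy)] <- c(0.9, 0.3, 0.2, 0.4, 0.1, 0.8)

# Hypothetical helper: clustering tree on 1 - LD (assumed conversion)
ld_tree <- function(LDmat, method = "single") {
  hclust(as.dist(1 - LDmat), method = method)
}

tree <- ld_tree(LDtoy)

# Merge heights are distances; convert them back to LD thresholds
merge_ld <- 1 - tree$height
merge_ld
```

In this toy case L1 and L2 merge at an LD threshold of 0.9, L3 and L4 at 0.8, and the two pairs join at 0.4, the strongest link between them.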
The clustering method (method) can be specified. For 'strict' clusters (e.g. for outlier or inversion detection), use 'single'; for chromosome binning, use 'average', 'complete' or 'median'.
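To give a feel for how the choice of method changes the tree, the sketch below compares the LD threshold at which the last merger happens under three linkage methods, using base R hclust on a made-up matrix. The 1 - LD distance conversion and the toy values are assumptions for illustration only.

```r
# Toy lower diagonal LD matrix for four loci (values are made up)
loci <- c("L1", "L2", "L3", "L4")
LDtoy <- matrix(NA_real_, 4, 4, dimnames = list(loci, loci))
LDtoy[lower.tri(LDtoy)] <- c(0.9, 0.3, 0.2, 0.4, 0.1, 0.8)

# Assumed conversion from LD to distance
d <- as.dist(1 - LDtoy)

# LD threshold of the final merger under each linkage method
methods <- c("single", "average", "complete")
last_merge <- sapply(methods, function(m) {
  1 - max(hclust(d, method = m)$height)
})
last_merge
```

With single linkage the two pairs of loci join as soon as any one link between them is reached (LD 0.4 here), whereas average and complete linkage delay the merger until the mean (0.25) or the weakest (0.1) link between them is reached.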
R2 values can be rounded (digits) to limit the number of clustering events in large data sets, but too much rounding leads to problems when recursively collapsing (too large) polytomies.
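The effect of rounding on the number of possible clustering events can be sketched with simulated values. The random r2 vector below is purely illustrative and does not come from the package data; the point is only that fewer digits leave fewer distinct thresholds, and hence more ties.

```r
# Simulated r2 values (illustrative only)
set.seed(1)
r2 <- runif(1000)

# Number of distinct values remaining after rounding to 1, 2 and 3 digits
digits <- c(1, 2, 3)
n_unique <- sapply(digits, function(d) length(unique(round(r2, d))))
names(n_unique) <- paste0("digits=", digits)
n_unique
```

Rounding to one digit leaves at most 11 distinct values between 0 and 1, so many loci would merge at the same threshold, which is the situation that produces large polytomies.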
A list with three objects: clusterfile, stats and a tree-file tree. Clusterfile is a matrix with all unique clusters as columns and loci as rows, where TRUE indicates the presence of a locus in a specific cluster (else FALSE is given). Stats is a data frame that gives all edges for the single linkage clustering tree along with the following information for each edge: cluster (focal cluster), parent cluster (cluster after merger), and the corresponding nV (number of loci), nE (number of edges connecting the loci), min_LD (minimum LD among any two loci in the cluster), merge_at_below (the weakest link in the cluster before merger at lower thresholds), merge_at_above (the weakest link in the cluster just before splitting into two or more clusters, at higher thresholds). See example below.
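The layout of clusterfile described above can be mimicked with a small hand-built logical matrix. The object clusterfile_toy and the cluster names clA and clB are hypothetical and were not produced by LDnaRaw; the sketch only shows how such a matrix can be queried.

```r
# Hand-built stand-in for a clusterfile: loci as rows, clusters as columns
loci <- c("L1", "L2", "L3", "L4")
clusterfile_toy <- matrix(
  c(TRUE, TRUE, FALSE, FALSE,   # hypothetical cluster clA
    TRUE, TRUE, TRUE,  FALSE),  # hypothetical cluster clB
  nrow = 4,
  dimnames = list(loci, c("clA", "clB"))
)

# Number of loci per cluster (corresponds to nV in the description above)
nV <- colSums(clusterfile_toy)

# Names of the loci belonging to one cluster
members <- rownames(clusterfile_toy)[clusterfile_toy[, "clB"]]
members
```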
Petri Kemppainen petrikemppainen2@gmail.com, Christopher Knight Chris.Knight@manchester.ac.uk
extractBranches
# Simple lower diagonal LD matrix
LDmat <- structure(c(NA, 0.84, 0.64, 0.24, 0.2, 0.16, 0.44, 0.44, NA, NA, 0.8, 0.28, 0.4, 0.36, 0.36, 0.24, NA, NA, NA, 0.48, 0.32, 0.2, 0.36, 0.2, NA, NA, NA, NA, 0.76, 0.56, 0.6, 0.2, NA, NA, NA, NA, NA, 0.72, 0.68, 0.24, NA, NA, NA, NA, NA, NA, 0.44, 0.24, NA, NA, NA, NA, NA, NA, NA, 0.2, NA, NA, NA, NA, NA, NA, NA, NA), .Dim = c(8L, 8L), .Dimnames = list(c("L1", "L2", "L3", "L4", "L5", "L6", "L7", "L8"), c("L1", "L2", "L3", "L4", "L5", "L6", "L7", "L8")))
# Produce raw data
ldna <- LDnaRaw(LDmat)
ldna$clusterfile
ldna$stats
# Illustrating the use of merge_at_below and merge_at_above
clusters <- extractBranches(ldna, min.edges=0, merge.min=0.8)
clusters
cl_info <- ldna$stats[ldna$stats$cluster == names(clusters)[2], ] # focus on cluster 5_0.68
cl_info
abline(v=cl_info$merge_at_below, col="blue")
abline(v=cl_info$merge_at_above, col="green")
# using multiple cores for a larger data set
data(LDna)
ldna <- LDnaRaw(r2.baimaii_subs, mc.cores=4)
clusters <- extractBranches(ldna, min.edges=20, merge.min=0.8) ## higher threshold for cluster size
# proceed to explore networks using function "plotLDnetwork"