snpClust | R Documentation |
Adjacency-constrained hierarchical agglomerative clustering of Single Nucleotide Polymorphisms based on Linkage Disequilibrium
snpClust(x, h = ncol(x) - 1, stats = c("R.squared", "D.prime"))
x |
either a genotype matrix of class
|
h |
band width. If not provided, |
stats |
a character vector specifying the linkage disequilibrium
measures to be calculated (using the |
Adjacency-constrained hierarchical agglomerative clustering (HAC) is HAC in which each observation is associated to a position, and the clustering is constrained so as only adjacent clusters are merged. SNPs are clustered based on their similarity as measured by the linkage disequilibrium.
In the special case where genotypes are given as input and the corresponding LD matrix has missing entries, the clustering cannot be performed. This can typically happen when there is insufficient variability in the sample genotypes. In this special case, the indices of the SNP pairs which yield missing values are returned.
If x
is of class
SnpMatrix
or matrix
,
it is assumed to be a n \times p
matrix of p
genotypes for
n
individuals. This input is converted to a LD similarity matrix
using the snpStats::ld
. If x
is of class
dgCMatrix
, it is assumed to be a
(squared) LD matrix.
Clustering on a LD similarity other than "R.squared" or "D.prime" can be
performed by providing the LD values directly as argument x
. These
values are expected to be in [0,1], otherwise they are truncated to [0,1].
An object of class chac
(when no LD value is missing)
Dehman A. (2015) Spatial Clustering of Linkage Disequilibrium Blocks for Genome-Wide Association Studies, PhD thesis, Universite Paris Saclay.
Dehman, A. Ambroise, C. and Neuvial, P. (2015). Performance of a blockwise approach in variable selection using linkage disequilibrium information. *BMC Bioinformatics* 16:148.
Ambroise C., Dehman A., Neuvial P., Rigaill G., and Vialaneix N (2019). Adjacency-constrained hierarchical clustering of a band similarity matrix with application to genomics, Algorithms for Molecular Biology 14(22)"
adjClust
ld
## a very small example
if (requireNamespace("snpStats", quietly = TRUE)) {
data(testdata, package = "snpStats")
# input as snpStats::SnpMatrix
fit1 <- snpClust(Autosomes[1:200, 1:5], h = 3, stats = "R.squared")
# input as base::matrix
fit2 <- snpClust(as.matrix(Autosomes[1:200, 1:5]), h = 3, stats = "R.squared")
# input as Matrix::dgCMatrix
ldres <- snpStats::ld(Autosomes[1:200, 1:5], depth = 3, stats = "R.squared", symmetric = TRUE)
fit3 <- snpClust(ldres, 3)
}
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.