knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

1. What is A-clustering?

A-clustering is a statistical method .... [@sofer2013clustering].

Here is the link for more detailed explanation of the A-clustering method and Aclust R package.

2. Input Parameters for A-clustering

Y \eqn{q} by \eqn{n} outcome data matrix, where \eqn{n} is sample size and \eqn{q} is number of outcomes.

annot A preloaded annotation file that includes columns "IlmnID", "CHR", "Coordinate_37", "Islands_Name", "Relation_to_Island", "UCSC_RefGene_Name".

dist.type A type of similarity distance function. Options are "spearman" (default), "pearson" (correlation measures) or "euclid".

Aclust.method A type of clustering function. Options are "single", "complete" or "average" (default).

dist.thresh A similarity distance threshold. Two neighboring clusters are merged to a single cluster if the similarity distance between them is above dist.thresh. Corresponds to $\bar{D}$ in the A-clustering paper paper and the default is $0.2$

bp.thresh.clust Optional maximum length between neighboring variables permitting to cluster them together. Corresponds to $\bar{d}_{bp}$ in the A-clustering paper paper and the default is $1000$.

bp.merge A distance in chromosomal location. Any set of methylation sites within an interval smaller or equal to bp.dist will be potentially merged, depending on the similarity between sites at the ends of the interval. Corresponds to $\underline{d}_{bp}$ in the A-clustering paper and the default is $999$.

3. Implementation of A-clustering

# Load annotation file
data(annot) # row: CpG sites
# Load sample data
data(sample.data)
DATA.X <- sample.data$DATA.X # row: subjects (n), column: exposures (p)
DATA.Y <- sample.data$DATA.Y # row: subjects (n), column: CpG sites (q)

# Settings for Aclust
dist.type <- "spearman"
Aclust.method <- "average"
dist.thresh <- 0.2
bp.thresh.clust <- 1000
bp.merge <- 999

# Implement A-clustering
all.clusters.list <- Aclust::assign.to.clusters(betas = t(DATA.Y),
                                                annot = annot,
                                                dist.type = dist.type,
                                                method = Aclust.method,
                                                dist.thresh = dist.thresh,
                                                bp.thresh.clust = bp.thresh.clust,
                                                bp.merge = bp.merge)
# We only need clusters with at least two probes
clusters.list <- all.clusters.list[sapply(all.clusters.list,length)!=1]
save(clusters.list,"clusters.list.RData") 
# identical(clusters.list,sample.data$clusters.list) TRUE

References



jennyjyounglee/AclustsCCA documentation built on June 15, 2022, 7:45 p.m.