In jennyjyounglee/AclustsCCA: A Cluster Based Sparse Canonical Correlation Analysis to Test Associations Between A Multi-Pollutant Mixture and High-Dimensional DNA Methylation

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

1. What is A-clustering?

A-clustering is a statistical method .... [@sofer2013clustering].

Here is the link for a more detailed explanation of the A-clustering method and the Aclust R package.

2. Input Parameters for A-clustering

Y A $q$ by $n$ outcome data matrix, where $n$ is the sample size and $q$ is the number of outcomes.

annot A preloaded annotation file that includes columns "IlmnID", "CHR", "Coordinate_37", "Islands_Name", "Relation_to_Island", "UCSC_RefGene_Name".

dist.type A type of similarity distance function. Options are "spearman" (default), "pearson" (correlation measures) or "euclid".

Aclust.method A type of clustering function. Options are "single", "complete" or "average" (default).

dist.thresh A similarity distance threshold. Two neighboring clusters are merged into a single cluster if the distance between them is at most dist.thresh, that is, if they are sufficiently similar. Corresponds to $\bar{D}$ in the A-clustering paper, and the default is $0.2$.

bp.thresh.clust Optional maximum distance, in base pairs, between neighboring variables that still allows them to be clustered together. Corresponds to $\bar{d}_{bp}$ in the A-clustering paper, and the default is $1000$.

bp.merge A distance in chromosomal location. Any set of methylation sites within an interval smaller than or equal to bp.merge will potentially be merged, depending on the similarity between the sites at the ends of the interval. Corresponds to $\underline{d}_{bp}$ in the A-clustering paper, and the default is $999$.
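
To make the roles of dist.thresh and bp.thresh.clust more concrete, here is a small toy sketch in base R. It is not the Aclust implementation. It assumes that the "spearman" distance is one minus the Spearman correlation, and that two neighboring sites are candidates for the same cluster only when that distance is at most dist.thresh and their base-pair gap is at most bp.thresh.clust. The site values, the coordinates in `pos`, and the helper `spearman.dist` are hypothetical and exist only for illustration.

```r
# Toy illustration only; this is not the Aclust implementation.
# Methylation values for six hypothetical subjects at three hypothetical CpG sites.
site1 <- c(0.10, 0.25, 0.40, 0.55, 0.70, 0.85)
site2 <- c(0.12, 0.22, 0.45, 0.50, 0.75, 0.80)  # same subject ordering as site1
site3 <- c(0.80, 0.20, 0.60, 0.10, 0.90, 0.30)  # unrelated subject ordering

# Hypothetical chromosomal coordinates (in base pairs) of the three sites.
pos <- c(site1 = 1000, site2 = 1400, site3 = 5000)

dist.thresh <- 0.2
bp.thresh.clust <- 1000

# Assumed distance: one minus the Spearman correlation (hypothetical helper).
spearman.dist <- function(a, b) 1 - cor(a, b, method = "spearman")

d12 <- spearman.dist(site1, site2)
d23 <- spearman.dist(site2, site3)

# Assumed rule: neighbors can share a cluster only if they are both
# similar enough (distance) and close enough (base pairs).
merge12 <- d12 <= dist.thresh &&
  (pos[["site2"]] - pos[["site1"]]) <= bp.thresh.clust
merge23 <- d23 <= dist.thresh &&
  (pos[["site3"]] - pos[["site2"]]) <= bp.thresh.clust

print(c(d12 = d12, d23 = d23))
print(c(merge12 = merge12, merge23 = merge23))
```

In this toy setting, the first two sites rank the subjects identically and lie 400 base pairs apart, so they pass both checks. The third site fails both the distance check and the base-pair check.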

3. Implementation of A-clustering

# Load annotation file
data(annot) # row: CpG sites
# Load sample data
data(sample.data)
DATA.X <- sample.data$DATA.X # row: subjects (n), column: exposures (p)
DATA.Y <- sample.data$DATA.Y # row: subjects (n), column: CpG sites (q)

# Settings for Aclust
dist.type <- "spearman"
Aclust.method <- "average"
dist.thresh <- 0.2
bp.thresh.clust <- 1000
bp.merge <- 999

# Implement A-clustering
all.clusters.list <- Aclust::assign.to.clusters(betas = t(DATA.Y),
                                                annot = annot,
                                                dist.type = dist.type,
                                                method = Aclust.method,
                                                dist.thresh = dist.thresh,
                                                bp.thresh.clust = bp.thresh.clust,
                                                bp.merge = bp.merge)
# We only need clusters with at least two probes
clusters.list <- all.clusters.list[sapply(all.clusters.list,length)!=1]
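# Save the filtered clusters; save() needs the file name passed via `file =`
save(clusters.list, file = "clusters.list.RData")
# identical(clusters.list, sample.data$clusters.list) should return TRUE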
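
The filtering step above can be tried without the Aclust package by using a small hand-made list. The sketch below assumes that the clustering output is a list with one character vector of probe names per cluster, which is consistent with how `sapply(all.clusters.list, length)` is used above but is not confirmed here. The probe names and the objects `all.clusters.toy`, `clusters.toy`, and `probe.to.cluster` are hypothetical.

```r
# Hypothetical clustering output: one character vector of probe names per cluster.
all.clusters.toy <- list(
  c("cg01", "cg02", "cg03"),
  "cg04",
  c("cg05", "cg06"),
  "cg07"
)

# Number of probes in each cluster.
cluster.sizes <- lengths(all.clusters.toy)

# Keep only clusters with at least two probes, as in the step above.
clusters.toy <- all.clusters.toy[cluster.sizes >= 2]

# Long-format lookup table: which probe belongs to which retained cluster.
probe.to.cluster <- data.frame(
  probe   = unlist(clusters.toy),
  cluster = rep(seq_along(clusters.toy), lengths(clusters.toy)),
  stringsAsFactors = FALSE
)

print(table(cluster.sizes))
print(probe.to.cluster)
```

A lookup table of this kind can be convenient when cluster-level results need to be joined back to probe-level annotation.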

References

jennyjyounglee/AclustsCCA documentation built on June 15, 2022, 7:45 p.m.
