clustEnrichment: Cluster enrichment test
In ClueR: Cluster Evaluation

clustEnrichment

R Documentation

Cluster enrichment test

Description

Takes a clustering object generated by the cmeans or kmeans algorithm, determines the enrichment of each cluster, and then determines the overall enrichment of the clustering based on the supplied annotation.

Usage

clustEnrichment(
  clustObj,
  annotation,
  effectiveSize,
  pvalueCutoff = 0.05,
  universe = NULL
)

Arguments

`clustObj`	the clustering object generated by cmeans or kmeans.
`annotation`	a list whose names correspond to kinases and whose elements are the substrates belonging to each kinase.
`effectiveSize`	the size range of kinase-substrate groups to be considered when calculating enrichment, e.g. c(5, 100). Groups that are too small or too large are excluded from the calculation of the overall enrichment of the clustering.
`pvalueCutoff`	a p-value cutoff for determining which kinase-substrate groups are included when calculating the overall enrichment of the clustering.
`universe`	the universe of genes/proteins/phosphosites etc. that the enrichment is calculated against.
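
As a rough illustration of how the `annotation`, `effectiveSize`, and `universe` arguments relate to one another, the sketch below builds a small annotation list and filters it in base R. The helper `filterAnnotation` and the toy site IDs are hypothetical and not part of ClueR. Whether the package treats the size bounds as inclusive, or applies them before or after restricting to the universe, is an assumption made here.

```r
# Hypothetical helper (not part of ClueR): restrict each kinase-substrate group
# to the universe, then keep groups whose size lies within effectiveSize.
# Inclusive bounds are an assumption.
filterAnnotation <- function(annotation, effectiveSize, universe = NULL) {
  if (!is.null(universe)) {
    annotation <- lapply(annotation, intersect, universe)
  }
  sizes <- lengths(annotation)
  annotation[sizes >= effectiveSize[1] & sizes <= effectiveSize[2]]
}

# Toy annotation: names are kinases, elements are substrate IDs.
toyAnno <- list(
  KS_A = paste("p", 1:3, sep = "_"),    # below the lower bound
  KS_B = paste("p", 1:20, sep = "_"),   # within the bounds
  KS_C = paste("p", 1:200, sep = "_")   # above the upper bound
)

kept <- filterAnnotation(toyAnno, effectiveSize = c(5, 100))
names(kept)

# Restricting to a universe of 50 sites shrinks KS_C to 50 members,
# which brings it back inside the size bounds.
keptInUniverse <- filterAnnotation(
  toyAnno,
  effectiveSize = c(5, 100),
  universe = paste("p", 1:50, sep = "_")
)
names(keptInUniverse)
```

In this toy case only `KS_B` survives without a universe, while `KS_B` and `KS_C` both survive once group membership is restricted to the 50-site universe.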

Value

a list that contains both the p-value indicating the overall enrichment and a sublist that details the enrichment of each individual cluster.
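
To make the two levels of the returned value more concrete, the following base R sketch tests one kinase-substrate group against each cluster and then combines the per-cluster p-values into a single overall p-value. It assumes a one-sided Fisher's exact test per cluster and Fisher's combined probability method for the overall value. The text above does not state which tests `clustEnrichment` uses, so this is an illustration under those assumptions rather than the package's implementation. All names and data are made up.

```r
# Toy data: a universe of 300 sites hard-assigned to 3 clusters of 100 each.
universe <- paste("p", 1:300, sep = "_")
cluster <- rep(1:3, each = 100)
names(cluster) <- universe

# One hypothetical kinase-substrate group, concentrated in cluster 1.
substrates <- paste("p", c(1:30, 150, 250), sep = "_")

# One-sided Fisher's exact test of the group against a single cluster
# (assumed test; not confirmed to be what ClueR uses).
enrichOne <- function(k) {
  inCluster <- factor(cluster == k, levels = c(TRUE, FALSE))
  inGroup <- factor(universe %in% substrates, levels = c(TRUE, FALSE))
  fisher.test(table(inCluster, inGroup), alternative = "greater")$p.value
}
clusterP <- sapply(1:3, enrichOne)

# Fisher's combined probability: -2 * sum(log(p)) follows a chi-squared
# distribution with 2k degrees of freedom under the null hypothesis.
overallP <- pchisq(-2 * sum(log(clusterP)),
                   df = 2 * length(clusterP),
                   lower.tail = FALSE)
```

Here `clusterP` plays the role of the per-cluster detail and `overallP` the role of the single overall p-value described above.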

Examples

# simulate time-series data with six distinct profile groups, each containing
# 500 phosphorylation sites.
simuData <- temporalSimu(seed=1, groupSize=500, sdd=1, numGroups=6)

# create an artificial annotation database of 100 kinase-substrate groups, each
# comprising 50 substrates assigned to a kinase.
# among them, create 5 groups whose phosphorylation sites are defined to share
# the same temporal profile.
kinaseAnno <- list()
groupSize <- 500
for (i in 1:5) {
 kinaseAnno[[i]] <- paste("p", (groupSize*(i-1)+1):(groupSize*(i-1)+50), sep="_")
}
   
for (i in 6:100) {
 set.seed(i)
 kinaseAnno[[i]] <- paste("p", sample.int(nrow(simuData), size = 50), sep="_")
}
names(kinaseAnno) <- paste("KS", 1:100, sep="_")

# test enrichment of the clustering results by partitioning the data into six
# clusters using the cmeans algorithm.
clustObj <- e1071::cmeans(simuData, centers=6, iter.max=50, m=1.25)
clustEnrichment(clustObj, annotation=kinaseAnno, effectiveSize=c(5, 100), pvalueCutoff=0.05)

ClueR documentation built on Nov. 16, 2023, 5:08 p.m.