getExpressionProfiles: CAGE data based expression clustering



Performs clustering of CAGE-derived expression across multiple experiments, either at the level of individual TSSs or at the level of entire clusters of TSSs.


getExpressionProfiles(object, what, tpmThreshold = 5, 
                      nrPassThreshold = 1, method = "som", 
                      xDim = 5, yDim = 5)



object

A CAGEset object.


what

The level at which expression clustering should be done. Can be either "CTSS" to cluster individual CTSSs or "consensusClusters" to cluster consensus clusters. See Details.

tpmThreshold, nrPassThreshold

Only CTSSs or consensus clusters (depending on the what parameter) with normalized CAGE signal >= tpmThreshold in >= nrPassThreshold experiments will be included in expression clustering.


method

The method to be used for expression clustering. Can be either "som" to use the self-organizing map (SOM) algorithm (Toronen et al., FEBS Letters 1999) implemented in the som function from the som package, or "kmeans" to use the K-means algorithm implemented in the kmeans function from the stats package.

xDim, yDim

When method = "kmeans", xDim specifies the number of clusters that will be returned by the K-means algorithm and yDim is ignored. When method = "som", xDim specifies the first and yDim the second dimension of the self-organizing map, which results in a total of xDim * yDim clusters returned by SOM.
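The threshold and dimension arguments described above can be illustrated with a small base R sketch. This is not the CAGEr implementation: the matrix tpm and its values are made up for illustration, and only kmeans from the stats package is called. It shows which rows pass the tpmThreshold / nrPassThreshold filter and how many clusters each method would return.

```r
# Hypothetical normalized CAGE signal: rows are features (CTSSs or
# consensus clusters), columns are experiments. Values are invented.
tpm <- matrix(c(10, 0,  7,
                 2, 3,  1,
                 0, 6,  0,
                 8, 9, 12,
                 1, 0,  4,
                 5, 5,  5),
              ncol = 3, byrow = TRUE,
              dimnames = list(paste0("f", 1:6), c("expA", "expB", "expC")))

tpmThreshold <- 5
nrPassThreshold <- 2

# Keep features with signal >= tpmThreshold in >= nrPassThreshold experiments.
keep <- rowSums(tpm >= tpmThreshold) >= nrPassThreshold
filtered <- tpm[keep, , drop = FALSE]

# method = "kmeans": xDim is the number of clusters, yDim is ignored.
xDim <- 2
yDim <- 3
set.seed(1)
km <- kmeans(filtered, centers = xDim)
nClustersKmeans <- nrow(km$centers)

# method = "som": the map has xDim * yDim units, hence that many clusters.
nClustersSom <- xDim * yDim
```

Raising nrPassThreshold makes the filter stricter, because a feature then has to reach tpmThreshold in more experiments before it is clustered.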


Expression clustering can be done at the level of individual CTSSs, in which case the feature vector used as input for the clustering algorithm contains the log-transformed and scaled (divided by standard deviation) normalized CAGE signal at an individual TSS across multiple experiments. Only TSSs with normalized CAGE signal >= tpmThreshold in at least nrPassThreshold CAGE experiments are used for expression clustering.

Alternatively, CTSSs along the genome can be spatially clustered into tag clusters for each experiment separately using the clusterCTSS function, and then aggregated across experiments into consensus clusters using the aggregateTagClusters function. Once the consensus clusters have been created, expression clustering can be performed at the level of these wider genomic regions, which represent entire promoters rather than individual TSSs. In that case the feature vector used as input for the clustering algorithm contains the normalized CAGE signal within the entire consensus cluster across multiple experiments, and the threshold values in tpmThreshold and nrPassThreshold are applied to entire consensus clusters.
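As a rough illustration of the feature vector described above, the following base R sketch log-transforms a made-up signal matrix and divides each row by its standard deviation. The pseudocount of 1 and the per-feature (row-wise) scaling are assumptions made for this sketch; the exact transformation used inside CAGEr may differ.

```r
# Hypothetical normalized CAGE signal: rows are TSSs, columns are experiments.
tpm <- matrix(c(10,  0,   7,
                 8,  9,  12,
                 5, 50, 500),
              ncol = 3, byrow = TRUE,
              dimnames = list(c("tss1", "tss2", "tss3"),
                              c("expA", "expB", "expC")))

# Log-transform; the pseudocount of 1 is an assumption to avoid log(0).
logged <- log(tpm + 1)

# Standard deviation of each feature (row) across experiments.
rowSd <- apply(logged, 1, sd)

# Divide every row by its own standard deviation. The vector is recycled
# down the columns, so element [i, j] is divided by rowSd[i].
features <- logged / rowSd
```

After this step every row has unit standard deviation, so clustering compares the shape of the expression profile rather than its absolute level. A feature with identical signal in all experiments has a standard deviation of zero and would need separate handling.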


If what = "CTSS", the slots CTSSexpressionClusteringMethod and CTSSexpressionClasses of the provided CAGEset object will be filled with the results of expression clustering; if what = "consensusClusters", the slots consensusClustersExpressionClusteringMethod and consensusClustersExpressionClasses will be filled instead. Labels of expression classes (clusters) can be retrieved using the expressionClasses function, and elements belonging to a specific expression class can be selected using the extractExpressionClass function.


Vanja Haberle


Toronen et al. (1999) Analysis of gene expression data using self-organizing maps, FEBS Letters 451:142-146.

See Also



load(system.file("data", "exampleCAGEset.RData", package="CAGEr"))

getExpressionProfiles(object = exampleCAGEset, what = "CTSS",
                      tpmThreshold = 50, nrPassThreshold = 1,
                      method = "som", xDim = 3, yDim = 3)
