getExpressionProfiles: CAGE data based expression clustering

getExpressionProfiles R Documentation

Description

Clusters CAGE expression across multiple experiments, either at the level of individual TSSs or at the level of entire clusters of TSSs.

Usage

getExpressionProfiles(
  object,
  what = c("CTSS", "consensusClusters"),
  tpmThreshold = 5,
  nrPassThreshold = 1,
  method = c("som", "kmeans"),
  xDim = 5,
  yDim = 5
)

## S4 method for signature 'CAGEexp'
getExpressionProfiles(
  object,
  what = c("CTSS", "consensusClusters"),
  tpmThreshold = 5,
  nrPassThreshold = 1,
  method = c("som", "kmeans"),
  xDim = 5,
  yDim = 5
)

## S4 method for signature 'matrix'
getExpressionProfiles(
  object,
  what = c("CTSS", "consensusClusters"),
  tpmThreshold = 5,
  nrPassThreshold = 1,
  method = c("som", "kmeans"),
  xDim = 5,
  yDim = 5
)

Arguments

object

A CAGEexp object. A method for matrix objects is also defined (see Usage).

what

Level at which expression clustering is done: "CTSS" or "consensusClusters".

tpmThreshold, nrPassThreshold

Only features (CTSSs or consensus clusters) with normalized CAGE signal >= tpmThreshold in at least nrPassThreshold experiments are used for expression clustering; all others are ignored.

method

Method to be used for expression clustering. "som" uses the self-organizing map (SOM) algorithm of Toronen et al., FEBS Letters (1999), implemented in the som::som function from the som package. "kmeans" uses the K-means algorithm implemented in the stats::kmeans function.

xDim, yDim

With method = "kmeans", xDim specifies the number of clusters returned by the K-means algorithm and yDim is ignored. With method = "som", xDim specifies the first and yDim the second dimension of the self-organizing map, which results in a total of xDim × yDim clusters returned by SOM.
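The threshold and dimension arguments can be illustrated with a minimal base R sketch. It is not the CAGEr implementation: the toy matrix tpm, its values, and the helper nrClasses are hypothetical and only mirror the rules stated in the argument descriptions above.

```r
# Toy matrix of normalized CAGE signal (TPM): rows are features
# (CTSSs or consensus clusters), columns are experiments.
# All values are made up for illustration.
tpm <- matrix(
  c(12, 0, 3,
     1, 2, 4,
     6, 7, 0,
     0, 0, 9),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("feature", 1:4), paste0("sample", 1:3))
)

tpmThreshold    <- 5
nrPassThreshold <- 1

# Count, for each feature, the experiments in which it reaches tpmThreshold.
nrPass <- rowSums(tpm >= tpmThreshold)

# Keep a feature when at least nrPassThreshold experiments pass.
keep     <- nrPass >= nrPassThreshold
filtered <- tpm[keep, , drop = FALSE]

# Hypothetical helper: number of expression classes implied by xDim and yDim.
nrClasses <- function(method, xDim, yDim) {
  if (method == "kmeans") xDim else xDim * yDim
}
```

With the default xDim = 5 and yDim = 5, this gives 5 classes for "kmeans" and 25 for "som". Raising nrPassThreshold to 2 would keep only the features that pass the signal threshold in at least two experiments.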

Details

Expression clustering can be done at the level of individual CTSSs, in which case the feature vector used as input for the clustering algorithm contains the log-transformed and scaled (divided by standard deviation) normalized CAGE signal at an individual TSS across multiple experiments. Only TSSs with normalized CAGE signal >= tpmThreshold in at least nrPassThreshold CAGE experiments are used for expression clustering.

Alternatively, CTSSs along the genome can be spatially clustered into tag clusters for each experiment separately using a CTSS clustering function, and then aggregated across experiments into consensus clusters using the aggregateTagClusters function. Once the consensus clusters have been created, expression clustering can be performed at the level of these wider genomic regions, which represent entire promoters rather than individual TSSs. In that case the feature vector used as input for the clustering algorithm contains the normalized CAGE signal within the entire consensus cluster across multiple experiments, and the thresholds in tpmThreshold and nrPassThreshold are applied to entire consensus clusters.
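As a rough illustration of the CTSS-level feature vectors, the base R sketch below log-transforms a made-up signal matrix, divides each profile by its standard deviation, and clusters the result with stats::kmeans. The pseudocount of 1 and the per-feature scaling are assumptions made for this example; the exact transformation used inside CAGEr may differ.

```r
set.seed(1)  # K-means starts from randomly chosen centers

# Made-up normalized CAGE signal: 6 features (rows) across 4 experiments (columns).
tpm <- matrix(
  c(10, 20, 40, 80,
    12, 25, 38, 90,
    80, 40, 20, 10,
    95, 35, 22, 12,
    30, 60, 30, 60,
    28, 55, 33, 66),
  nrow = 6, byrow = TRUE
)

# Log-transform; the pseudocount of 1 is an assumption that keeps zeros finite.
logTpm <- log(tpm + 1)

# Scale each feature by the standard deviation of its profile across experiments.
# Dividing a matrix by a vector of length nrow() recycles down the columns,
# so row i is divided by the i-th standard deviation.
featureVectors <- logTpm / apply(logTpm, 1, sd)

# Cluster the profiles; centers plays the role of xDim with method = "kmeans".
fit <- kmeans(featureVectors, centers = 3)
```

Here fit$cluster holds one class label per feature. Scaling by the standard deviation puts strongly and weakly expressed features on a comparable footing, so the clustering is driven more by the shape of each profile than by its absolute signal.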

Value

Returns a modified CAGEexp object. If what = "CTSS", the object's metadata elements CTSSexpressionClusteringMethod and CTSSexpressionClasses are set; if what = "consensusClusters", the elements consensusClustersExpressionClusteringMethod and consensusClustersExpressionClasses are set instead. Labels of the expression classes (clusters) can be retrieved using the expressionClasses function.

Author(s)

Vanja Haberle

Charles Plessy

References

Toronen et al. (1999) Analysis of gene expression data using self-organizing maps, FEBS Letters 451:142-146.

See Also

Other CAGEr expression clustering functions: expressionClasses(), plotExpressionProfiles()

Examples

getExpressionProfiles( exampleCAGEexp, "CTSS"
                     , tpmThreshold = 50, nrPassThreshold = 1
                     , method = "som", xDim = 3, yDim = 3)
                     
getExpressionProfiles( exampleCAGEexp, "CTSS"
                     , tpmThreshold = 50, nrPassThreshold = 1
                     , method = "kmeans", xDim = 3)

getExpressionProfiles(exampleCAGEexp, "consensusClusters")


charles-plessy/CAGEr documentation built on Oct. 27, 2024, 10:11 p.m.