clusterRanges: Cluster Ranges

clusterRanges    R Documentation

Cluster Ranges

Description

Cluster the ranges in a profileplyr object (for example, one imported from a deepTools matrix) based on the signal within each range.

Usage

clusterRanges(
  object = "profileplyr",
  fun = "function",
  scaleRows = "logical",
  kmeans_k = "integer",
  clustering_callback = "function",
  clustering_distance_rows = "ANY",
  cluster_method = "function",
  cutree_rows = "integer",
  silent = "logical",
  show_rownames = "logical",
  cluster_sample_subset = "ANY"
)

## S4 method for signature 'profileplyr'
clusterRanges(
  object = "profileplyr",
  fun = rowMeans,
  scaleRows = TRUE,
  kmeans_k = NULL,
  clustering_callback = function(x, ...) {
    return(x)
  },
  clustering_distance_rows = "euclidean",
  cluster_method = "complete",
  cutree_rows = NULL,
  silent = TRUE,
  show_rownames = FALSE,
  cluster_sample_subset = NULL
)

Arguments

object

A profileplyr object

fun

The function used to summarize the ranges (e.g. rowMeans or rowMax). This is ignored when only one sample is used for clustering; in that case the lone heatmap is clustered based on the signal across the bins.
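As a rough illustration of the summarizing step, the sketch below reduces two toy signal matrices (ranges in rows, bins in columns) to one value per range and per sample with rowMeans. The matrices and their names are invented for illustration; this reflects one reading of the description above and is not the package's internal code.

```r
# Toy data: 4 ranges (rows) x 3 bins (columns) for two hypothetical samples
sample_a <- matrix(c(1, 2, 3,
                     0, 0, 3,
                     6, 6, 6,
                     2, 4, 6), nrow = 4, byrow = TRUE)
sample_b <- matrix(c(3, 3, 3,
                     1, 1, 1,
                     0, 3, 0,
                     5, 5, 5), nrow = 4, byrow = TRUE)

# Summarize each range within each sample: one column per sample
summary_matrix <- cbind(sample_a = rowMeans(sample_a),
                        sample_b = rowMeans(sample_b))
```

A ranges-by-samples matrix of this kind is what a clustering step could then operate on when more than one sample is used.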

scaleRows

If TRUE, the rows of the matrix used as input for clustering (the signal in each bin) are scaled before clustering, in the way pheatmap scales rows.
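Row scaling of this kind is usually a per-row z-score: subtract the row mean and divide by the row standard deviation. The base R sketch below shows that operation on an invented matrix; it assumes this is what row scaling amounts to here and does not call pheatmap itself.

```r
# Toy matrix: 3 ranges (rows) x 3 bins (columns); values are invented
signal <- matrix(c(1, 2, 3,
                   10, 20, 30,
                   5, 3, 1),
                 nrow = 3, byrow = TRUE)

# scale() works on columns, so transpose, scale, and transpose back
scaled <- t(scale(t(signal)))
```

After scaling, rows that differ only in magnitude (rows 1 and 2 here) become identical, so clustering groups ranges by the shape of their signal rather than its overall level. A row with constant signal has zero standard deviation and would produce NaN values.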

kmeans_k

The number of groups (k) to use for k-means clustering.
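For readers unfamiliar with k-means, the sketch below runs stats::kmeans on an invented matrix with two obvious groups of ranges. The data, the seed, and the nstart value are illustrative assumptions, not settings taken from profileplyr or pheatmap.

```r
set.seed(42)  # k-means starts from random centers, so fix the seed

# Toy matrix: 6 ranges (rows) x 3 bins; rows 1-3 are low, rows 4-6 are high
signal <- rbind(c(0, 1, 0), c(1, 0, 1), c(0, 0, 1),
                c(20, 21, 20), c(21, 20, 21), c(20, 20, 21))

# Partition the ranges into k = 2 groups
fit <- kmeans(signal, centers = 2, nstart = 10)
groups <- fit$cluster  # one group label per range
```

The group labels themselves are arbitrary; only the partition they define is meaningful.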

clustering_callback

Clustering callback function to be passed to pheatmap

clustering_distance_rows

Distance measure used in clustering rows. Possible values are "correlation" for Pearson correlation and all the distances supported by dist, such as "euclidean". If the value is none of the above, it is assumed that a distance matrix is provided.
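The difference between a Euclidean and a correlation-based distance can be seen on a small invented matrix. The sketch below computes both with base R; the 1 - cor(...) form for the "correlation" option is an assumption about how the distance is derived from the Pearson correlation.

```r
# Toy matrix: rows a and b have the same shape, row c is reversed
signal <- matrix(c(1, 2, 3, 4,
                   2, 4, 6, 8,
                   4, 3, 2, 1),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("a", "b", "c"), NULL))

# Euclidean distance between rows
d_euclid <- dist(signal, method = "euclidean")

# Correlation-based distance between rows (assumed form: 1 - Pearson r)
d_cor <- as.dist(1 - cor(t(signal)))
```

Rows a and b are far apart in Euclidean terms but have a correlation distance of zero, while a and c are perfectly anti-correlated.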

cluster_method

Clustering method used. Accepts the same values as hclust.

cutree_rows

The number of clusters into which the hierarchical clustering of the rows is cut.
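The sketch below shows, in base R, what hierarchical clustering followed by a cut into a fixed number of clusters looks like. The matrix is invented, and the distance and linkage mirror the defaults listed in Usage ("euclidean" and "complete"); it illustrates the general technique rather than the package's internal code.

```r
# Toy matrix: 5 ranges (rows) x 3 bins, forming three well-separated groups
signal <- rbind(c(0, 0, 1), c(0, 1, 0),
                c(10, 10, 11), c(10, 11, 10),
                c(50, 50, 50))

# Build the tree, then cut it into 3 clusters
tree <- hclust(dist(signal, method = "euclidean"), method = "complete")
clusters <- cutree(tree, k = 3)  # one cluster label per range
```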

silent

Whether to suppress the heatmap (from pheatmap) that can be drawn alongside the output. This does not change what the function returns, which is always a profileplyr object. If silent = FALSE, the heatmap is shown, which may help in quickly evaluating different numbers of clusters before proceeding with downstream analysis. The default is silent = TRUE, meaning no heatmap is shown.

show_rownames

For any heatmaps printed while running this function, set to TRUE if row names should be displayed. Default is FALSE.

cluster_sample_subset

Either a character or numeric vector indicating the subset of heatmaps to be used for clustering. If a character vector, all elements must match names of the samples of the profileplyr object (found with 'rownames(sampleData(object))'). For a numeric vector, the profileplyr object will be subset to the samples that correspond to these numbers (i.e. the numeric index of each sample within 'rownames(sampleData(object))'). When only one sample is chosen, the lone heatmap selected will be clustered by signal across the bins of that sample. When more than one sample is selected, the 'fun' argument will be used to summarize the ranges and cluster across the selected samples.
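A hedged usage sketch of both selection styles follows. It uses only functions named on this page and the example file shipped with the package, and it assumes that file contains at least two samples. The requireNamespace guard is there only so the snippet does nothing when profileplyr is not installed.

```r
if (requireNamespace("profileplyr", quietly = TRUE)) {
  library(profileplyr)

  example <- system.file("extdata", "example_deepTools_MAT",
                         package = "profileplyr")
  object <- import_deepToolsMat(example)

  # Sample names that a character subset must match
  sample_names <- rownames(sampleData(object))

  # Character subset: a single sample, clustered by signal across its bins
  by_name <- clusterRanges(object, kmeans_k = 3,
                           cluster_sample_subset = sample_names[1])

  # Numeric subset: first two samples (assumed to exist), summarized with fun
  by_index <- clusterRanges(object, fun = rowMeans, kmeans_k = 3,
                            cluster_sample_subset = 1:2)
}
```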

Details

tbd

Value

A profileplyr object

Methods (by class)

  • clusterRanges(profileplyr): Cluster Ranges

Examples

example <- system.file("extdata", "example_deepTools_MAT", package = "profileplyr") 
object <- import_deepToolsMat(example) 

# k-means clustering
clusterRanges(object, fun = rowMeans, kmeans_k = 3)

# hierarchical clustering, print heatmap, yet still return profileplyr object
clusterRanges(object, fun = rowMeans, cutree_rows = 3, silent = FALSE)
 

RockefellerUniversity/profileplyr documentation built on Jan. 28, 2023, 10:09 a.m.