within_clust_sort: within_clust_sort

View source: R/functions_clusteringKmeans.R

within_clust_sort    R Documentation

within_clust_sort

Description

Reorders rows within each cluster according to within_order_strategy, without modifying cluster assignments.

Usage

within_clust_sort(
  clust_dt,
  row_ = "id",
  column_ = "x",
  fill_ = "y",
  facet_ = "sample",
  cluster_ = "cluster_id",
  within_order_strategy = c("hclust", "sort", "left", "right")[2],
  clustering_col_min = -Inf,
  clustering_col_max = Inf,
  dcast_fill = NA
)

Arguments

clust_dt

data.table output from ssvSignalClustering

row_

variable name mapped to row, likely id or gene name for ngs data. Default is "id" and works with ssvFetch* output.

column_

variable name mapped to column, likely bp position for ngs data. Default is "x" and works with ssvFetch* output.

fill_

numeric variable to map to fill. Default is "y" and works with ssvFetch* output.

facet_

variable name to facet horizontally by. Default is "sample" and works with ssvFetch* output. Set to "" if data is not faceted.

cluster_

variable name to use for cluster info. Default is "cluster_id".

within_order_strategy

one of "hclust", "sort", "right", "left", "reverse". If "hclust", hierarchical clustering will be used. If "sort", rows are sorted by decreasing rowSums. If "left", will attempt to put high signal on the left ("right" is the opposite). If "reverse", the existing order is reversed (should only be used after a meaningful order has been imposed).

clustering_col_min

numeric minimum for col range considered when clustering. Default is -Inf.

clustering_col_max

numeric maximum for col range considered when clustering. Default is Inf.

dcast_fill

value to supply to dcast fill argument. Default is NA.
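
The "sort" strategy and the clustering_col_min / clustering_col_max arguments described above can be pictured with a small base R sketch. This is not the package's implementation: the toy matrix, the ids, the cluster assignment and the inclusive treatment of the column bounds are all assumptions made for illustration.

```r
# Toy wide matrix: one row per region id, one column per position (x).
# All values, ids and cluster assignments are made up for illustration.
x_pos <- c(-200, -100, 0, 100, 200)
mat <- rbind(
  a = c(9, 1, 1, 1, 0),
  b = c(0, 2, 3, 2, 0),
  c = c(0, 0, 1, 0, 0),
  d = c(1, 2, 2, 2, 1)
)
colnames(mat) <- x_pos

# Hypothetical cluster assignment per row; it is never changed below.
cluster_id <- c(a = 1, b = 1, c = 2, d = 2)

# Sort on all columns: order by cluster, then by decreasing row total.
full_total <- rowSums(mat)
full_order <- names(full_total)[order(cluster_id, -full_total)]

# Sort on a restricted column window (bounds assumed inclusive here).
col_min <- -100
col_max <- 100
keep <- x_pos >= col_min & x_pos <= col_max
window_total <- rowSums(mat[, keep, drop = FALSE])
new_order <- names(window_total)[order(cluster_id, -window_total)]

print(full_order) # "a" "b" "d" "c"
print(new_order)  # "b" "a" "d" "c"
```

In this toy case the strong flank signal in row a dominates the full row total, so restricting the column range swaps a and b within cluster 1, while cluster membership stays the same.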

Details

This is particularly useful when you want to sort within each cluster by a different variable from the one used for cluster assignment. It is also useful if you have imported cluster assignments but want to sort within each cluster using new data, for a prettier heatmap.

TODO refactor shared code with clusteringKmeansNestedHclust

Value

data.table matching input clust_dt save for the reassignment of levels of row_ variable.
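
What "reassignment of levels" means can be shown with a plain factor in base R. This is only a sketch with made-up ids and a made-up new order, not output from within_clust_sort: the values stored in the column stay the same, and only the level order (which drives plotting order) changes.

```r
# Made-up row ids stored as a factor, as a row_ column might be.
ids <- factor(c("a", "b", "c", "a", "b", "c"), levels = c("a", "b", "c"))

# Hypothetical new within-cluster order for the same ids.
new_levels <- c("c", "a", "b")

# Re-level without touching the underlying values.
ids_sorted <- factor(ids, levels = new_levels)

# Same values per element, different level order.
same_values <- identical(as.character(ids), as.character(ids_sorted))
print(same_values)        # TRUE
print(levels(ids_sorted)) # "c" "a" "b"
```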

Examples

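# packages the example relies on: data.table for `:=`, ggplot2 for labs()
library(seqsetvis)
library(data.table)
library(ggplot2)

# clustering by relative value per region does a good job highlighting changes
# however, when then plotting raw values the order within clusters is not smooth
# this is a good situation to apply a separate sort within clusters.
prof_dt = CTCF_in_10a_profiles_dt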
prof_dt = append_ynorm(prof_dt)
prof_dt[, y_relative := y_norm / max(y_norm), list(id)]

clust_dt = ssvSignalClustering(prof_dt, fill_ = "y_relative")
clust_dt.sort = within_clust_sort(clust_dt)

cowplot::plot_grid(
  ssvSignalHeatmap(clust_dt) + labs(title = "clustered by relative, sorted by relative"),
  ssvSignalHeatmap(clust_dt.sort) + labs(title = "clustered by relative, sorted by raw value")
)


jrboyd/seqsetvis documentation built on March 17, 2024, 3:14 p.m.