within_clust_sort: within_clust_sort
In jrboyd/seqsetvis: Set Based Visualizations for Next-Gen Sequencing Data

View source: R/functions_clusteringKmeans.R

within_clust_sort

Description

Reorders the rows within each cluster according to within_order_strategy, without modifying cluster assignments.

Usage

within_clust_sort(
  clust_dt,
  row_ = "id",
  column_ = "x",
  fill_ = "y",
  facet_ = "sample",
  cluster_ = "cluster_id",
  within_order_strategy = c("hclust", "sort", "left", "right", "none", "reverse")[2],
  clustering_col_min = -Inf,
  clustering_col_max = Inf,
  dcast_fill = NA
)

Arguments

`clust_dt`	data.table output from `ssvSignalClustering`
`row_`	variable name mapped to row, likely id or gene name for ngs data. Default is "id" and works with ssvFetch* output.
`column_`	variable mapped to column, likely bp position for ngs data. Default is "x" and works with ssvFetch* output.
`fill_`	numeric variable to map to fill. Default is "y" and works with ssvFetch* output.
`facet_`	variable name to facet horizontally by. Default is "sample" and works with ssvFetch* output. Set to "" if data is not facetted.
`cluster_`	variable name to use for cluster info. Default is "cluster_id".
`within_order_strategy`	one of "hclust", "sort", "left", "right", "none", "reverse". Default is "sort". If "hclust", hierarchical clustering will be used. If "sort", a simple decreasing sort of rowSums. If "left", will attempt to put high signal on the left ("right" is the opposite). If "reverse", reverses the existing order (should only be used after a meaningful order has been imposed).
`clustering_col_min`	numeric minimum for col range considered when clustering, default is -Inf
`clustering_col_max`	numeric maximum for col range considered when clustering, default is Inf
`dcast_fill`	value to supply to dcast fill argument. default is NA.
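
To make the "sort" strategy and the column range arguments concrete, the following base R sketch orders rows within each cluster by decreasing rowSums, optionally restricted to a window of column positions. It is an illustration only, not the seqsetvis implementation: the helper `sort_within`, the toy matrix, and the assumption that the column bounds are inclusive are all hypothetical.

```r
# Toy wide matrix: one row per region id, one column per x position (bp).
# All values and cluster labels are made up for illustration.
x_pos <- c(-100, 0, 100)
mat <- rbind(
  a = c(9, 2, 1),
  b = c(0, 6, 5),
  c = c(0, 1, 0),
  d = c(3, 3, 3)
)
colnames(mat) <- x_pos
cluster_id <- c(a = 1, b = 1, c = 2, d = 2)

# Hypothetical helper (not part of seqsetvis): decreasing sort of rowSums
# within each cluster, using only columns inside [col_min, col_max].
# Inclusive bounds are an assumption of this sketch.
sort_within <- function(mat, cluster_id, x_pos, col_min = -Inf, col_max = Inf) {
  keep <- x_pos >= col_min & x_pos <= col_max
  totals <- rowSums(mat[, keep, drop = FALSE])
  ids_by_cluster <- split(rownames(mat), cluster_id[rownames(mat)])
  unlist(lapply(ids_by_cluster, function(ids) {
    ids[order(totals[ids], decreasing = TRUE)]
  }), use.names = FALSE)
}

# Using every column, "a" has the most total signal in cluster 1.
all_cols <- sort_within(mat, cluster_id, x_pos)

# Restricting to x >= 0 ignores the strong left-hand signal of "a".
right_only <- sort_within(mat, cluster_id, x_pos, col_min = 0)
```

In this toy case the order within cluster 1 flips from a, b to b, a once the leftmost column is excluded, while cluster membership is the same in both results.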

Details

This is particularly useful when you want to sort within each cluster by a different variable from the one used for cluster assignment. It also helps if you've imported cluster assignments but want to re-sort within each cluster using new data for a prettier heatmap.
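
The imported-assignment case can be sketched in base R as follows. This is a minimal illustration under assumed toy data, not how seqsetvis does it internally: the data frames `assignments` and `new_signal` are hypothetical, and a decreasing sort of summed signal stands in for whichever strategy is chosen.

```r
# Cluster assignments imported from an earlier analysis (made-up values).
assignments <- data.frame(
  id = c("r1", "r2", "r3", "r4"),
  cluster_id = c(1, 1, 2, 2),
  stringsAsFactors = FALSE
)

# New long-format signal for the same regions: one row per id and x position.
new_signal <- data.frame(
  id = rep(c("r1", "r2", "r3", "r4"), each = 2),
  x = rep(c(-50, 50), times = 4),
  y = c(1, 2, 4, 4, 0, 1, 3, 2),
  stringsAsFactors = FALSE
)

# Total new signal per region, as a named numeric vector.
totals <- vapply(split(new_signal$y, new_signal$id), sum, numeric(1))

# Order by cluster first, then by decreasing new signal within each cluster.
# The imported cluster labels are only read, never recomputed.
ord <- order(assignments$cluster_id, -totals[assignments$id])
new_order <- assignments$id[ord]
new_order
```

Only the order of ids within each cluster changes; every id keeps the cluster it was imported with.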

TODO refactor shared code with clusteringKmeansNestedHclust

Value

A data.table matching the input clust_dt, except that the levels of the row_ variable are reassigned to reflect the new within-cluster order.
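
What reassigning levels means can be shown with a small base R sketch. It uses a plain data.frame with made-up values rather than real `ssvSignalClustering` output, and the new level order is simply assumed to have come from some within-cluster sort.

```r
# Long-format toy data: id is a factor, and ggplot2-style plots draw
# discrete axes in level order. Values are made up for illustration.
dt <- data.frame(
  id = factor(c("a", "a", "b", "b", "c", "c"), levels = c("a", "b", "c")),
  x = rep(c(1, 2), times = 3),
  y = c(1, 1, 5, 5, 3, 3)
)

# A new order, e.g. from a decreasing sort of summed signal (assumed here).
new_levels <- c("b", "c", "a")

# Only the levels are reassigned; rows and their values stay in place.
relabeled <- dt
relabeled$id <- factor(as.character(dt$id), levels = new_levels)

levels(relabeled$id)
identical(as.character(relabeled$id), as.character(dt$id))
```

The table contents are untouched, so any downstream code that looks up values by id behaves the same; only the plotting order implied by the factor levels differs.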

Examples

library(seqsetvis)
library(data.table)
library(ggplot2)

data(CTCF_in_10a_profiles_dt)
# Clustering by relative value per region does a good job of highlighting changes.
# When raw values are then plotted, the order within clusters is not smooth.
# This is a good situation to apply a separate sort within clusters.
prof_dt = CTCF_in_10a_profiles_dt
prof_dt = append_ynorm(prof_dt)
prof_dt[, y_relative := y_norm / max(y_norm), list(id)]

clust_dt = ssvSignalClustering(prof_dt, fill_ = "y_relative")
clust_dt.sort = within_clust_sort(clust_dt)

cowplot::plot_grid(
  ssvSignalHeatmap(clust_dt) +
    labs(title = "clustered by relative, sorted by relative"),
  ssvSignalHeatmap(clust_dt.sort) +
    labs(title = "clustered by relative, sorted by raw value")
)

jrboyd/seqsetvis documentation built on Jan. 16, 2025, 10:25 a.m.