renameAndOrderClusters: Rename clusters using genes and metadata
In AllenInstitute/scrattch.hicat: Hierarchical Iterative Clustering Analysis for Transcriptomic data

renameAndOrderClusters

R Documentation

Rename clusters using genes and metadata

Description

This function uses information from both the data (gene expression) and meta-data (annotation) objects to build a useful automated cluster name.

Usage

renameAndOrderClusters(
  sampleInfo,
  classNameColumn = "cluster_type_label",
  classGenes = c("GAD1", "SLC17A7", "SLC1A3"),
  classLevels = c("inh", "exc", "glia"),
  layerNameColumn = "layer_label",
  regionNameColumn = "Region_label",
  matchNameColumn = "cellmap_label",
  newColorNameColumn = "cellmap_color",
  otherColumns = NULL,
  propLayer = 0.3,
  dend = NULL,
  orderbyColumns = c("layer", "region", "topMatch"),
  includeClusterCounts = FALSE,
  includeBroadGenes = FALSE,
  broadGenes = NULL,
  includeSpecificGenes = FALSE,
  propExpr = NULL,
  medianExpr = NULL,
  propDiff = 0,
  propMin = 0.5,
  medianFC = 1,
  excludeGenes = NULL,
  sortByMedian = TRUE,
  sep = "_"
)

Arguments

`sampleInfo`	Sample information with rows as samples and columns for annotations. All samples in sampleInfo are used for renaming (so subset prior to running this function if desired). Columns must include "cluster_id", "cluster_label", and "cluster_color".
`classNameColumn`	Column name where class information is stored (e.g., inh/exc/glia), or NULL if you'd like it to be defined based on `classGenes`
`classGenes`	Set of genes for defining classes (ignored if `classNameColumn` is not NULL). Also used if `broadClass` gene is not expressed.
`classLevels`	A vector of the levels for classes of the same length (and in the same order) as classGenes. Either include all relevant levels or set to NA for none if using `classNameColumn`
`layerNameColumn`	Column name where the (numeric) layer info is stored (NA if none)
`regionNameColumn`	Column name where the (character) region info is stored (NA if none)
`matchNameColumn`	Column name where the (character) comparison info is stored (e.g., closest mapping cell type for each cell pre-calculated against a previous taxonomy; NA if none)
`newColorNameColumn`	Column name where the new cluster colors are found (e.g., color column corresponding to `matchNameColumn`). NA keeps the current colors.
`otherColumns`	Other columns to transfer to the output variable. Note that the value from a random sample in the cluster is returned, so this usually should be left as default (NULL).
`propLayer`	Proportion of cells (relative to max) must be higher than this for a cluster to be considered as expressed in a particular layer (default is 0.3).
`dend`	Dendrogram object, only used for ordering of clusters (NULL as default)
`orderbyColumns`	Column names indicating the output cluster order (only used if dend=NULL). Must be some combination of "layer", "region", and "topMatch" in any order (or NULL). Default is first by "layer", then "region", then "topMatch".
`includeClusterCounts`	Should the number of cells in each cluster be included in name?
`includeBroadGenes`	Should broad genes be included in the name (if so, `broadGenes` must be provided)?
`broadGenes`	List of broad genes, where the top median CPM in cluster is included in name
`includeSpecificGenes`	Should specific genes be included in the name? If TRUE, the next seven parameters are used to call `getTopMarkersByPropNew`.
`propExpr`	matrix of proportions of cells expressing a gene in each cluster (genes=rows, clusters=columns)
`medianExpr`	matrix of median expression per cluster (genes=rows, clusters=columns)
`propDiff`	Must have difference in proportion higher than this value in "on" cluster compared with each other cluster
`propMin`	Must have a proportion higher than this value in the "on" cluster
`medianFC`	Must have median fold change greater than this value in "on" group vs. each other cluster
`excludeGenes`	Genes excluded from marker consideration (NULL by default)
`sortByMedian`	Should genes passing all filters be prioritized by median fold change (TRUE, default) or by difference in proportion between clusters (FALSE)
`sep`	Separation character for renaming (default is "_")
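
The three marker thresholds (`propDiff`, `propMin`, `medianFC`) can be easier to follow on toy data. The sketch below is a base R illustration only and is not the package's `getTopMarkersByPropNew`: the helper name `passes_marker_filters`, the toy matrices, and the assumption that `medianExpr` is on a linear scale (so that a ratio of medians is a fold change) are all hypothetical.

```r
# Hypothetical helper, NOT the package's getTopMarkersByPropNew.
# Assumes medianExpr holds linear-scale values, so a ratio is a fold change.
passes_marker_filters <- function(propExpr, medianExpr, onCluster,
                                  propDiff = 0, propMin = 0.5, medianFC = 1) {
  others <- setdiff(colnames(propExpr), onCluster)
  # Per gene: the largest proportion / median found in any other cluster.
  # Beating the maximum is the same as beating each other cluster.
  maxOtherProp   <- apply(propExpr[, others, drop = FALSE], 1, max)
  maxOtherMedian <- apply(medianExpr[, others, drop = FALSE], 1, max)
  onProp   <- propExpr[, onCluster]
  onMedian <- medianExpr[, onCluster]
  (onProp > propMin) &                        # propMin filter
    (onProp - maxOtherProp > propDiff) &      # propDiff filter
    (onMedian > medianFC * maxOtherMedian)    # medianFC filter (no division)
}

# Toy inputs: genes as rows, clusters as columns (values are made up)
genes    <- c("GeneA", "GeneB", "GeneC")
clusters <- c("cl1", "cl2", "cl3")
propExpr <- matrix(c(0.9, 0.1, 0.2,
                     0.6, 0.5, 0.7,
                     0.4, 0.0, 0.1),
                   nrow = 3, byrow = TRUE, dimnames = list(genes, clusters))
medianExpr <- matrix(c(8, 1, 2,
                       4, 4, 5,
                       3, 0, 0),
                     nrow = 3, byrow = TRUE, dimnames = list(genes, clusters))

res <- passes_marker_filters(propExpr, medianExpr, onCluster = "cl1")
print(names(res)[res])
```

With the default thresholds, only GeneA passes for `cl1`: GeneB is expressed in a larger proportion of `cl3`, and GeneC falls below `propMin`.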

Details

When all options are selected, the output format is as follows: [cell class]_[layer range]_[broad marker gene]_[specific marker gene]_[brain region with most cells (and scaled fraction of cells)]_[best matched type from previous taxonomy]_[number of cells in cluster]. The output is a data frame with information about each cluster, including the new cluster names. updateSampDat must be run after renameAndOrderClusters to apply the new cluster names to each sample. If a dendrogram has already been created, its labels must also be changed separately.
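
To make the name format concrete, here is a minimal base R sketch that assembles a name for one toy cluster. It is an illustration under stated assumptions, not the function's internal code: the helper `layer_range`, the "L2-3" style of layer label, the class and gene values, and the omission of the scaled region fraction and the matched type are all hypothetical choices.

```r
# Toy per-cell annotations for a single cluster (all values are made up)
cells <- data.frame(
  layer  = c(2, 2, 2, 2, 3, 3, 5),
  region = c("MTG", "MTG", "MTG", "V1", "MTG", "MTG", "V1"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep layers whose cell count, relative to the most
# populated layer, is higher than propLayer, then report their range.
layer_range <- function(layer, propLayer = 0.3) {
  counts <- table(layer)
  keep <- as.numeric(names(counts))[as.vector(counts / max(counts) > propLayer)]
  if (min(keep) == max(keep)) {
    paste0("L", keep[1])
  } else {
    paste0("L", min(keep), "-", max(keep))
  }
}

# Region containing the most cells (the scaled fraction is omitted here)
top_region <- names(which.max(table(cells$region)))

# Assemble the components in the documented order, joined by sep
sep <- "_"
parts <- c("exc",                      # cell class (assumed)
           layer_range(cells$layer),   # layer range
           "RORB",                     # broad marker gene (assumed)
           top_region,                 # brain region with most cells
           nrow(cells))                # number of cells in cluster
new_name <- paste(parts, collapse = sep)
print(new_name)
```

Layer 5 holds one cell against four in layer 2, a relative proportion of 0.25, so it falls below the default `propLayer` of 0.3 and is left out of the layer range.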

Value

A data frame of cluster information, which includes the new and old names, the requested variables from sampleInfo, and all the specific components of the new name. This is the required input for updateSampDat in the appropriate format.
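
Because the returned data frame is per cluster while `sampleInfo` is per sample, the new names still have to be carried over to each sample. The sketch below shows that idea with base R `match` on toy data. It is not `updateSampDat`, and the column names `old_cluster_label` and `new_cluster_label` are hypothetical stand-ins for whatever the real output contains.

```r
# Toy per-sample table with the three columns required by sampleInfo
sampleInfo <- data.frame(
  sample_id     = paste0("s", 1:5),
  cluster_id    = c(1, 1, 2, 2, 2),
  cluster_label = c("cl1", "cl1", "cl2", "cl2", "cl2"),
  cluster_color = c("#FF0000", "#FF0000", "#0000FF", "#0000FF", "#0000FF"),
  stringsAsFactors = FALSE
)

# Check the required columns before doing anything else
required <- c("cluster_id", "cluster_label", "cluster_color")
stopifnot(all(required %in% colnames(sampleInfo)))

# Toy per-cluster table; column names are hypothetical, not the real output
clusterInfo <- data.frame(
  old_cluster_label = c("cl1", "cl2"),
  new_cluster_label = c("inh_L1-2_MTG", "exc_L3-5_V1"),
  stringsAsFactors = FALSE
)

# Look up each sample's cluster and fail loudly if any cluster is unmatched
idx <- match(sampleInfo$cluster_label, clusterInfo$old_cluster_label)
stopifnot(!anyNA(idx))
sampleInfo$cluster_label <- clusterInfo$new_cluster_label[idx]
print(sampleInfo[, c("sample_id", "cluster_label")])
```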

AllenInstitute/scrattch.hicat documentation built on June 6, 2024, 5:31 a.m.
