tof_metacluster_hierarchical: Metacluster clustered CyTOF data using hierarchical...

View source: R/metaclustering.R

tof_metacluster_hierarchical    R Documentation

Metacluster clustered CyTOF data using hierarchical agglomerative clustering

Description

This function performs hierarchical metaclustering on a 'tof_tbl' containing CyTOF data, using a user-specified selection of input variables (CyTOF measurements) and a user-specified number of metaclusters. See hclust for details.

Usage

tof_metacluster_hierarchical(
  tof_tibble,
  cluster_col,
  metacluster_cols = where(tof_is_numeric),
  central_tendency_function = stats::median,
  num_metaclusters = 10L,
  distance_function = c("euclidean", "manhattan", "minkowski", "maximum", "canberra",
    "binary"),
  agglomeration_method = c("complete", "single", "average", "median", "centroid",
    "ward.D", "ward.D2", "mcquitty")
)

Arguments

tof_tibble

A 'tof_tbl' or 'tibble'.

cluster_col

An unquoted column name indicating which column in 'tof_tibble' stores the id of the cluster to which each cell belongs. Cluster labels can be produced by any method the user chooses, including manual gating or any of the functions in the 'tof_cluster_*' function family.

metacluster_cols

Unquoted column names indicating which columns in 'tof_tibble' to use in computing the metaclusters. Defaults to all numeric columns in 'tof_tibble'. Supports tidyselect helpers.

central_tendency_function

The function used to calculate the measure of central tendency for each cluster before metaclustering. It is applied to each input cluster in 'cluster_col' across all columns specified by 'metacluster_cols', and the resulting vectors (one per cluster) are used as the input for metaclustering. Defaults to stats::median.

num_metaclusters

An integer indicating the number of metaclusters that should be returned. Defaults to 10.

distance_function

A string indicating which distance function should be used to compute the distances between clusters during the hierarchical metaclustering. Options are "euclidean" (the default), "manhattan", "minkowski", "maximum", "canberra", and "binary". See dist for additional details.

agglomeration_method

A string indicating which agglomeration algorithm should be used during hierarchical cluster combination. Options are "complete" (the default), "single", "average", "median", "centroid", "ward.D", "ward.D2", and "mcquitty". See hclust for details.
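The arguments above describe a pipeline that can be sketched with base R alone: summarize each input cluster, compute distances between the summaries, build the tree, and cut it. The following is a minimal illustration under stated assumptions, not tidytof's internal code. The simulated data and the names 'marker_cols', 'cluster_centers', 'center_matrix', 'center_dist', 'tree', and 'metacluster_by_cluster' are hypothetical; only aggregate, dist, hclust, and cutree are real base R functions.

```r
# Illustrative sketch only; this is not tidytof's implementation.
set.seed(2024)

# Hypothetical simulated data: 4 markers, 1000 cells, up to 26 input clusters.
sim_data <- data.frame(
  cd45 = rnorm(1000),
  cd38 = rnorm(1000),
  cd34 = rnorm(1000),
  cd19 = rnorm(1000),
  cluster_id = sample(letters, size = 1000, replace = TRUE)
)

marker_cols <- c("cd45", "cd38", "cd34", "cd19")

# Step 1: one summary vector per input cluster
# (the role played by central_tendency_function).
cluster_centers <- aggregate(
  sim_data[marker_cols],
  by = list(cluster_id = sim_data$cluster_id),
  FUN = stats::median
)

# Step 2: pairwise distances between the cluster summaries
# (the role played by distance_function).
center_matrix <- as.matrix(cluster_centers[marker_cols])
rownames(center_matrix) <- as.character(cluster_centers$cluster_id)
center_dist <- stats::dist(center_matrix, method = "euclidean")

# Step 3: agglomerative clustering of the summaries
# (the role played by agglomeration_method).
tree <- stats::hclust(center_dist, method = "complete")

# Step 4: cut the tree into the requested number of groups
# (the role played by num_metaclusters).
metacluster_by_cluster <- stats::cutree(tree, k = 10L)
```

Here 'metacluster_by_cluster' is a named integer vector with one entry per input cluster. Because the tree is built on cluster summaries rather than on individual cells, the cost of the distance matrix depends on the number of input clusters, not the number of cells.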

Value

A tibble with a single column ('.hierarchical_metacluster') and the same number of rows as the input 'tof_tibble'. Each entry in the column indicates the metacluster label assigned to the same row in 'tof_tibble'.
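To make the row-for-row correspondence concrete, the sketch below expands cluster-level metacluster assignments back to one label per cell. It is an assumption-laden illustration in base R: the vectors 'metacluster_by_cluster' and 'cell_clusters' are hypothetical, and the type of label tidytof actually returns may differ from the integers used here.

```r
# Illustrative sketch only; this is not tidytof's implementation.

# Hypothetical cluster-to-metacluster assignments (e.g. the output of cutree()).
metacluster_by_cluster <- c(a = 1L, b = 2L, c = 1L, d = 3L)

# Hypothetical per-cell cluster ids, one per row of the input.
cell_clusters <- c("a", "c", "d", "b", "a", "d")

# Look up each cell's metacluster by the name of its cluster.
cell_metaclusters <- unname(metacluster_by_cluster[cell_clusters])

# One column, same number of rows (and same row order) as the input.
result <- data.frame(.hierarchical_metacluster = cell_metaclusters)
```

Because the output preserves the row order of the input, it can be column-bound to the original data without a join.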

See Also

Other metaclustering functions: tof_metacluster(), tof_metacluster_consensus(), tof_metacluster_flowsom(), tof_metacluster_kmeans(), tof_metacluster_phenograph()

Examples

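library(tidytof)

# Simulate 1000 cells with four markers and a random cluster label per cell
sim_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 1000),
        cd38 = rnorm(n = 1000),
        cd34 = rnorm(n = 1000),
        cd19 = rnorm(n = 1000),
        cluster_id = sample(letters, size = 1000, replace = TRUE)
    )

# Metacluster the input clusters using the default settings
tof_metacluster_hierarchical(tof_tibble = sim_data, cluster_col = cluster_id)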


keyes-timothy/tidytof documentation built on Aug. 28, 2024, 8:37 a.m.