tof_metacluster_flowsom: Metacluster clustered CyTOF data using FlowSOM's built-in...

View source: R/metaclustering.R

tof_metacluster_flowsom    R Documentation

Metacluster clustered CyTOF data using FlowSOM's built-in metaclustering algorithm

Description

This function performs metaclustering on a 'tof_tbl' containing CyTOF data, using a user-specified selection of input variables (CyTOF measurements) and a requested number of metaclusters. It relies on the FlowSOM package's built-in functionality for automatically detecting the number of metaclusters and supports several strategies as adapted by the FlowSOM team: consensus metaclustering, hierarchical metaclustering, k-means metaclustering, or metaclustering using the FlowSOM algorithm itself. See MetaClustering for additional details.

Usage

tof_metacluster_flowsom(
  tof_tibble,
  cluster_col,
  metacluster_cols = where(tof_is_numeric),
  central_tendency_function = stats::median,
  num_metaclusters = 10L,
  clustering_algorithm = c("consensus", "hierarchical", "kmeans", "som"),
  ...
)

Arguments

tof_tibble

A 'tof_tbl' or 'tibble'.

cluster_col

An unquoted column name indicating which column in 'tof_tibble' stores the cluster ids for the cluster to which each cell belongs. Cluster labels can be produced via any method the user chooses - including manual gating, any of the functions in the 'tof_cluster_*' function family, or any other method.

metacluster_cols

Unquoted column names indicating which columns in 'tof_tibble' to use in computing the metaclusters. Defaults to all numeric columns in 'tof_tibble'. Supports tidyselect helpers.
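As a hedged illustration of the column selection described above, the sketch below passes 'metacluster_cols' first as explicit unquoted names and then as a tidyselect helper. The simulated data, the seed, and the choice of 'num_metaclusters = 5L' are assumptions made for the example only.

```r
library(tidytof)
library(dplyr)

set.seed(2024)

# Simulated data: four marker columns plus a randomly assigned cluster label
sim_data <-
    tibble(
        cd45 = rnorm(n = 1000),
        cd38 = rnorm(n = 1000),
        cd34 = rnorm(n = 1000),
        cd19 = rnorm(n = 1000),
        cluster_id = sample(letters, size = 1000, replace = TRUE)
    )

# Select columns by listing unquoted names
result_named <-
    tof_metacluster_flowsom(
        tof_tibble = sim_data,
        cluster_col = cluster_id,
        metacluster_cols = c(cd45, cd38, cd34),
        num_metaclusters = 5L
    )

# Select columns with a tidyselect helper (all columns starting with "cd")
result_helper <-
    tof_metacluster_flowsom(
        tof_tibble = sim_data,
        cluster_col = cluster_id,
        metacluster_cols = starts_with("cd"),
        num_metaclusters = 5L
    )
```

Because 'cluster_id' does not start with "cd", the helper in the second call picks up only the four marker columns.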

central_tendency_function

The function that should be used to calculate the measurement of central tendency for each cluster before metaclustering. This function will be used to compute a summary statistic for each input cluster in 'cluster_col' across all columns specified by 'metacluster_cols', and the resulting vector (one for each cluster) will be used as the input for metaclustering. Defaults to the median (stats::median).
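The per-cluster summary described above can be pictured with plain dplyr. This is a minimal sketch of the idea, not necessarily how tidytof computes it internally; the simulated data and the name 'cluster_centers' are assumptions for illustration.

```r
library(dplyr)

set.seed(2024)

# Simulated data: two marker columns plus a randomly assigned cluster label
sim_data <-
    tibble(
        cd45 = rnorm(n = 1000),
        cd38 = rnorm(n = 1000),
        cluster_id = sample(letters, size = 1000, replace = TRUE)
    )

# One row per input cluster, holding the median of each selected column.
# Rows like these are what a metaclustering algorithm would then group.
cluster_centers <-
    sim_data %>%
    group_by(cluster_id) %>%
    summarize(across(c(cd45, cd38), stats::median), .groups = "drop")

cluster_centers
```

Swapping 'stats::median' for another function such as 'mean' changes only the summary statistic; the shape of the result (one row per cluster) stays the same.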

num_metaclusters

An integer indicating the maximum number of metaclusters that should be returned. Defaults to 10. Note that this function may return fewer metaclusters than requested, because MetaClustering uses the "Elbow method" to automatically detect the optimal number of metaclusters.

clustering_algorithm

A string indicating which clustering algorithm MetaClustering should use to perform the metaclustering. Options are "consensus" (the default), "hierarchical", "kmeans", and "som" (i.e. self-organizing map; the FlowSOM algorithm itself).

...

Optional additional arguments to pass to MetaClustering.

Value

A tibble with a single column ('.flowsom_metacluster') and the same number of rows as the input 'tof_tibble'. Each entry in the column indicates the metacluster label assigned to the same row in 'tof_tibble'.
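Since the result has one row per input row, it can be attached back to the input for downstream work. The sketch below is one way to do that, assuming simulated data and the hypothetical names 'metaclusters' and 'annotated'; it also counts cells per metacluster, which shows how many metaclusters were actually returned.

```r
library(tidytof)
library(dplyr)

set.seed(2024)

# Simulated data: four marker columns plus a randomly assigned cluster label
sim_data <-
    tibble(
        cd45 = rnorm(n = 1000),
        cd38 = rnorm(n = 1000),
        cd34 = rnorm(n = 1000),
        cd19 = rnorm(n = 1000),
        cluster_id = sample(letters, size = 1000, replace = TRUE)
    )

# Single-column tibble of metacluster labels, one row per cell
metaclusters <-
    tof_metacluster_flowsom(
        tof_tibble = sim_data,
        cluster_col = cluster_id,
        num_metaclusters = 10L
    )

# Attach the labels to the original cells (rows are in the same order)
annotated <- bind_cols(sim_data, metaclusters)

# Number of cells per metacluster; the number of rows here may be
# smaller than num_metaclusters
count(annotated, .flowsom_metacluster)
```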

See Also

Other metaclustering functions: tof_metacluster(), tof_metacluster_consensus(), tof_metacluster_hierarchical(), tof_metacluster_kmeans(), tof_metacluster_phenograph()

Examples

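library(tidytof)

# Simulate 1000 cells with four markers and a randomly assigned cluster label
sim_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 1000),
        cd38 = rnorm(n = 1000),
        cd34 = rnorm(n = 1000),
        cd19 = rnorm(n = 1000),
        cluster_id = sample(letters, size = 1000, replace = TRUE)
    )

# Metacluster using consensus clustering (the default algorithm)
tof_metacluster_flowsom(
    tof_tibble = sim_data,
    cluster_col = cluster_id,
    clustering_algorithm = "consensus"
)

# Metacluster using a self-organizing map (the FlowSOM algorithm itself)
tof_metacluster_flowsom(
    tof_tibble = sim_data,
    cluster_col = cluster_id,
    clustering_algorithm = "som"
)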


keyes-timothy/tidytof documentation built on Aug. 28, 2024, 8:37 a.m.