tof_cluster_ddpr: Perform developmental clustering on high-dimensional...
In keyes-timothy/tidytof: Analyze High-dimensional Cytometry Data Using Tidy Data Principles

tof_cluster_ddpr

R Documentation

Perform developmental clustering on high-dimensional cytometry data.

Description

This function performs distance-based clustering on high-dimensional cytometry data by assigning each cancer cell (passed into the function as 'tof_tibble') to its most phenotypically similar healthy cell subpopulation (passed into the function using 'healthy_tibble'). For details about the algorithm used to perform the clustering, see this paper.
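
To make the idea of sorting cells into their nearest healthy subpopulation concrete, here is a minimal base R sketch of nearest-centroid classification with Mahalanobis distance. It is an illustration under stated assumptions, not tidytof's internal code: the simulated data, the marker names, and the objects 'healthy', 'cancer', 'distances', and 'assigned' are all hypothetical.

```r
# Illustrative sketch only: nearest-centroid classification with Mahalanobis
# distance in base R. This is not tidytof's internal implementation.
set.seed(2024)

# Simulated healthy reference cells with known subpopulation labels.
healthy <- data.frame(
  cd34 = c(rnorm(100, mean = 2), rnorm(100, mean = -2)),
  cd19 = c(rnorm(100, mean = -2), rnorm(100, mean = 2)),
  cluster_id = rep(c("a", "b"), each = 100)
)

# Simulated cells to classify, all drawn near subpopulation "b".
cancer <- data.frame(
  cd34 = rnorm(50, mean = -2),
  cd19 = rnorm(50, mean = 2)
)

markers <- c("cd34", "cd19")

# For each healthy subpopulation, compute the distance from every cancer cell
# to that subpopulation's centroid, scaled by the subpopulation's covariance.
distances <- sapply(
  split(healthy[markers], healthy$cluster_id),
  function(pop) {
    # stats::mahalanobis() returns squared distances, so take the square root.
    sqrt(mahalanobis(
      x = as.matrix(cancer[markers]),
      center = colMeans(pop),
      cov = cov(pop)
    ))
  }
)

# Assign each cell to the subpopulation with the smallest distance.
assigned <- colnames(distances)[apply(distances, 1, which.min)]
table(assigned)
```

The 'distances' matrix has one row per cell and one column per healthy subpopulation, which mirrors the shape described under Value for 'return_distances = TRUE'.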

Usage

tof_cluster_ddpr(
  tof_tibble,
  healthy_tibble,
  healthy_label_col,
  cluster_cols = where(tof_is_numeric),
  distance_function = c("mahalanobis", "cosine", "pearson"),
  num_cores = 1L,
  parallel_cols,
  return_distances = FALSE,
  verbose = FALSE
)

Arguments

`tof_tibble`	A 'tibble' or 'tof_tbl' containing cells to be classified into their nearest healthy subpopulation (generally cancer cells).
`healthy_tibble`	A 'tibble' or 'tof_tbl' containing cells from only healthy control samples (i.e. not disease samples).
`healthy_label_col`	An unquoted column name indicating which column in 'healthy_tibble' contains the subpopulation label (or cluster id) for each cell in 'healthy_tibble'.
`cluster_cols`	Unquoted column names indicating which columns in 'tof_tibble' to use in computing the DDPR clusters. Defaults to all numeric columns in 'tof_tibble'. Supports tidyselect helpers.
`distance_function`	A string indicating which distance function should be used to perform the classification. Options are "mahalanobis" (the default), "cosine", and "pearson".
`num_cores`	An integer indicating the number of CPU cores used to parallelize the classification. Defaults to 1 (a single core).
`parallel_cols`	Optional. Unquoted column names indicating which columns in 'tof_tibble' to use for breaking up the data in order to parallelize the classification using 'foreach' on a 'doParallel' backend. Supports tidyselect helpers.
`return_distances`	A boolean value indicating whether the returned result should include only one column, the cluster ids corresponding to each row of 'tof_tibble' (return_distances = FALSE, the default), or should also include additional columns representing the distance between each row of 'tof_tibble' and each of the healthy subpopulation centroids (return_distances = TRUE).
`verbose`	A boolean value indicating whether progress updates should be printed during developmental classification. Default is FALSE.
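
The "cosine" and "pearson" options of 'distance_function' can be illustrated with a short base R sketch. The helper names 'cosine_distance' and 'pearson_distance' are hypothetical, and the definitions (one minus the cosine similarity, one minus the Pearson correlation) are an assumption about the usual form of these distances; tidytof's exact definitions may differ.

```r
# Hypothetical helpers for illustration; tidytof's definitions may differ.

# One minus the cosine of the angle between a cell and a centroid.
cosine_distance <- function(x, centroid) {
  1 - sum(x * centroid) / (sqrt(sum(x^2)) * sqrt(sum(centroid^2)))
}

# One minus the Pearson correlation between a cell and a centroid.
pearson_distance <- function(x, centroid) {
  1 - cor(x, centroid, method = "pearson")
}

# A made-up cell and a made-up centroid that is an exact multiple of it.
cell <- c(cd45 = 1, cd38 = 2, cd34 = 3, cd19 = 4)
centroid <- 2 * cell

# Both distances are numerically zero here, because the two marker
# profiles have the same shape and differ only in overall magnitude.
cosine_distance(cell, centroid)
pearson_distance(cell, centroid)
```

Under these assumed definitions, both measures compare the shape of a cell's marker profile with that of a centroid and ignore overall scaling, whereas Mahalanobis distance also accounts for the covariance of each healthy subpopulation.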

Value

If 'return_distances = FALSE', a tibble with one column named '.{distance_function}_cluster', a character vector of length 'nrow(tof_tibble)' indicating the id of the developmental cluster to which each cell (i.e. each row) in 'tof_tibble' was assigned.

If 'return_distances = TRUE', a tibble with 'nrow(tof_tibble)' rows and 'nrow(classifier_fit) + 1' columns, where 'nrow(classifier_fit)' is the number of healthy subpopulations. Each row represents a cell from 'tof_tibble', and 'nrow(classifier_fit)' of the columns represent the distance between the cell and each of the healthy subpopulations' cluster centroids. The final column represents the cluster id of the healthy subpopulation with the minimum distance to the cell represented by that row.

Examples

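library(tidytof)

# simulated cells to classify (stand-ins for cancer cells)
sim_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 1000),
        cd38 = rnorm(n = 1000),
        cd34 = rnorm(n = 1000),
        cd19 = rnorm(n = 1000)
    )

# simulated healthy reference cells with two labeled subpopulations
healthy_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 200),
        cd38 = rnorm(n = 200),
        cd34 = rnorm(n = 200),
        cd19 = rnorm(n = 200),
        cluster_id = c(rep("a", times = 100), rep("b", times = 100))
    )

# assign each simulated cell to its nearest healthy subpopulation
tof_cluster_ddpr(
    tof_tibble = sim_data,
    healthy_tibble = healthy_data,
    healthy_label_col = cluster_id
)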
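
As a further sketch, the same call can be made with a non-default 'distance_function' and with 'return_distances = TRUE'. This uses only arguments listed under Usage. The data are simulated, the 'result' object is a name chosen for illustration, and the code runs only if tidytof is installed.

```r
# Sketch assuming the tidytof package is installed; skipped otherwise.
if (requireNamespace("tidytof", quietly = TRUE)) {
    library(tidytof)
    set.seed(1)

    # simulated cells to classify
    sim_data <-
        dplyr::tibble(
            cd45 = rnorm(n = 1000),
            cd38 = rnorm(n = 1000),
            cd34 = rnorm(n = 1000),
            cd19 = rnorm(n = 1000)
        )

    # simulated healthy reference cells with two labeled subpopulations
    healthy_data <-
        dplyr::tibble(
            cd45 = rnorm(n = 200),
            cd38 = rnorm(n = 200),
            cd34 = rnorm(n = 200),
            cd19 = rnorm(n = 200),
            cluster_id = c(rep("a", times = 100), rep("b", times = 100))
        )

    # request cosine distance and keep the per-subpopulation distances
    result <-
        tof_cluster_ddpr(
            tof_tibble = sim_data,
            healthy_tibble = healthy_data,
            healthy_label_col = cluster_id,
            distance_function = "cosine",
            return_distances = TRUE
        )

    # one row per cell in sim_data; see Value for the expected columns
    dim(result)
}
```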

keyes-timothy/tidytof documentation built on Aug. 28, 2024, 8:37 a.m.
