calcMedians: Calculate cluster medians

View source: R/calcMedians.R

calcMediansR Documentation

Calculate cluster medians

Description

Calculate cluster medians (median expression for each cluster-sample-marker combination)

Usage

calcMedians(d_se)

Arguments

d_se

Data object from previous steps, in SummarizedExperiment format, containing cluster labels as a column in the row meta-data (from generateClusters). Column meta-data is assumed to contain a factor marker_class.

Details

Calculate median marker expression for each cluster and sample (i.e. medians for each cluster-sample-marker combination).

The data object is assumed to contain a factor marker_class in the column meta-data (see prepareData), which indicates the protein marker class for each column of data ("type", "state", or "none").

The cluster medians are required for testing for differential states within cell populations, and for plotting purposes.

Variables id_type_markers and id_state_markers are saved in the metadata slot of the output object. These can be used to identify the 'cell type' and 'cell state' markers in the list of assays in the output SummarizedExperiment object, which is useful in later steps of the 'diffcyt' pipeline.

Results are returned as a new SummarizedExperiment object, where rows = clusters, columns = samples, sheets (assays slot) = markers. Note that there is a separate table of values (assay) for each marker. The metadata slot also contains variables id_type_markers and id_state_markers, which can be used to identify the sets of cell type and cell state markers in the list of assays.

Value

d_medians: SummarizedExperiment object, where rows = clusters, columns = samples, sheets (assays slot) = markers. The metadata slot contains variables id_type_markers and id_state_markers, which can be accessed with metadata(d_medians)$id_type_markers and metadata(d_medians)$id_state_markers.

Examples

# For a complete workflow example demonstrating each step in the 'diffcyt' pipeline, 
# see the package vignette.

# Function to create random data (one sample)
d_random <- function(n = 20000, mean = 0, sd = 1, ncol = 20, cofactor = 5) {
  d <- sinh(matrix(rnorm(n, mean, sd), ncol = ncol)) * cofactor
  colnames(d) <- paste0("marker", sprintf("%02d", 1:ncol))
  d
}

# Create random data (without differential signal)
set.seed(123)
d_input <- list(
  sample1 = d_random(), 
  sample2 = d_random(), 
  sample3 = d_random(), 
  sample4 = d_random()
)

experiment_info <- data.frame(
  sample_id = factor(paste0("sample", 1:4)), 
  group_id = factor(c("group1", "group1", "group2", "group2")), 
  stringsAsFactors = FALSE
)

marker_info <- data.frame(
  channel_name = paste0("channel", sprintf("%03d", 1:20)), 
  marker_name = paste0("marker", sprintf("%02d", 1:20)), 
  marker_class = factor(c(rep("type", 10), rep("state", 10)), 
                        levels = c("type", "state", "none")), 
  stringsAsFactors = FALSE
)

# Prepare data
d_se <- prepareData(d_input, experiment_info, marker_info)

# Transform data
d_se <- transformData(d_se)

# Generate clusters
d_se <- generateClusters(d_se)

# Calculate medians
d_medians <- calcMedians(d_se)


lmweber/diffcyt documentation built on March 19, 2024, 5:24 a.m.