fg_summary: Calculates feature summary statistics.

View source: R/03_flowgraph_summary.R

fg_summary    R Documentation

Calculates feature summary statistics.

Description

Calculates feature summary statistics for flowGraph features; users can choose from a list of statistical significance tests/adjustments or define custom summary functions. For special cases, see the example in fg_add_summary on how to calculate summary statistics manually without using this function.

Usage

fg_summary(
  fg,
  no_cores = 1,
  class = "class",
  label1 = NULL,
  label2 = NULL,
  class_labels = NULL,
  node_features = "SpecEnr",
  edge_features = "NONE",
  test_name = "t_diminish",
  diminish = TRUE,
  p_thres = 0.05,
  p_rate = 2,
  test_custom = "t",
  effect_size = TRUE,
  adjust0 = TRUE,
  adjust0_lim = c(-0.1, 0.1),
  btwn = TRUE,
  btwn_test_custom = "t",
  save_functions = FALSE,
  overwrite = FALSE
)

Arguments

fg

flowGraph object.

no_cores

An integer indicating how many cores to parallelize on.

class

A string corresponding to the column name or index of fg_get_meta(fg) whose values represent the class label of each sample.

label1

A string from the class column of the meta slot indicating one of the labels compared to create the summary statistic. If you would like to compare all other class labels against one label, set label2 to NULL. If you would like to compare all labels against all labels, set label1 and label2 to NULL.

label2

A string from the class column of the meta slot indicating one of the labels compared to create the summary statistic.

class_labels

A list of vectors, each containing two strings representing labels to compare; this parameter is an alternative to parameters label1 and label2 that supports multiple label pairings.
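As a rough sketch of the shape class_labels expects, the snippet below builds every pairwise combination from a set of labels using base R. The label names ("control", "exp1", "exp2") are hypothetical and are not taken from any flowGraph data set.

```r
# Hypothetical class labels; replace with values from your own class column.
labels <- c("control", "exp1", "exp2")

# combn() with simplify = FALSE returns a list of length-2 character vectors,
# i.e. a list of vectors each containing two strings.
class_labels <- utils::combn(labels, 2, simplify = FALSE)

str(class_labels)
```

The resulting list could then be passed as class_labels in place of label1 and label2.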

node_features

A string vector indicating which node feature(s) to perform summary statistics on; set to NULL to perform summary statistics on all node features, or to "NONE" to perform them on no node features.

edge_features

A string vector indicating which edge feature(s) to perform summary statistics on; set to NULL to perform summary statistics on all edge features, or to "NONE" to perform them on no edge features.

test_name

A string with the name of the test you are performing.

diminish

A logical variable indicating whether to use diminishing summary statistics; if TRUE, a summary statistic for a node or edge is only calculated if at least one of its parent nodes or edges is significant. Otherwise, the test is performed on all nodes or edges.

p_thres

A double indicating the summary statistic threshold; if the result of a statistical test is greater than p_thres, it is considered not significant.

p_rate

A double, required when diminish=TRUE. We use p_thres to determine whether a node's or edge's parent is significant. However, the closer to the first layer a node resides (or an edge points to), the less stringent this threshold should be. Therefore, p_thres is used as the threshold for parents in the last layer, and it is multiplied by p_rate for each layer above that; e.g. given default values and 4 layers, the thresholds for layers 1 through 4 would be .4, .2, .1, and .05.
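The arithmetic behind that example can be reproduced with a few lines of base R. This is only a sketch of the stated rule, not the package's internal code; n_layers and layer_thres are names introduced here for illustration.

```r
p_thres  <- 0.05  # threshold for the last layer (function default)
p_rate   <- 2     # multiplier applied per layer above the last (function default)
n_layers <- 4     # assumed number of layers for this illustration

layer <- seq_len(n_layers)
# The last layer keeps p_thres; each layer above it is multiplied by p_rate once more.
layer_thres <- p_thres * p_rate^(n_layers - layer)
names(layer_thres) <- paste0("layer", layer)

layer_thres
# layer1 layer2 layer3 layer4
#   0.40   0.20   0.10   0.05
```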

test_custom

A function or a string indicating the statistical test to use. If a string is provided, it should be one of c("t","wilcox","ks","var","chisq"); these correspond to the statistical tests stats::t.test, stats::wilcox.test, and so on. If a function is provided, it should take two numeric vectors as input and return a numeric value.
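A custom function matching this description might look like the sketch below. It assumes the returned number is treated as a p-value (since results are compared against p_thres); the function name custom_test and the fallback value of 1 for very small groups are choices made for this illustration only.

```r
# Hypothetical custom test: takes two numeric vectors, returns one numeric value.
custom_test <- function(x, y) {
  # Assumption for this sketch: with fewer than two values per group,
  # report "not significant" rather than attempting a test.
  if (length(x) < 2 || length(y) < 2) return(1)
  # Normal-approximation Wilcoxon rank-sum test; exact = FALSE avoids
  # warnings about ties in real data.
  stats::wilcox.test(x, y, exact = FALSE)$p.value
}

# Toy vectors standing in for one feature's values in two classes.
a <- c(1, 2, 3, 4, 5, 6)
b <- c(11, 12, 13, 14, 15, 16)
p <- custom_test(a, b)
p
```

Such a function could be supplied as test_custom = custom_test, together with a test_name of your choosing.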

effect_size

A logical variable indicating whether or not to calculate the effect size statistic (Cohen's d) for this set of class labels; later used for plotting.
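For readers unfamiliar with Cohen's d, the sketch below shows the common pooled-standard-deviation form of the statistic. It is a generic textbook formulation; whether flowGraph uses exactly this variant is an assumption, and cohens_d is a name introduced here.

```r
# Cohen's d with a pooled standard deviation (generic formulation).
cohens_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * stats::var(x) + (ny - 1) * stats::var(y)) /
                      (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

# Both toy groups have variance 1, and their means differ by 1.
cohens_d(c(1, 2, 3), c(2, 3, 4))
# [1] -1
```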

adjust0

A logical variable indicating whether or not to calculate the minimum percentage of values, from the samples of each class label, that fall within the range of adjust0_lim. This is only done for SpecEnr values, as p-values become unstable when comparing values near 0.
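One way to read this description is sketched below: for each class, compute the fraction of values inside adjust0_lim, then take the smaller of the two fractions. This is an interpretation for illustration, not the package's code; the helper frac_near_zero, the toy values, and the use of inclusive bounds are all assumptions.

```r
adjust0_lim <- c(-0.1, 0.1)  # range around 0 (function default)

# Fraction of values lying within [lim[1], lim[2]] (inclusive bounds assumed).
frac_near_zero <- function(v, lim) mean(v >= lim[1] & v <= lim[2])

# Toy SpecEnr-like values for one node in two classes.
control_vals <- c(0.02, -0.05, 0.30, 0.00)
exp_vals     <- c(0.50,  0.08, -0.40, 0.60)

# Minimum, across the two classes, of the fraction of near-zero values.
min_near_zero <- min(frac_near_zero(control_vals, adjust0_lim),
                     frac_near_zero(exp_vals, adjust0_lim))
min_near_zero
# [1] 0.25
```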

adjust0_lim

A vector of two numeric values indicating a range around 0, default set to -0.1 and 0.1.

btwn

A logical variable indicating whether or not to calculate the btwn data frame given in the fg_get_summary function.

btwn_test_custom

Same as test_custom but for btwn.

save_functions

A logical variable indicating whether to save test and adjust functions.

overwrite

A logical variable indicating whether to overwrite existing summary statistics if they exist.

Details

fg_summary calculates a summary statistic as specified by the user in parameters test_name, diminish (p_thres, p_rate), and test_custom. The test is performed on the node or edge features of interest within a given flowGraph object, as specified by parameters node_features and edge_features. The function stores the resulting summary statistics in the same flowGraph object and returns that object to the user. See the summary slot of flowGraph-class for details on the contents.

Value

flowGraph object containing calculated summary statistics.

See Also

flowGraph-class, fg_clear_summary

Examples


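 library(flowGraph)

 # Build a flowGraph object from the example count data on a single core.
 no_cores <- 1
 data(fg_data_pos30)
 fg <- flowGraph(fg_data_pos30$count, class=fg_data_pos30$meta$class,
                 prop=FALSE, specenr=FALSE,
                 no_cores=no_cores)
 # List the summary statistics currently stored in the object.
 fg_get_summary_desc(fg)

 # Run a t-test on the "count" node feature, comparing all other class
 # labels against "control" (label2 is left as NULL); skip edge features.
 fg <- fg_summary(fg, no_cores=no_cores, class="class", label1="control",
                  overwrite=FALSE, test_name="t", diminish=FALSE,
                  node_features="count", edge_features="NONE")
 # List the stored summary statistics again to see what was added.
 fg_get_summary_desc(fg)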


aya49/flowGraph documentation built on Feb. 4, 2024, 6:40 p.m.