barcode_ggheatmap_stat: Barcode Top Clone Heatmap
In d93espinoza/barcodetrackR: Functions for Analyzing Cellular Barcoding Data


Creates a heatmap from the columns of data in the SummarizedExperiment object, with the option to label cells based on statistical analysis. Uses ggplot2.

barcode_ggheatmap_stat(
  your_SE,
  sample_size,
  stat_test = "chi-squared",
  stat_option = "subsequent",
  reference_sample = NULL,
  stat_display = "top",
  show_all_significant = FALSE,
  p_threshold = 0.05,
  p_adjust = "none",
  bc_threshold = 0,
  plot_labels = NULL,
  n_clones = 10,
  cellnote_assay = "stars",
  your_title = NULL,
  grid = TRUE,
  label_size = 12,
  dendro = FALSE,
  cellnote_size = 4,
  distance_method = "Euclidean",
  minkowski_power = 2,
  hclust_linkage = "complete",
  row_order = "hierarchical",
  clusters = 0,
  percent_scale = c(0, 2.5e-05, 0.001, 0.01, 0.1, 1),
  color_scale = c("#4575B4", "#4575B4", "lightblue", "#fefeb9", "#D73027", "red4"),
  return_table = FALSE
)

`your_SE`	A SummarizedExperiment object.
`sample_size`	A numeric vector providing the sample size of each column of the SummarizedExperiment passed to the function. This sample size describes the samples that the barcoding data is meant to approximate.
`stat_test`	The statistical test to use on the contingency table constructed for each barcode. Options are "chi-squared" and "fisher".
`stat_option`	If "subsequent", statistical testing is performed on each column of data compared to the column before it. If "reference", all other columns of data are compared to a reference column.
`reference_sample`	Provide the column name of the reference column if stat_option is set to "reference". Defaults to the first column in the SummarizedExperiment.
`stat_display`	Choose which clones to display on the heatmap. If set to "top", the top n_clones ranked by abundance for each sample will be displayed. If set to "change", the top n_clones with the lowest p-value from statistical testing will be shown for each sample. If set to "increase", the top n_clones (ranked by lowest p-value) which increase in abundance for each sample will be shown. If set to "decrease", the top n_clones (ranked by lowest p-value) which decrease in abundance will be shown.
`show_all_significant`	Logical. If set to TRUE when stat_display is "change", "increase", or "decrease", the n_clones argument is overridden and all clones with a statistically significant change, increase, or decrease in proportion are shown.
`p_threshold`	The p-value threshold to use for statistical testing.
`p_adjust`	Character, default = "none". To correct p-values for multiple comparisons, set to any of the p-value adjustment methods in the p.adjust function in R stats, which include "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", and "fdr".
`bc_threshold`	Clones must be above this proportion in at least one sample to be included in statistical testing.
`plot_labels`	Vector of x axis labels. Defaults to colnames(your_SE).
`n_clones`	The top 'n' clones to plot.
`cellnote_assay`	Character. One of "stars", "reads", "proportions" or "p_val"
`your_title`	The title for the plot.
`grid`	Logical. Include a grid or not in the heatmap.
`label_size`	The size of the column labels.
`dendro`	Logical. Whether or not to show row dendrogram when hierarchical clustering.
`cellnote_size`	The numerical size of the cell note labels.
`distance_method`	Character. Use summary(proxy::pr_DB) to see all possible options for distance metrics in clustering.
`minkowski_power`	The power of the Minkowski distance (if minkowski is the distance method used).
`hclust_linkage`	Character. One of "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC).
`row_order`	Character; "hierarchical" to perform hierarchical clustering on the output and order rows accordingly, "emergence" to organize rows by order of presence in the data (from left to right), or a character vector of rows within the SummarizedExperiment to plot.
`clusters`	How many clusters to cut hierarchical tree into for display when row_order is "hierarchical".
`percent_scale`	A numeric vector through which to spread the color scale (values inclusive from 0 to 1). Must be same length as color_scale.
`color_scale`	A character vector which indicates the colors of the color scale. Must be same length as percent_scale.
`return_table`	Logical. Whether or not to return a table of barcode sequences with their log abundance in the 'value' column and cellnote (for example, * indicating a statistically significant change) for each sample instead of displaying a plot. Note: for more in-depth statistical analysis, use the 'barcode_stat_test' function.
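
To make the roles of sample_size, stat_test, p_threshold and p_adjust more concrete, the following base R sketch tests a single barcode between two samples. It is only an illustration under stated assumptions: the proportions, the other p-values, and the way the 2 x 2 contingency table is built (proportion times sample size, barcode versus all other barcodes) are hypothetical and are not taken from the package source.

```r
# Hypothetical inputs: proportion of one barcode in two consecutive samples,
# and the sample_size each sample is meant to approximate.
prop_before <- 0.010
prop_after  <- 0.025
size_before <- 5000
size_after  <- 5000

# Assumed construction: convert proportions to approximate counts, then
# tabulate "this barcode" vs "all other barcodes" for the two samples.
count_before <- round(prop_before * size_before)
count_after  <- round(prop_after * size_after)
contingency <- matrix(
  c(count_before, size_before - count_before,
    count_after,  size_after - count_after),
  nrow = 2,
  dimnames = list(c("barcode", "other"), c("before", "after"))
)

# The two tests named by the stat_test argument, as provided by R stats.
p_chisq  <- chisq.test(contingency)$p.value
p_fisher <- fisher.test(contingency)$p.value

# Multiple-comparison correction across barcodes, as in p_adjust = "BH".
# The p-values for bc_B and bc_C are made up for illustration.
p_values   <- c(bc_A = p_chisq, bc_B = 0.04, bc_C = 0.20)
p_adjusted <- p.adjust(p_values, method = "BH")

# Barcodes that would pass p_threshold = 0.05 after adjustment.
p_threshold <- 0.05
significant <- names(p_adjusted)[p_adjusted < p_threshold]
```

Note that a barcode with an unadjusted p-value just under the threshold (bc_B here) can stop being significant once a correction is applied, which is why the choice of p_adjust affects which cells are labeled.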

Displays a heatmap in the current plot window or, if return_table is set to TRUE, returns a data frame of the barcode sequences, log abundances, and cellnote for each sample.

data(wu_subset)
barcode_ggheatmap_stat(
    your_SE = wu_subset[, 1:4], sample_size = rep(5000, 4),
    stat_test = "chi-squared", stat_option = "subsequent",
    p_threshold = 0.05, n_clones = 10,
    cellnote_assay = "stars", bc_threshold = 0.005
)
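
The row ordering controlled by row_order = "hierarchical", distance_method, hclust_linkage and clusters can be pictured with a small base R sketch. This is not the package's internal code: the matrix below is simulated, and stats::dist() is used as a stand-in for the proxy-based distance metrics that distance_method refers to.

```r
# Hypothetical log-abundance matrix: 6 barcodes (rows) by 3 samples (columns).
set.seed(1)
log_abundance <- matrix(
  rnorm(18),
  nrow = 6,
  dimnames = list(paste0("barcode_", 1:6), paste0("sample_", 1:3))
)

# Pairwise Euclidean distances between barcode rows
# (stand-in for distance_method = "Euclidean").
row_dist <- dist(log_abundance, method = "euclidean")

# Complete-linkage clustering, analogous to hclust_linkage = "complete".
row_tree <- hclust(row_dist, method = "complete")

# Row names in the order they appear along the dendrogram.
ordered_rows <- rownames(log_abundance)[row_tree$order]

# Cut the tree into two groups, analogous to clusters = 2.
row_clusters <- cutree(row_tree, k = 2)
```

Here ordered_rows gives a row order of the kind a hierarchically ordered heatmap would use, and row_clusters assigns each barcode to one of the two groups.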