barcode_ggheatmap_stat: Barcode Top Clone Heatmap


View source: R/barcode_ggheatmap_stat.R

Description

Creates a heatmap from the columns of data in the SummarizedExperiment object, with the option to label cells based on statistical testing. Uses ggplot2.

Usage

barcode_ggheatmap_stat(
  your_SE,
  sample_size,
  stat_test = "chi-squared",
  stat_option = "subsequent",
  reference_sample = NULL,
  stat_display = "top",
  show_all_significant = FALSE,
  p_threshold = 0.05,
  p_adjust = "none",
  bc_threshold = 0,
  plot_labels = NULL,
  n_clones = 10,
  cellnote_assay = "stars",
  your_title = NULL,
  grid = TRUE,
  label_size = 12,
  dendro = FALSE,
  cellnote_size = 4,
  distance_method = "Euclidean",
  minkowski_power = 2,
  hclust_linkage = "complete",
  row_order = "hierarchical",
  clusters = 0,
  percent_scale = c(0, 2.5e-05, 0.001, 0.01, 0.1, 1),
  color_scale = c("#4575B4", "#4575B4", "lightblue", "#fefeb9", "#D73027", "red4"),
  return_table = FALSE
)

Arguments

your_SE

A SummarizedExperiment object.

sample_size

A numeric vector providing the sample size of each column of the SummarizedExperiment passed to the function. This sample size describes the samples that the barcoding data is meant to approximate.

stat_test

The statistical test to use on the constructed contingency table for each barcode. Options are "chi-squared" and "fisher".
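As a rough illustration of what testing a per-barcode contingency table can look like, the sketch below builds a 2 x 2 table for one barcode across two samples and runs both tests with base R. The proportions, the conversion from proportions to counts by rounding, and all variable names are assumptions made for this example; the source does not specify how the function constructs its table internally.

```r
# Hypothetical inputs: the proportion of one barcode in two samples, and the
# sample size that each column is meant to approximate.
prop <- c(sample_1 = 0.010, sample_2 = 0.025)
sample_size <- c(5000, 5000)

# Assumption: approximate cell counts are obtained by simple rounding.
barcode_cells <- round(prop * sample_size)
other_cells <- sample_size - barcode_cells

# 2 x 2 contingency table: rows are samples, columns are
# "this barcode" versus "all other barcodes".
tab <- cbind(barcode = barcode_cells, other = other_cells)

# The two tests named by the stat_test options, applied with base R.
chi <- chisq.test(tab)
fish <- fisher.test(tab)

print(tab)
print(c(chi_squared = chi$p.value, fisher = fish$p.value))
```

Fisher's exact test is usually preferred when expected cell counts are small, while the chi-squared test is an approximation that works well for larger counts.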

stat_option

For "subsequent", statistical testing is performed on each column of data compared to the column before it. For "reference", all other columns of data are compared to a reference column.

reference_sample

The column name of the reference column, used when stat_option is set to "reference". Defaults to the first column in the SummarizedExperiment.

stat_display

Choose which clones to display on the heatmap. If set to "top", the top n_clones ranked by abundance for each sample will be displayed. If set to "change", the top n_clones with the lowest p-value from statistical testing will be shown for each sample. If set to "increase", the top n_clones (ranked by lowest p-value) which increase in abundance for each sample will be shown. If set to "decrease", the top n_clones (ranked by lowest p-value) which decrease in abundance will be shown.

show_all_significant

Logical. If set to TRUE when stat_display is "change", "increase", or "decrease", the n_clones argument is overridden and all clones with a statistically significant change, increase, or decrease in proportion are shown.

p_threshold

The p-value threshold to use for statistical testing.

p_adjust

Character, default = "none". To correct p-values for multiple comparisons, set to any of the p-value adjustment methods in the p.adjust function in R stats, which include "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", and "fdr".
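To see how the choice of adjustment method interacts with a threshold such as p_threshold, the following sketch applies stats::p.adjust to a made-up vector of raw p-values. The p-values are invented for illustration and are not output from barcodetrackR.

```r
# Hypothetical raw p-values, one per barcode comparison.
p_raw <- c(0.001, 0.012, 0.020, 0.049, 0.200)
p_threshold <- 0.05

# Two of the adjustment methods accepted by stats::p.adjust.
p_bh <- p.adjust(p_raw, method = "BH")
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Count how many comparisons pass the threshold under each scheme.
n_significant <- c(
  none = sum(p_raw < p_threshold),
  BH = sum(p_bh < p_threshold),
  bonferroni = sum(p_bonf < p_threshold)
)
print(n_significant)
```

Adjusted p-values are never smaller than the raw ones, so stricter methods such as "bonferroni" flag fewer clones at the same threshold.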

bc_threshold

Clones must be above this proportion in at least one sample to be included in statistical testing.

plot_labels

Vector of x axis labels. Defaults to colnames(your_SE).

n_clones

The top 'n' clones to plot.

cellnote_assay

Character. One of "stars", "reads", "proportions", or "p_val".

your_title

The title for the plot.

grid

Logical. Include a grid or not in the heatmap.

label_size

The size of the column labels.

dendro

Logical. Whether or not to show the row dendrogram when using hierarchical clustering.

cellnote_size

The numerical size of the cell note labels.

distance_method

Character. Use summary(proxy::pr_DB) to see all possible options for distance metrics in clustering.

minkowski_power

The power of the Minkowski distance (if minkowski is the distance method used).

hclust_linkage

Character. One of "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC).

row_order

Character. Use "hierarchical" to perform hierarchical clustering on the output and order rows accordingly, "emergence" to organize rows by order of presence in the data (from left to right), or supply a character vector of rows within the SummarizedExperiment to plot.

clusters

The number of clusters to cut the hierarchical tree into for display when row_order is "hierarchical".
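The clustering arguments above (distance_method, hclust_linkage, row_order, and clusters) correspond to a standard hierarchical clustering workflow. The sketch below shows that workflow with base R on a made-up matrix. It uses stats::dist as a stand-in for the proxy package's distance functions, so it is an approximation of the idea and not the function's actual implementation; the matrix and all names are hypothetical.

```r
# Hypothetical abundance matrix: 6 barcodes (rows) by 4 samples (columns).
set.seed(1)
mat <- matrix(
  runif(24),
  nrow = 6,
  dimnames = list(paste0("barcode_", 1:6), paste0("sample_", 1:4))
)

# Pairwise Euclidean distances between rows (stand-in for proxy::dist).
d <- dist(mat, method = "euclidean")

# Complete-linkage hierarchical clustering, as in hclust_linkage = "complete".
hc <- hclust(d, method = "complete")

# Row order implied by the dendrogram, as in row_order = "hierarchical".
row_order <- hc$labels[hc$order]

# Cut the tree into 3 groups, analogous to clusters = 3.
groups <- cutree(hc, k = 3)

print(row_order)
print(groups)
```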

percent_scale

A numeric vector through which to spread the color scale (values inclusive from 0 to 1). Must be same length as color_scale.

color_scale

A character vector which indicates the colors of the color scale. Must be same length as percent_scale.
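One way to picture how percent_scale and color_scale pair up is ggplot2's scale_fill_gradientn, which accepts a vector of colours and a matching vector of positions between 0 and 1. The sketch below applies the default values from the Usage section to a made-up table of proportions. Whether the function uses this exact scale call internally, or transforms the values first, is an assumption; the toy data and names are hypothetical.

```r
library(ggplot2)

# Default breakpoints and colours from the Usage section.
percent_scale <- c(0, 2.5e-05, 0.001, 0.01, 0.1, 1)
color_scale <- c("#4575B4", "#4575B4", "lightblue", "#fefeb9", "#D73027", "red4")

# The two vectors must be the same length: one colour per breakpoint.
stopifnot(length(percent_scale) == length(color_scale))

# Hypothetical long-format data: 3 samples by 4 barcodes.
toy <- expand.grid(
  sample = paste0("sample_", 1:3),
  barcode = paste0("barcode_", 1:4)
)
toy$proportion <- c(0, 1e-05, 5e-04, 0.002, 0.02, 0.05, 0.2, 0.5, 0, 0.01, 0.1, 1)

# Each colour is anchored at its breakpoint; values in between are interpolated.
p <- ggplot(toy, aes(x = sample, y = barcode, fill = proportion)) +
  geom_tile(colour = "black") +
  scale_fill_gradientn(
    colours = color_scale,
    values = percent_scale,
    limits = c(0, 1)
  )
```

Because most breakpoints sit well below 0.1, the scale spends most of its colour range on low-abundance clones, which would otherwise be indistinguishable on a linear scale.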

return_table

Logical. Whether or not to return a table of barcode sequences with their log abundance in the 'value' column and the cellnote for each sample (for example, * indicating a statistically significant change) instead of displaying a plot. Note: for more in-depth statistical analysis, use the 'barcode_stat_test' function.

Value

Displays a heatmap in the current plot window. If return_table is set to TRUE, instead returns a data frame of the barcode sequences, log abundances, and cellnote for each sample.

Examples

data(wu_subset)
barcode_ggheatmap_stat(
    your_SE = wu_subset[, 1:4], sample_size = rep(5000, 4),
    stat_test = "chi-squared", stat_option = "subsequent",
    p_threshold = 0.05, n_clones = 10,
    cellnote_assay = "stars", bc_threshold = 0.005
)

dunbarlabNIH/barcodetrackR documentation built on April 26, 2021, 6:20 p.m.