plot_hierarchical_clustering: cluster the data matrix to visually inspect which confounder...

View source: R/proteome_wide_diagnostics.R

plot_hierarchical_clustering    R Documentation

Cluster the data matrix to visually inspect which confounder dominates

Description

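Hierarchically clusters the samples of the data matrix and plots the resulting dendrogram together with the selected covariates, to visually inspect which confounder dominates.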

Usage

plot_hierarchical_clustering(data_matrix, sample_annotation,
  sample_id_col = "FullRunName", color_list = NULL,
  factors_to_plot = NULL, fill_the_missing = 0,
  distance = "euclidean", agglomeration = "complete",
  label_samples = TRUE, label_font = 0.2, filename = NULL,
  width = 38, height = 25, units = c("cm", "in", "mm"),
  plot_title = NULL, ...)

Arguments

data_matrix

matrix of features (in rows) vs samples (in columns), with feature IDs as rownames and file/sample names as colnames. See help("example_proteome_matrix") for more details.

sample_annotation

data frame with:

  1. sample_id_col (this can be repeated as row names)

  2. biological covariates

  3. technical covariates (batches etc)

See help("example_sample_annotation").
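
As a rough illustration of the input shapes described for data_matrix and sample_annotation, the following base R sketch builds a toy matrix and a toy annotation table. All names and values (toy_matrix, toy_annotation, the run names, the Diet and MS_batch levels) are made up for illustration and are not taken from the package data.

```r
# Toy inputs (hypothetical values, for illustration only)
set.seed(1)
run_names <- paste0("run_", 1:4)

# Features in rows, samples in columns; feature IDs as rownames
toy_matrix <- matrix(rnorm(12), nrow = 3,
                     dimnames = list(paste0("feature_", 1:3), run_names))

# One row per sample: ID column, a biological and a technical covariate
toy_annotation <- data.frame(
  FullRunName = run_names,
  Diet = c("A", "A", "B", "B"),
  MS_batch = c("Batch_1", "Batch_2", "Batch_1", "Batch_2"),
  stringsAsFactors = FALSE
)

# Every matrix column should have a matching row in the annotation
all(colnames(toy_matrix) %in% toy_annotation$FullRunName)
```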

sample_id_col

name of the column in the sample_annotation table where the filenames (colnames of the data_matrix) are found.

color_list

list, as returned by sample_annotation_to_colors, where each item is a color vector for one factor, used to map that factor to colors.

factors_to_plot

vector of technical and biological covariates to be plotted in this diagnostic plot (assumed to be present in sample_annotation)

fill_the_missing

numeric value determining how missing values should be substituted. If NULL, features with missing values are excluded.
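
The two behaviours described above can be sketched in base R. This is only an illustration of the idea, not the package's internal code; the matrix toy and the helper fill_or_drop are hypothetical names.

```r
# Hypothetical helper illustrating the two documented behaviours
fill_or_drop <- function(m, fill_the_missing = 0) {
  if (is.null(fill_the_missing)) {
    # Exclude features (rows) that contain any missing value
    m[stats::complete.cases(m), , drop = FALSE]
  } else {
    # Substitute every missing value with the given number
    m[is.na(m)] <- fill_the_missing
    m
  }
}

# Toy matrix with one missing value in feature f2, sample s1
toy <- matrix(c(1, NA, 3, 4, 5, 6), nrow = 3,
              dimnames = list(c("f1", "f2", "f3"), c("s1", "s2")))

filled  <- fill_or_drop(toy, fill_the_missing = 0)    # NA replaced by 0
dropped <- fill_or_drop(toy, fill_the_missing = NULL) # row f2 removed
```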

distance

distance metric used for clustering

agglomeration

agglomeration method, as used by hclust
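
For orientation, sample-level clustering with these two options can be sketched with base R's dist and hclust. That the function computes distances between samples (columns) in exactly this way is an assumption, and the toy matrix is hypothetical.

```r
# Toy matrix: 10 features in rows, 4 samples in columns (hypothetical data)
set.seed(42)
toy <- matrix(rnorm(40), nrow = 10,
              dimnames = list(paste0("f", 1:10), paste0("s", 1:4)))

# dist() compares rows, so transpose to compare samples
d <- dist(t(toy), method = "euclidean")

# Agglomerate with complete linkage, matching the defaults in Usage
tree <- hclust(d, method = "complete")

# Order of samples along the dendrogram
tree$labels[tree$order]
```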

label_samples

if TRUE, sample IDs (column names of data_matrix) will be printed

label_font

size of the font. Used only if label_samples is TRUE; ignored otherwise

filename

path where the results are saved. If NULL, the object is returned to the active window; otherwise, the object is saved into the file. Currently only the pdf and png formats are supported

width

option determining the output image width

height

option determining the output image height

units

units: 'cm', 'in' or 'mm'

plot_title

title of the plot (e.g., processing step + representation level (fragments, transitions, proteins) + purpose (meanplot/corrplot etc))

...

other parameters of plotDendroAndColors from the WGCNA package

Value

No return

See Also

hclust, sample_annotation_to_colors, plotDendroAndColors

Examples


# Select samples from two batches. Note that test_matrix is not used
# below: both calls plot the full example_proteome_matrix.
selected_batches <- example_sample_annotation$MS_batch %in%
  c("Batch_1", "Batch_2")
selected_samples <- example_sample_annotation$FullRunName[selected_batches]
test_matrix <- example_proteome_matrix[, selected_samples]

# With the default color scheme:
hierarchical_clustering_plot <- plot_hierarchical_clustering(
  example_proteome_matrix, example_sample_annotation,
  factors_to_plot = c("MS_batch", "Diet", "DateTime"),
  color_list = NULL,
  distance = "euclidean", agglomeration = "complete",
  label_samples = FALSE)

# With a defined color scheme:
color_list <- sample_annotation_to_colors(example_sample_annotation,
  factor_columns = c("MS_batch", "Strain", "Diet", "digestion_batch"),
  numeric_columns = c("DateTime", "order"))
hierarchical_clustering_plot <- plot_hierarchical_clustering(
  example_proteome_matrix, example_sample_annotation,
  factors_to_plot = c("MS_batch", "Strain", "DateTime", "digestion_batch"),
  color_list = color_list,
  distance = "euclidean", agglomeration = "complete",
  label_samples = FALSE)
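
A further sketch, assuming proBatch and its example data are installed: passing the documented filename argument should write the plot to a file instead of the active window. The output path is a hypothetical temporary file, and the call is skipped if the package is not available.

```r
# Hypothetical output path; only pdf and png are documented as supported
out_file <- file.path(tempdir(), "hierarchical_clustering.pdf")

if (requireNamespace("proBatch", quietly = TRUE)) {
  library(proBatch)
  # Same inputs as in the examples above, but saved to out_file
  plot_hierarchical_clustering(
    example_proteome_matrix, example_sample_annotation,
    factors_to_plot = c("MS_batch", "Diet"),
    label_samples = FALSE,
    filename = out_file,
    width = 38, height = 25, units = "cm"
  )
}
```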


symbioticMe/proBatch documentation built on April 9, 2023, 11:59 a.m.