ggplot_heatmap: clustered heatmap of diag(g) * X

View source: R/function_plot_heatmap.R

ggplot_heatmap    R Documentation

clustered heatmap of diag(g) * X

Description

For a DTD.model the 'ggplot_heatmap' function visualizes

diag(g) * X

on a subset of features as a clustered heatmap.
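As a minimal base-R sketch of what diag(g) * X means, the toy example below scales each feature (row) of a small reference matrix by its weight. The matrix, the weights, and the variable names are made up for illustration and are not taken from the package.

```r
# Toy reference matrix: 4 features (rows) x 2 cell types (columns).
X <- matrix(
  c(1, 2,
    3, 4,
    5, 6,
    7, 8),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), c("typeA", "typeB"))
)

# Hypothetical weight vector g, one entry per feature.
g <- c(0.5, 0, 2, 1)

# diag(g) %*% X multiplies row i of X by g[i].
weighted <- diag(g) %*% X
rownames(weighted) <- rownames(X)

# Elementwise multiplication recycles g down each column and
# yields the same values without building the diagonal matrix.
weighted_fast <- g * X
```

A feature with weight 0 (gene2 here) becomes a row of zeros, and a feature with a large weight is stretched accordingly.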
Features can be selected in one of two ways. Either pass a vector of strings that match the feature names of X.
Or select via 'explained correlation': to assess the importance of a feature in the deconvolution process, we can exclude the feature from a trained model and observe the change in correlation on a test set. If the correlation decreases by, e.g., 1%, the feature explains 1% of the correlation. The 'ggplot_heatmap' function iteratively excludes each feature from the trained model, resulting in a ranking of the genes.
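The leave-one-feature-out idea can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the data are simulated, "excluding" a feature is modelled by setting its weight to 0, and the helper names estimate_direct and mean_correlation are hypothetical.

```r
set.seed(1)
n_features <- 6
n_types <- 2
n_samples <- 30

# Simulated reference profiles X, true quantities C_true, and mixtures Y.
X <- matrix(runif(n_features * n_types, 1, 10), nrow = n_features,
            dimnames = list(paste0("gene", seq_len(n_features)),
                            c("typeA", "typeB")))
C_true <- matrix(runif(n_types * n_samples), nrow = n_types)
Y <- X %*% C_true +
  matrix(rnorm(n_features * n_samples, sd = 0.5), nrow = n_features)

# Stand-in for a trained weight vector g (assumption: all ones).
g <- rep(1, n_features)

# Hypothetical helper: unregularized least squares on the g-weighted data.
estimate_direct <- function(g, X, Y) {
  gX <- g * X
  gY <- g * Y
  solve(crossprod(gX), crossprod(gX, gY))
}

# Hypothetical helper: mean per-cell-type correlation of estimates and truth.
mean_correlation <- function(g, X, Y, C_true) {
  C_hat <- estimate_direct(g, X, Y)
  mean(sapply(seq_len(nrow(C_true)),
              function(k) cor(C_hat[k, ], C_true[k, ])))
}

baseline <- mean_correlation(g, X, Y, C_true)

# Exclude each feature in turn (weight 0) and record the loss in correlation.
explained <- sapply(seq_along(g), function(i) {
  g_without <- g
  g_without[i] <- 0
  baseline - mean_correlation(g_without, X, Y, C_true)
})
names(explained) <- rownames(X)

# Features whose removal hurts the correlation most come first.
ranking <- sort(explained, decreasing = TRUE)
```

A positive entry means the correlation dropped when that feature was removed; a negative entry means the model did slightly better without it.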

Usage

ggplot_heatmap(
  DTD.model,
  X.matrix = NA,
  test.data = NULL,
  estimate.c.type = "decide.on.model",
  title = "",
  feature.subset = 100,
  log2.expression = TRUE
)

Arguments

DTD.model

either a numeric vector with length of nrow(X), or a list returned by train_deconvolution_model, DTD_cv_lambda_cxx, or descent_generalized_fista. In the equation above the DTD.model provides the vector g.

X.matrix

numeric matrix, with features/genes as rows and cell types as columns. Each column of X.matrix is a reference expression profile. A trained DTD model includes the X.matrix it has been trained on. Therefore, X.matrix should only be set if 'DTD.model' is not a DTD model.

test.data

list with two entries, each a numeric matrix, named 'mixtures' and 'quantities'. For examples see mix_samples, mix_samples_with_jitter or the package vignette 'browseVignettes("DTD")'.

estimate.c.type

string, either "non_negative" or "direct" (the usage above shows the default "decide.on.model"). Indicates how the algorithm finds the solution of arg min_C ||diag(g)(Y - XC)||_2.

  • If 'estimate.c.type' is set to "direct", there is no regularization (see estimate_c).

  • If 'estimate.c.type' is set to "non_negative", the estimates "C" must not be negative (non-negative least squares; see estimate_nn_c).
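For the unregularized case, the objective above has a closed-form minimizer. The derivation below follows from the formula exactly as written on this page, assuming the weighted Gram matrix is invertible; it is not taken from the source of estimate_c, whose internals may differ. The symbol \Gamma is introduced here as shorthand for diag(g).

```latex
\begin{aligned}
\Gamma &= \operatorname{diag}(g), \\
L(C) &= \lVert \Gamma (Y - XC) \rVert_2^2, \\
\nabla_C L &= -2\, X^{\top} \Gamma^{2} (Y - XC) = 0, \\
\hat{C} &= \left( X^{\top} \Gamma^{2} X \right)^{-1} X^{\top} \Gamma^{2} Y .
\end{aligned}
```

The non-negative variant adds the constraint C >= 0 elementwise, which in general has no closed form and is solved iteratively.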

title

string, additional title for the plot

feature.subset

numeric or a vector of strings. If it is numeric, 'feature.subset' features will be picked from the 'explained correlation' ranking (if 'feature.subset' <= 1, it is the fraction of features; if 'feature.subset' > 1, it is the total number). If it is a vector of strings, these features will be used (if they intersect with rownames(X.matrix)).
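The three ways of reading 'feature.subset' can be sketched as a small base-R helper. The function resolve_feature_subset and its arguments are hypothetical, and rounding a fraction up with ceiling() is an assumption; the package may round differently.

```r
# Hypothetical helper illustrating the rules described above.
# ranked.features: feature names ordered by explained correlation, best first.
# all.features:    rownames of the reference matrix.
resolve_feature_subset <- function(feature.subset, ranked.features, all.features) {
  if (is.numeric(feature.subset)) {
    n <- if (feature.subset <= 1) {
      # Fraction of all ranked features (rounding up is an assumption).
      ceiling(feature.subset * length(ranked.features))
    } else {
      # Absolute number, capped at the number of available features.
      min(feature.subset, length(ranked.features))
    }
    return(head(ranked.features, n))
  }
  # Character input: keep only names that actually occur in the matrix.
  intersect(feature.subset, all.features)
}

ranked <- paste0("gene", 1:10)

by_fraction <- resolve_feature_subset(0.5, ranked, ranked)
by_count    <- resolve_feature_subset(3, ranked, ranked)
by_name     <- resolve_feature_subset(c("gene2", "unknown"), ranked, ranked)
```

Here by_fraction holds the top five features, by_count the top three, and by_name only "gene2", because "unknown" is not a row name.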

log2.expression

logical; should the values in the heatmap be log2 transformed?

Details

For an example, see the section "Explained correlation" in the package vignette 'browseVignettes("DTD")'.

Value

ggplot object


MarianSchoen/DTD documentation built on April 29, 2022, 1:59 p.m.