ggplot_heatmap: clustered heatmap of diag(g) * X
In MarianSchoen/DTD: Digital Tissue Deconvolution

ggplot_heatmap

R Documentation

clustered heatmap of diag(g) * X

Description

For a DTD.model the 'ggplot_heatmap' function visualizes

diag(g) * X

on a subset of features as a clustered heatmap.
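
As a minimal sketch of the quantity being plotted, the base R snippet below builds a toy reference matrix `X` and a weight vector `g` (both invented for illustration, not taken from the package). It reads diag(g) * X as a matrix product and shows that this simply scales each feature row of X by its weight.

```r
# Toy reference matrix: 4 features (rows) x 2 cell types (columns).
# All values are made up for illustration.
X <- matrix(
  c(1, 2,
    3, 4,
    5, 6,
    7, 8),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), c("typeA", "typeB"))
)

# One weight per feature, i.e. length(g) == nrow(X).
g <- c(0, 0.5, 1, 2)

# Matrix product with the diagonal matrix diag(g) ...
scaled_by_product <- diag(g) %*% X
dimnames(scaled_by_product) <- dimnames(X)

# ... gives the same result as multiplying row i of X by g[i].
scaled_by_row <- X * g
```

A feature with weight 0 (here `gene1`) contributes a row of zeros, and features with large weights dominate the scaled matrix.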
Features can be subset in one of two ways. Either pass a vector of strings that match the feature names of X,
or use 'explained correlation': to assess the importance of a feature in the deconvolution process, we can exclude the feature from a trained model and observe the change in correlation on a test set. If the correlation decreases by, e.g., 1%, the feature explains 1% of the correlation. The 'ggplot_heatmap' function iteratively excludes each feature from the trained model, resulting in a ranking of the features (genes).
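
The leave-one-feature-out idea can be sketched in base R as follows. This is not the package's implementation: the data are simulated, `mean_correlation` is a hypothetical helper, excluding a feature is assumed to mean setting its weight in g to 0, and the correlation is assumed to be the mean Pearson correlation per cell type between estimated and true quantities.

```r
set.seed(1)

# Simulated toy data: 6 features, 2 cell types, 20 test mixtures.
n_features <- 6
n_types <- 2
n_mix <- 20
X <- matrix(runif(n_features * n_types, 1, 10), nrow = n_features,
            dimnames = list(paste0("gene", 1:n_features), c("typeA", "typeB")))
C_true <- matrix(runif(n_types * n_mix), nrow = n_types)
Y <- X %*% C_true + matrix(rnorm(n_features * n_mix, sd = 0.5), nrow = n_features)

# Stand-in for a trained model: one weight per feature.
g <- rep(1, n_features)
names(g) <- rownames(X)

# Hypothetical helper: unconstrained least-squares estimate of C for
# arg min_C ||diag(g)(Y - XC)||_2, followed by the mean per-cell-type correlation.
mean_correlation <- function(g, X, Y, C_true) {
  C_hat <- qr.solve(g * X, g * Y)  # rows scaled by g; g[i] = 0 drops feature i
  mean(sapply(seq_len(nrow(C_true)), function(k) cor(C_hat[k, ], C_true[k, ])))
}

baseline <- mean_correlation(g, X, Y, C_true)

# Exclude one feature at a time and record how much correlation is lost.
explained <- sapply(names(g), function(feature) {
  g_without <- g
  g_without[feature] <- 0
  baseline - mean_correlation(g_without, X, Y, C_true)
})

# Features whose removal costs the most correlation rank first.
ranking <- names(sort(explained, decreasing = TRUE))
```

With real data, `g`, `X` and the test mixtures would come from the trained DTD.model and `test.data` rather than from a simulation.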

Usage

ggplot_heatmap(
  DTD.model,
  X.matrix = NA,
  test.data = NULL,
  estimate.c.type = "decide.on.model",
  title = "",
  feature.subset = 100,
  log2.expression = TRUE
)

Arguments

`DTD.model`	either a numeric vector of length nrow(X), or a list returned by `train_deconvolution_model`, `DTD_cv_lambda_cxx`, or `descent_generalized_fista`. In the equation above, the DTD.model provides the vector g.
`X.matrix`	numeric matrix, with features/genes as rows and cell types as columns. Each column of X.matrix is a reference expression profile. A trained DTD model includes the X.matrix it was trained on. Therefore, X.matrix should only be set if 'DTD.model' is not a DTD model.
`test.data`	list with two entries named 'mixtures' and 'quantities', each a numeric matrix. For examples see `mix_samples`, `mix_samples_with_jitter` or the package vignette 'browseVignettes("DTD")'.
`estimate.c.type`	string, either "non_negative" or "direct" (the default shown in Usage is "decide.on.model"). Indicates how the algorithm finds the solution of arg min_C ||diag(g)(Y - XC)||_2. If 'estimate.c.type' is set to "direct", there is no regularization (see `estimate_c`); if it is set to "non_negative", the estimates "C" must not be negative (non-negative least squares, see `estimate_nn_c`).
`title`	string, additional title for the plot
`feature.subset`	numeric or a vector of strings. If it is numeric, 'feature.subset' features are picked from the 'explained correlation' ranking (if 'feature.subset' <= 1, it is the fraction of features; if 'feature.subset' > 1, it is the total number). If it is a vector of strings, these features are used (if they intersect with rownames(X.matrix)).
`log2.expression`	logical; should the values in the heatmap be log2-transformed?
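
To illustrate how a 'feature.subset' value could be interpreted, here is a small base R sketch. `resolve_feature_subset` is a hypothetical helper, not a DTD function. It assumes that the ranking is a character vector with the most important feature first, that features are taken from the top of that ranking, and that fractions are rounded up; the package may handle these details differently.

```r
# Hypothetical helper, not part of DTD: turn a feature.subset value into feature names.
# `ranking` is assumed to be a character vector of feature names, most important first.
resolve_feature_subset <- function(feature.subset, ranking, feature.names) {
  if (is.numeric(feature.subset)) {
    n <- if (feature.subset <= 1) {
      ceiling(feature.subset * length(ranking))  # fraction of all features (rounding is an assumption)
    } else {
      min(feature.subset, length(ranking))       # absolute number of features
    }
    return(head(ranking, n))
  }
  # Character input: keep only names that occur in rownames(X.matrix).
  intersect(feature.subset, feature.names)
}

features <- paste0("gene", 1:10)

# Fraction: the top half of the ranking.
top_half <- resolve_feature_subset(0.5, ranking = features, feature.names = features)

# Count: the top four features.
top_four <- resolve_feature_subset(4, ranking = features, feature.names = features)

# Names: "geneX" is dropped because it is not a known feature.
by_name <- resolve_feature_subset(c("gene2", "geneX"), ranking = features,
                                  feature.names = features)
```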