proteinatlas_heatmap: proteinatlas.org expression heatmap

Description

proteinatlas.org expression heatmap

Usage

proteinatlas_heatmap(
  expr = proteinatlas_expr_fdb11,
  genes = NULL,
  samples = NULL,
  type = c("all", "Tissue", "Cell", "Blood", "Brain"),
  row_cex = 1,
  column_cex = 1,
  ramp = "Reds",
  lens = 0,
  color_ceiling = NULL,
  cluster_columns = FALSE,
  cluster_rows = FALSE,
  column_split = NULL,
  row_split = NULL,
  centered = FALSE,
  row_filter = 0,
  column_filter = 0,
  controlSamples = NULL,
  gene_names = FALSE,
  gene_im = NULL,
  gene_im_colors = NULL,
  left_annotation = NULL,
  fill_missing = TRUE,
  border = TRUE,
  useCenterGroups = TRUE,
  trim_columns = FALSE,
  rowStatsFunc = matrixStats::rowMins,
  return_type = c("heatmap", "list"),
  verbose = FALSE,
  ...
)

Arguments

expr

numeric matrix containing gene rows, and biological sample columns.

genes

character vector of genes to display. These values are matched case-insensitively to the start of values in rownames(expr), to facilitate pattern matching. Specifically, the input is passed to jamba::provigrep(c(genes), rownames(expr)), which iteratively matches each entry in genes using case-insensitive grep().
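The prefix matching described above can be approximated in base R. This is a minimal sketch, not the jamba::provigrep() implementation: the helper name match_gene_prefix and the toy rownames are hypothetical, and it assumes matches are returned in the order of the input patterns. Per the fill_missing description, pattern matching is skipped when genes are used as-is.

```r
# Toy rownames standing in for rownames(expr); these values are hypothetical.
gene_rows <- c("STAT1", "STAT3", "STAT5A", "DKK1", "Dkk4", "MET", "METTL3")

# Hypothetical helper: match each pattern to the start of each name,
# case-insensitively, keeping results in the order of the input patterns.
match_gene_prefix <- function(patterns, x) {
  hits <- lapply(patterns, function(p) {
    grep(paste0("^", p), x, ignore.case = TRUE, value = TRUE)
  })
  unique(unlist(hits))
}

matched <- match_gene_prefix(c("dkk", "stat1", "met"), gene_rows)
print(matched)
```

Note that a short pattern such as "met" also matches "METTL3", so supplying full gene symbols does not guarantee an exact match under prefix matching.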

type

character vector of one or more expression types to include, where "all" includes all columns in expr.

row_cex, column_cex

numeric value used to adjust row and column text size, where 1 is the default size. Text size is auto-scaled based upon the number of rows and columns being displayed; these values adjust the auto-scaled font size.

ramp

character name of a color ramp, or a character vector of R colors, used to create a color ramp.

lens

numeric value used to adjust the intensity of the color ramp, where values > 0 make colors change earlier, and values < 0 make colors change later.

color_ceiling

numeric value indicating the maximum used for the color gradient, in the units displayed on the heatmap (exponentiated). If NULL, the maximum value is used.

cluster_columns, cluster_rows

logical indicating whether to cluster columns and rows, respectively.

centered

logical indicating whether data should be centered by the median expression across all samples. Note that median expression in many cases is zero, for genes not widely expressed across all samples being displayed.

column_filter, row_filter

numeric threshold used to hide columns or rows whose maximum expression is below this value. The threshold is compared with the expression value shown on the heatmap, that is, the normal expression value, not the log2-transformed value.
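As a rough illustration of this filtering, the sketch below applies both thresholds to a toy matrix. It assumes the data are stored as log2(1 + x), which this page does not state, and it evaluates both filters on the full matrix; the function itself may apply them in a different order. All names and values are hypothetical.

```r
# Toy log2(1 + x) matrix; values and names are hypothetical.
expr_log2 <- matrix(
  log2(1 + c(0, 1, 3,
             7, 15, 31,
             0, 0, 1)),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("GENE_A", "GENE_B", "GENE_C"),
                  c("s1", "s2", "s3")))

# Convert back to the display scale before comparing with the thresholds
# (assumes a log2(1 + x) transform).
expr_display <- 2^expr_log2 - 1

row_filter <- 2
column_filter <- 10

# Keep rows and columns whose maximum is at or above the threshold.
keep_rows <- apply(expr_display, 1, max) >= row_filter
keep_cols <- apply(expr_display, 2, max) >= column_filter
expr_kept <- expr_log2[keep_rows, keep_cols, drop = FALSE]
print(dimnames(expr_kept))
```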

controlSamples

character vector, or NULL, indicating specific samples to use when centering data by row. When controlSamples=NULL, the default is to use all Tissue samples.

gene_names

logical indicating whether to display full gene names, provided using genejam::freshenGenes().

gene_im

matrix intended to be an incidence matrix, whose values are 0 and 1, with gene symbols as rownames, and columns indicating different annotations. When a rowname in the heatmap is not found in the incidence matrix, 0 is substituted to fill the empty space. See the examples for how to use the proteinatlas_genesets_fdb11 data.
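To make the expected shape of gene_im concrete, here is a base R sketch that builds a 0/1 incidence matrix from a named list and aligns it to a set of heatmap rows, filling absent genes with 0. The gene sets and set names are hypothetical stand-ins, and this is not how list2im_opt() or the function itself is implemented.

```r
# Hypothetical gene sets standing in for proteinatlas_genesets_fdb11.
gene_sets <- list(
  secreted = c("DKK1", "CXCL12"),
  membrane = c("MET", "IL6R"),
  TFs = c("STAT1", "STAT3"))

# Build a 0/1 incidence matrix: one row per gene, one column per set.
all_genes <- sort(unique(unlist(gene_sets)))
gene_im <- sapply(gene_sets, function(s) as.integer(all_genes %in% s))
rownames(gene_im) <- all_genes

# Align to heatmap rows, filling genes absent from the matrix with 0.
heatmap_genes <- c("DKK1", "MET", "FTL")
im_aligned <- matrix(0L,
  nrow = length(heatmap_genes), ncol = ncol(gene_im),
  dimnames = list(heatmap_genes, colnames(gene_im)))
found <- intersect(heatmap_genes, rownames(gene_im))
im_aligned[found, ] <- gene_im[found, , drop = FALSE]
print(im_aligned)
```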

gene_im_colors

function or vector that contains data to define colors for gene_im, using methods sufficient for ComplexHeatmap::HeatmapAnnotation(). When gene_im_colors is NULL a default function is used which is intended only for c(0, 1).

left_annotation

optional heatmap annotation as produced by ComplexHeatmap::HeatmapAnnotation(), as an alternative to supplying gene_im. Note that when gene_im is supplied, it replaces left_annotation. This option is available only to allow supplying a customized side annotation.

fill_missing

logical indicating whether the input genes should be used as-is with no pattern matching, substituting 0 for any missing entries. This argument is useful when aligning this heatmap with another existing Heatmap object, where the rows must be exactly aligned.

border

logical indicating whether to draw a heatmap border.

useCenterGroups

logical used when centered=TRUE and controlSamples are provided, indicating whether to center each sample type independently, instead of centering all samples versus the "Tissue" type. We decided to center all samples versus "Tissue" by default, because some immune-specific genes are highly expressed in all "Blood" samples but not in "Tissue", and centering within each type was visually confusing.

trim_columns

logical indicating whether to trim the colnames to remove the sample type suffix.

rowStatsFunc

function used to define the per-row value used when centered=TRUE, passed to jamma::centerGeneData(). The default is matrixStats::rowMins(), which returns the minimum expression value per row.

return_type

character string indicating the type of return object: "heatmap" returns the Heatmap object, which can be plotted using ComplexHeatmap::draw(); "list" returns a list of the data components used to produce the heatmap, for reviewing more details. The list output also includes the data actually used in the heatmap, after expression centering, expression filtering, and sample subsetting, as relevant.

verbose

logical indicating whether to print verbose output.

...

additional arguments are passed to ComplexHeatmap::Heatmap() for other customizations.

Details

This function takes proteinatlas expression data expr, and creates a heatmap of expression using a subset of genes provided as genes.

By default, columns in expr are split by type, where colnames(expr) are expected to have suffix " - Type" at the end of each column name. If the columns cannot be split accordingly, then all columns are assigned one split name "Expression".
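The suffix convention can be illustrated with a small base R sketch. The helper split_by_type and the sample names are hypothetical, and the fallback rule used here (any unparseable name sends all columns to "Expression") is an assumption about the behavior described above, not a copy of the function's code.

```r
# Hypothetical helper: derive a split label from a " - Type" suffix.
split_by_type <- function(sample_names) {
  has_type <- grepl(" - [^ ]+$", sample_names)
  # Assumed fallback: if any name lacks the suffix, use one split for all.
  if (!all(has_type)) {
    return(rep("Expression", length(sample_names)))
  }
  # Remove everything up to and including the last " - ".
  sub("^.* - ", "", sample_names)
}

typed <- split_by_type(c("liver - Tissue", "HeLa - Cell", "NK-cell - Blood"))
untyped <- split_by_type(c("liver", "HeLa"))
print(typed)
print(untyped)
```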

To customize data for individual samples, the expr data should be filtered before calling this function.

By default, the row-centering method centers each row by the row minimum expression using only "Tissue" samples, so that expression will be displayed relative to the lowest tissue expression. An optional set of control samples can be provided with argument controlSamples.
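The default centering can be sketched in base R as follows. This is an illustration under stated assumptions, not the jamma::centerGeneData() implementation: it uses apply(..., 1, min) in place of matrixStats::rowMins() to avoid a package dependency, and the matrix values and column names are hypothetical.

```r
# Toy expression matrix with " - Type" column suffixes (hypothetical values).
expr_toy <- matrix(
  c(2, 4, 6, 1,
    5, 3, 9, 8),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    c("GENE_A", "GENE_B"),
    c("liver - Tissue", "lung - Tissue",
      "NK-cell - Blood", "T-cell - Blood")))

# Use only "Tissue" columns as the control samples.
control_cols <- grepl(" - Tissue$", colnames(expr_toy))

# Per-row reference value: the minimum across control samples.
row_ref <- apply(expr_toy[, control_cols, drop = FALSE], 1, min)

# Subtract each row's reference value from every column in that row.
expr_centered <- sweep(expr_toy, 1, row_ref, FUN = "-")
print(expr_centered)
```

In this toy case a "Blood" sample can fall below zero after centering, because the reference value comes only from "Tissue" columns.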

Value

By default, when return_type="heatmap", a Heatmap produced by ComplexHeatmap::Heatmap(). When return_type="list", a list of the components used in the heatmap; perhaps the most important is the actual expression data matrix after expression centering, expression filtering, and sample subsetting, as relevant. The list also includes the Heatmap under element "hm", so it can be plotted using ComplexHeatmap::draw().

Examples

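# heatmap of "Blood" samples without centering, clustering rows and columns,
# and hiding rows whose maximum expression is below 2
test_genes <- c("DKK1","DKK4","CXCL12","IL6R","MET",
   "HK2","FTL","FTH1","STAT1","STAT3","CDKN1B");
proteinatlas_heatmap(genes=test_genes,
   type="Blood",
   centered=FALSE,
   cluster_rows=TRUE,
   cluster_columns=TRUE,
   row_filter=2)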

# use proteinatlas_genesets_fdb11
use_im <- c("secreted_proteins",
   "membrane_proteins",
   "NOT_membrane_secreted",
   "TFs");
proteinatlas_im <- list2im_opt(proteinatlas_genesets_fdb11[use_im]);
test_genes <- c("DKK1","DKK4","CXCL12","IL6R","MET",
   "HK2","FTL","FTH1","STAT1","STAT3","CDKN1B");
proteinatlas_heatmap(genes=test_genes,
   type="Blood",
   centered=TRUE,
   gene_im=proteinatlas_im);

jmw86069/pajam documentation built on Feb. 6, 2022, 1:30 p.m.