In topr: Create Custom Plots for Viewing Genetic Association Results

foresttopr

R Documentation

Create a forest plot from one or more association result tables

Description

foresttopr() creates a forest plot visualizing effect estimates and confidence intervals across one or more datasets. The function supports odds ratios (OR) and regression coefficients (beta), allows matching rows across datasets by a key column, and optionally displays human-readable labels from a separate annotation column.

Effect estimates are automatically standardized to a common scale across input datasets. Confidence intervals are taken from explicit bounds when available; otherwise they are derived from standard errors or, failing that, from p-values.
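The fallback order described above can be sketched in base R. This is a minimal illustration under a normal approximation, not the package's internal code: the helper name `derive_ci` and its arguments are hypothetical, and the p-value is assumed to be two-sided.

```r
# Hypothetical helper, for illustration only (not part of topr).
# Derives a confidence interval on the beta (log-odds) scale from a
# standard error, or from a two-sided p-value when no SE is given.
derive_ci <- function(estimate, se = NULL, p = NULL, level = 0.95) {
  z_crit <- qnorm((1 - level) / 2, lower.tail = FALSE)
  if (is.null(se)) {
    stopifnot(!is.null(p))
    # |z| implied by the p-value; SE follows from |estimate| / |z|.
    # Not defined for p = 1; an estimate of exactly 0 gives a zero-width interval.
    se <- abs(estimate) / qnorm(p / 2, lower.tail = FALSE)
  }
  data.frame(
    estimate = estimate,
    lower = estimate - z_crit * se,
    upper = estimate + z_crit * se
  )
}

# Interval on the beta scale, from a standard error
ci_beta <- derive_ci(estimate = log(1.5), se = 0.1)

# Converting to the odds-ratio scale is an exponentiation of all three values
ci_or <- exp(ci_beta)
```

Intervals computed this way are symmetric on the beta scale and asymmetric on the OR scale, which is why forest plots of odds ratios are often drawn on a logarithmic x-axis.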

Usage

foresttopr(
  dat = NULL,
  legend_labels = NULL,
  colors = NULL,
  key_col = "ID",
  label_col = NULL,
  effect_type = c("OR", "beta"),
  xlim = NULL,
  xbreaks = NULL,
  xlabel = NULL,
  size = 2.5,
  shape = 16,
  alpha = 1,
  points_dist = 0.6,
  band_color = "grey96",
  band_border_color = "grey96",
  band_border_linewidth = 0.01,
  sign_thresh = NULL,
  ylabel_order = NULL,
  scale = 1,
  title = NULL,
  title_text_size = 15,
  axis_text_size = 12,
  axis_title_size = 14,
  show_shape_legend = TRUE,
  show_color_legend = TRUE,
  legend_position = "right",
  legend_nrow = NULL,
  legend_name = NULL,
  legend_title_size = axis_text_size * 0.95,
  legend_text_size = axis_text_size * 0.85,
  match_on_gene = FALSE
)

Arguments

`dat`	A data frame or a list of data frames containing association results. Each data frame must contain an effect estimate column (e.g. OR or BETA) and a p-value column. If a single data frame is provided, it is internally wrapped into a list.
`legend_labels`	A character vector of labels corresponding to each dataset in `dat`. These labels are used in the plot legend. Defaults to `"Set1"`, `"Set2"`, etc.
`colors`	A character vector of colors to use for each dataset. If `NULL`, a default color palette is used.
`key_col`	Character scalar giving the column name used to match rows across datasets (e.g. gene identifier or variant ID). Defaults to `"ID"`.
`label_col`	Optional character scalar giving the column name in the reference dataset (the first element of `dat`) to use for labeling rows on the y-axis. If `NULL`, `key_col` is used for labeling.
`effect_type`	Character scalar specifying the effect scale to plot. Either `"OR"` (odds ratio; default) or `"beta"` (regression coefficient). Matching is case-insensitive. When required, effect estimates are automatically converted between scales.
`xlim`	Numeric length-2 vector giving x-axis limits. If `NULL`, limits are computed automatically from the data.
`xbreaks`	Numeric vector or function specifying x-axis breaks. If `NULL`, reasonable defaults are chosen based on `effect_type`.
`xlabel`	Character scalar giving the x-axis label. If `NULL`, a default label is chosen based on `effect_type`.
`size`	Numeric scalar or vector controlling point sizes for each dataset.
`shape`	Integer scalar or vector specifying point shapes for each dataset.
`alpha`	Numeric scalar or vector specifying point transparency.
`points_dist`	Numeric scalar controlling horizontal separation of points from different datasets within the same row.
`band_color`	Background color for alternating row bands.
`band_border_color`	Color for row band borders.
`band_border_linewidth`	Numeric scalar giving the line width for row band borders.
`sign_thresh`	Optional numeric scalar specifying a p-value threshold for highlighting statistically significant points via shape encoding.
`ylabel_order`	Optional character vector specifying the order of rows on the y-axis. If `NULL`, rows are ordered as they appear in the reference dataset.
`scale`	Numeric scalar used to globally scale text and point sizes.
`title`	Optional character scalar giving the plot title.
`title_text_size`	Numeric scalar controlling title text size.
`axis_text_size`	Numeric scalar controlling axis text size.
`axis_title_size`	Numeric scalar controlling axis title text size.
`show_shape_legend`	Logical; whether to display the shape legend.
`show_color_legend`	Logical; whether to display the color legend.
`legend_position`	Character string specifying legend position. One of `"right"`, `"top"`, or `"bottom"`.
`legend_nrow`	Optional integer specifying the number of rows in the legend.
`legend_name`	Optional character scalar giving the legend title.
`legend_title_size`	Numeric scalar controlling legend title text size.
`legend_text_size`	Numeric scalar controlling legend text size.
`match_on_gene`	Logical; if `FALSE` and variant-level columns (e.g. REF/ALT) are detected, matching is performed at the variant level. Otherwise, matching is performed using `key_col`.
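
To picture how `key_col`, `label_col` and the reference dataset relate, the following base R sketch may help. It assumes matching behaves like a left join onto the first dataset, which is an assumption made for illustration rather than a description of the package internals. The data frames and their values are invented.

```r
# Toy data, invented for illustration only.
ref <- data.frame(
  ID = c("rs1", "rs2", "rs3"),
  Gene_Symbol = c("GENE_A", "GENE_B", "GENE_C"),
  OR = c(1.20, 0.85, 1.05),
  stringsAsFactors = FALSE
)
other <- data.frame(
  ID = c("rs3", "rs1", "rs9"),
  BETA = c(0.04, 0.15, -0.30),
  stringsAsFactors = FALSE
)

key_col <- "ID"
label_col <- "Gene_Symbol"

# Keep every row of the reference dataset; keys absent from `other` yield NA.
matched <- merge(ref, other, by = key_col, all.x = TRUE)

# Restore the row order of the reference dataset.
matched <- matched[match(ref[[key_col]], matched[[key_col]]), ]

# Row labels come from label_col when given, otherwise from key_col.
y_labels <- if (is.null(label_col)) matched[[key_col]] else matched[[label_col]]
```

In this sketch `rs9` is dropped because it does not occur in the reference dataset, and `rs2` keeps its row with a missing `BETA` value.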

Value

A ggplot2 object representing the forest plot.

Examples

foresttopr(
  dat = list(
    CD_UKBB |>
      dplyr::arrange(P) |>
      head(n = 10) |>
      annotate_with_nearest_gene(),
    CD_FINNGEN
  ),
  key_col = "ID",
  label_col = "Gene_Symbol",
  legend_labels = c("CD_UKBB", "CD_FINNGEN"),
  effect_type = "beta"
)

topr documentation built on April 13, 2026, 5:07 p.m.