| glmforest | R Documentation |
Generates a publication-ready forest plot that combines a formatted data table with a graphical representation of effect estimates (odds ratios, risk ratios, or coefficients) from a generalized linear model. The plot integrates variable names, group levels, sample sizes, effect estimates with confidence intervals, p-values, and model diagnostics in a single comprehensive visualization designed for manuscripts and presentations.
glmforest(
x,
data = NULL,
title = "Generalized Linear Model",
effect_label = NULL,
digits = 2,
p_digits = 3,
conf_level = 0.95,
font_size = 1,
annot_size = 3.88,
header_size = 5.82,
title_size = 23.28,
plot_width = NULL,
plot_height = NULL,
table_width = 0.6,
show_n = TRUE,
show_events = TRUE,
indent_groups = FALSE,
condense_table = FALSE,
bold_variables = FALSE,
center_padding = 4,
zebra_stripes = TRUE,
ref_label = "reference",
labels = NULL,
color = NULL,
exponentiate = NULL,
qc_footer = TRUE,
units = "in",
number_format = NULL
)
x |
Either a fitted GLM object (class |
data |
Data frame or data.table containing the original data used to
fit the model. If |
title |
Character string specifying the plot title displayed at the top.
Default is |
effect_label |
Character string for the effect measure label on the
forest plot axis. If |
digits |
Integer specifying the number of decimal places for effect estimates and confidence intervals in the data table. Default is 2. |
p_digits |
Integer specifying the number of decimal places for
p-values. Values smaller than |
conf_level |
Numeric confidence level for confidence intervals. Must be
between 0 and 1. Default is 0.95 (95% confidence intervals). The CI
percentage is automatically displayed in column headers (e.g., "90% CI"
when |
font_size |
Numeric multiplier controlling the base font size for all text elements. Values > 1 increase all fonts proportionally, values < 1 decrease them. Default is 1.0. Useful for adjusting readability across different output sizes. |
annot_size |
Numeric value controlling the relative font size for
data annotations (variable names, values in table cells). Default is 3.88.
Adjust relative to |
header_size |
Numeric value controlling the relative font size for column headers ("Variable", "Group", "n", etc.). Default is 5.82. Headers are typically larger than annotations for hierarchy. |
title_size |
Numeric value controlling the relative font size for the main plot title. Default is 23.28. The title is typically the largest text element. |
plot_width |
Numeric value specifying the intended output width in
specified |
plot_height |
Numeric value specifying the intended output height in
specified |
table_width |
Numeric value between 0 and 1 specifying the proportion of
total plot width allocated to the data table (left side). The forest plot
occupies |
show_n |
Logical. If |
show_events |
Logical. If |
indent_groups |
Logical. If |
condense_table |
Logical. If |
bold_variables |
Logical. If |
center_padding |
Numeric value specifying the horizontal spacing (in character units) between the data table and forest plot. Increase for more separation, decrease to fit more content. Default is 4. |
zebra_stripes |
Logical. If |
ref_label |
Character string to display for reference categories of
factor variables. Typically shown in place of effect estimates.
Default is |
labels |
Named character vector or list providing custom display
labels for variables. Names should match variable names in the model,
values are the labels to display. Example:
|
color |
Character string specifying the color for effect estimate point
markers in the forest plot. Use hex codes or R color names. Default is
Gaussian with log link), and |
exponentiate |
Logical. If |
qc_footer |
Logical. If |
units |
Character string specifying the units for plot dimensions.
Options: |
number_format |
Character string or two-element character vector controlling thousand and decimal separators in formatted output. Named presets:
Or provide a custom two-element vector When
options(summata.number_format = "eu")
|
Plot Components:
The forest plot consists of several integrated components:
Title: Centered at top, describes the analysis
Data Table (left side): Contains columns for:
Variable: Predictor names (or custom labels)
Group: Factor levels (optional, hidden when indenting)
n: Sample sizes by group (optional)
Events: Event counts by group (optional)
Effect (95% CI); p-value: Formatted estimates with p-values
Forest Plot (right side): Graphical display with:
Point estimates (squares sized by sample size)
95% confidence intervals (error bars)
Reference line (at OR/RR = 1 or coefficient = 0)
Log scale for odds/risk ratios
Labeled axis
Model Statistics (footer): Summary of:
Observations analyzed (with percentage of total data)
Model family (Binomial, Poisson, etc.)
Deviance statistics
Pseudo-R^2 (McFadden)
AIC
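The values in the "Effect (95% CI)" column can be approximated with base R. The sketch below is not the package's implementation: it assumes Wald intervals via `confint.default()` (the source does not say which interval method is used) and uses the built-in `mtcars` data for illustration. The names `tab` and `header` are illustrative only.

```r
# Illustrative sketch only: reproduce an "OR (CI)" table column with base R.
# Assumes Wald intervals; the package's interval method is not documented here.
fit <- glm(am ~ wt, data = mtcars, family = binomial)
conf_level <- 0.90
digits <- 2

# Exponentiate log-odds coefficients and their interval limits.
est <- exp(coef(fit))
ci <- exp(confint.default(fit, level = conf_level))

fmt <- paste0("%.", digits, "f")
tab <- data.frame(
  term = names(est),
  estimate = unname(est),
  lower = unname(ci[, 1]),
  upper = unname(ci[, 2])
)
tab$label <- paste0(
  sprintf(fmt, tab$estimate), " (",
  sprintf(fmt, tab$lower), "-", sprintf(fmt, tab$upper), ")"
)

# Column header reflecting the chosen confidence level.
header <- sprintf("OR (%g%% CI)", 100 * conf_level)
print(header)
print(tab)
```

Changing `conf_level` changes both the interval limits and the header text, which mirrors how the CI percentage is described as appearing in the column headers.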
Automatic Effect Measure Selection:
When effect_label = NULL and exponentiate = NULL, the function
selects the effect measure from the model family and link:
Logistic regression (family = binomial(link = "logit")):
Odds Ratios (OR)
Log-link models (link = "log"): Risk Ratios (RR)
or Rate Ratios
Other exponential families: exp(coefficient)
Identity link: Raw coefficients
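The selection rules above can be mimicked with base R by inspecting the family and link of a fitted model. The helper `guess_effect_label()` below is hypothetical and not part of the package. It is a minimal sketch of the mapping, using the built-in `mtcars` data.

```r
# Hypothetical helper (not part of the package): map a fitted GLM's
# family and link to the effect-measure labels listed above.
guess_effect_label <- function(model) {
  fam <- family(model)
  if (fam$family == "binomial" && fam$link == "logit") {
    "OR"
  } else if (fam$link == "log") {
    "RR"
  } else if (fam$link == "identity") {
    "Coefficient"
  } else {
    "exp(coefficient)"
  }
}

logit_fit <- glm(am ~ wt, data = mtcars, family = binomial)
pois_fit <- glm(carb ~ wt, data = mtcars, family = poisson)
gauss_fit <- glm(mpg ~ wt, data = mtcars, family = gaussian)

labels_out <- c(
  logit = guess_effect_label(logit_fit),
  poisson = guess_effect_label(pois_fit),
  gaussian = guess_effect_label(gauss_fit)
)
print(labels_out)
```

Setting effect_label or exponentiate explicitly is described as overriding this automatic choice, so a sketch like this is only relevant when both are left as NULL.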
Reference Categories:
For factor variables, the first level (determined by factor ordering or alphabetically for character variables) serves as the reference category:
Displayed with the ref_label instead of an estimate
No confidence interval or p-value shown
Visually aligned with other categories
When condense_table = TRUE, reference-only variables may be
omitted entirely
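Because the reference category is the first factor level, it can be changed before fitting with base R's `relevel()`. The sketch below uses the built-in `mtcars` data rather than the package's example data, and the variable names are illustrative.

```r
# Base R: glm() treats the first factor level as the reference category.
df <- mtcars
df$cyl <- factor(df$cyl)          # levels in default order: "4", "6", "8"
print(levels(df$cyl)[1])          # current reference level

# Make "8" the reference level before fitting the model.
df$cyl <- relevel(df$cyl, ref = "8")
fit <- glm(am ~ cyl, data = df, family = binomial)

# Coefficients exist only for the non-reference levels.
print(names(coef(fit)))
```

A model refitted this way would then be expected to show the new first level with the ref_label in place of an estimate.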
Layout Optimization:
The function automatically optimizes layout based on content:
Calculates appropriate axis ranges to accommodate all confidence intervals
Selects meaningful tick marks on log or linear scales
Sizes point markers proportional to sample size (larger = more data)
Adjusts table width based on variable name lengths when table_width = NULL
Recommends overall dimensions based on number of rows
Visual Grouping Options:
Three display modes are available:
Standard (indent_groups = FALSE,
condense_table = FALSE):
Separate "Variable" and "Group" columns, all categories shown
Indented (indent_groups = TRUE,
condense_table = FALSE):
Hierarchical display with groups indented under variables
Condensed (condense_table = TRUE):
Binary variables shown in single rows, automatically indented
Zebra Striping:
When zebra_stripes = TRUE, alternating variables (not individual rows)
receive light gray backgrounds. This visually groups all levels of a
factor variable together, which makes the plot easier to read, especially
when there are many multi-level factors.
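Striping by variable rather than by row can be illustrated in a few lines of base R. This is a sketch of the idea only, not the package's code. The vector `variable` is a made-up table column in which each row carries the name of the variable it belongs to.

```r
# Illustrative sketch: assign one stripe flag per variable, not per row.
variable <- c("age", "sex", "sex", "treatment", "treatment", "treatment")

# Number the variables in order of first appearance, then shade every second one.
variable_index <- match(variable, unique(variable))
stripe <- variable_index %% 2 == 0

print(data.frame(variable, stripe))
```

All rows of a factor share the same flag, so a three-level factor is shaded as one block instead of alternating within itself.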
Model Statistics Display:
The footer shows key diagnostic information:
Observations analyzed: Total N and percentage of original data (accounting for missing values)
Null/Residual Deviance: Model fit improvement
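Pseudo-R^2: McFadden R^2 = 1 - (log L_model / log L_null), where L_model is the likelihood of the fitted model and L_null is the likelihood of the intercept-only model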
AIC: For model comparison (lower is better)
For logistic regression, concordance (C-statistic/AUC) may also be displayed if available.
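The footer statistics can be cross-checked by hand with base R. The sketch below computes McFadden's pseudo-R^2 from the log-likelihoods of a fitted model and an intercept-only model, using the built-in `mtcars` data. It is meant as a verification aid under those assumptions, not as the package's implementation.

```r
# Cross-check of footer statistics with base R (illustrative data).
fit <- glm(am ~ wt, data = mtcars, family = binomial)
null_fit <- update(fit, . ~ 1)    # intercept-only model on the same data

ll_model <- as.numeric(logLik(fit))
ll_null <- as.numeric(logLik(null_fit))

# McFadden pseudo-R^2: 1 - (log-likelihood of model / log-likelihood of null).
mcfadden <- 1 - ll_model / ll_null

stats <- c(
  null_deviance = fit$null.deviance,
  residual_deviance = fit$deviance,
  mcfadden_r2 = mcfadden,
  aic = AIC(fit)
)
print(round(stats, 3))
```

For a binary 0/1 outcome the deviance equals minus twice the log-likelihood, so the same value can also be obtained as 1 - (residual deviance / null deviance).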
Saving Plots:
Use ggplot2::ggsave() with recommended dimensions:
p <- glmforest(model, data)
dims <- attr(p, "rec_dims")
ggplot2::ggsave("forest.pdf", p, width = dims$width, height = dims$height)
Or specify custom dimensions:
ggplot2::ggsave("forest.png", p, width = 12, height = 8, dpi = 300)
A ggplot object containing the complete forest plot. The plot
can be:
Displayed directly: print(plot)
Saved to file: ggsave("forest.pdf", plot, width = 12, height = 8)
Further customized with ggplot2 functions
The returned object includes an attribute "rec_dims"
accessible via attr(plot, "rec_dims"), which is a list
containing:
width: Numeric. Recommended plot width in the specified units
height: Numeric. Recommended plot height in the specified units
These recommendations are calculated automatically from the number of
variables, text sizes, and layout parameters, and are printed to the console
if plot_width or plot_height is not specified.
autoforest for automatic model detection,
coxforest for Cox proportional hazards forest plots,
lmforest for linear model forest plots,
uniforest for univariable screening forest plots,
multiforest for multi-outcome forest plots,
glm for fitting GLMs,
fit for regression modeling
Other visualization functions:
autoforest(),
coxforest(),
lmforest(),
multiforest(),
uniforest()
data(clintrial)
data(clintrial_labels)
# Create example model
model1 <- glm(os_status ~ age + sex + bmi + treatment,
data = clintrial, family = binomial)
# Example 1: Basic logistic regression forest plot
p <- glmforest(model1, data = clintrial)
old_width <- options(width = 180)
# Example 2: With custom variable labels
plot2 <- glmforest(
x = model1,
data = clintrial,
title = "Risk Factors for Mortality",
labels = clintrial_labels
)
# Example 3: Indented layout with formatting options
plot3 <- glmforest(
x = model1,
data = clintrial,
indent_groups = TRUE,
zebra_stripes = TRUE,
color = "#D62728",
labels = clintrial_labels
)
# Example 4: Condensed layout for many binary variables
model4 <- glm(os_status ~ age + sex + smoking + hypertension +
diabetes + surgery,
data = clintrial,
family = binomial)
plot4 <- glmforest(
x = model4,
data = clintrial,
condense_table = TRUE,
labels = clintrial_labels
)
# Binary variables shown in single rows
# Example 5: Poisson regression for count data
model5 <- glm(ae_count ~ age + treatment + diabetes + surgery,
data = clintrial,
family = poisson)
plot5 <- glmforest(
x = model5,
data = clintrial,
title = "Rate Ratios for Adverse Events",
labels = clintrial_labels
)
# Example 6: Save with recommended dimensions
dims <- attr(plot5, "rec_dims")
ggplot2::ggsave(file.path(tempdir(), "forest.pdf"),
plot5, width = dims$width, height = dims$height)
options(old_width)