| lmforest | R Documentation |
Generates a publication-ready forest plot that combines a formatted data table
with a graphical representation of regression coefficients from a linear model.
The plot integrates variable names, group levels, sample sizes, coefficients
with confidence intervals, p-values, and model diagnostics (R^2,
F-statistic, AIC) in a single comprehensive visualization designed for
manuscripts and presentations.
lmforest(
  x,
  data = NULL,
  title = "Linear Model",
  effect_label = "Coefficient",
  digits = 2,
  p_digits = 3,
  conf_level = 0.95,
  font_size = 1,
  annot_size = 3.88,
  header_size = 5.82,
  title_size = 23.28,
  plot_width = NULL,
  plot_height = NULL,
  table_width = 0.6,
  show_n = TRUE,
  indent_groups = FALSE,
  condense_table = FALSE,
  bold_variables = FALSE,
  center_padding = 4,
  zebra_stripes = TRUE,
  ref_label = "reference",
  labels = NULL,
  units = "in",
  color = "#5A8F5A",
  qc_footer = TRUE,
  number_format = NULL
)
x |
Either a fitted linear model object (class |
data |
Data frame or data.table containing the original data used to
fit the model. If |
title |
Character string specifying the plot title displayed at the top.
Default is "Linear Model". |
effect_label |
Character string for the effect measure label on the
forest plot axis. Default is "Coefficient". |
digits |
Integer specifying the number of decimal places for coefficients and confidence intervals. Default is 2. |
p_digits |
Integer specifying the number of decimal places for
p-values. Values smaller than |
conf_level |
Numeric confidence level for confidence intervals. Must be
between 0 and 1. Default is 0.95 (95% confidence intervals). The CI
percentage is automatically displayed in column headers (e.g., "90% CI"
when |
font_size |
Numeric multiplier controlling the base font size for all text elements. Default is 1.0. |
annot_size |
Numeric value controlling the relative font size for data annotations. Default is 3.88. |
header_size |
Numeric value controlling the relative font size for column headers. Default is 5.82. |
title_size |
Numeric value controlling the relative font size for the main plot title. Default is 23.28. |
plot_width |
Numeric value specifying the intended output width in
specified |
plot_height |
Numeric value specifying the intended output height in
specified |
table_width |
Numeric value between 0 and 1 specifying the proportion of total plot width allocated to the data table. Default is 0.6. |
show_n |
Logical. If |
indent_groups |
Logical. If |
condense_table |
Logical. If |
bold_variables |
Logical. If |
center_padding |
Numeric value specifying horizontal spacing between table and forest plot. Default is 4. |
zebra_stripes |
Logical. If |
ref_label |
Character string to display for reference categories of
factor variables. Default is "reference". |
labels |
Named character vector providing custom display labels for
variables. Example: |
units |
Character string specifying units for plot dimensions:
|
color |
Character string specifying the color for coefficient point
estimates in the forest plot. Default is "#5A8F5A". |
qc_footer |
Logical. If |
number_format |
Character string or two-element character vector controlling thousand and decimal separators in formatted output. Named presets:
Or provide a custom two-element vector When
options(summata.number_format = "eu")
|
Linear Model-Specific Features:
The linear model forest plot differs from logistic and Cox plots in several ways:
Coefficients: Raw regression coefficients shown (not exponentiated)
Reference line: At coefficient = 0 (not at 1)
Linear scale: Forest plot uses linear scale (not log scale)
No events column: Only sample sizes shown (no event counts)
R^2 statistics: Model fit assessed by R^2 and adjusted R^2
F-test: Overall model significance from F-statistic
Plot Components:
Title: Centered at top
Data Table (left): Contains:
Variable: Predictor names
Group: Factor levels (if applicable)
n: Sample sizes by group
Coefficient (95% CI); p-value: Raw coefficients with CIs and p-values
Forest Plot (right):
Point estimates (squares sized by sample size)
95% confidence intervals (error bars)
Reference line at coefficient = 0
Linear scale
Model Statistics (footer):
Observations analyzed (with percentage of total data)
R^2 and adjusted R^2
F-statistic with degrees of freedom and p-value
AIC
Interpreting Coefficients:
Linear regression coefficients represent the expected change in the outcome variable for a one-unit change in the predictor, holding the other predictors in the model constant:
Continuous predictors: Coefficient = change in Y per unit of X
Binary predictors: Coefficient = difference in Y between groups
Factor predictors: Coefficients = differences from reference category
Sign matters: Positive = increase in Y, Negative = decrease in Y
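Zero crossing: A CI that includes zero means the coefficient is not statistically significant at the chosen confidence level
Example: If the coefficient for "age" is 0.50 when predicting BMI,
predicted BMI increases by 0.50 kg/m^2 for each additional year of age,
holding the other predictors constant.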
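The quantities shown in the coefficient column can be reproduced with base R. The sketch below is illustrative only: it uses the built-in mtcars dataset as a stand-in (an assumption, not the clintrial data from the examples) and standard stats functions, and it is not a description of how lmforest builds its table internally.

```r
# Illustrative only: mtcars stands in for the user's own data.
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)

# Point estimates and 95% confidence intervals, one row per coefficient
est <- coef(fit)
ci  <- confint(fit, level = 0.95)

coef_table <- data.frame(
  term      = names(est),
  estimate  = unname(est),
  lower     = ci[, 1],
  upper     = ci[, 2],
  p_value   = summary(fit)$coefficients[, "Pr(>|t|)"],
  row.names = NULL
)

# A 95% CI that excludes zero corresponds to p < 0.05 for the same coefficient
coef_table$excludes_zero <- coef_table$lower > 0 | coef_table$upper < 0
print(coef_table)
```

Because the interval and the p-value come from the same t distribution, the excludes_zero column always agrees with p_value < 0.05 at the 95% level.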
Model Fit Statistics:
The footer displays key diagnostics:
R^2: Proportion of variance explained (0 to 1). The ranges below are rough rules of thumb; what counts as a good fit depends on the field:
0.0-0.3: Weak explanatory power
0.3-0.5: Moderate
0.5-0.7: Good
> 0.7: Strong (rare in social/biological sciences)
Adjusted R^2: R^2 penalized for number of predictors
Always less than or equal to R^2
Preferred for model comparison
Accounts for model complexity
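F-statistic: Tests the null hypothesis that all coefficients other than the intercept equal 0
Degrees of freedom: df1 = number of estimated coefficients excluding the intercept (a factor with k levels contributes k - 1), df2 = number of observations - df1 - 1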
Significant p-value indicates model explains variance better than intercept-only
AIC: For model comparison (lower is better)
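As a rough illustration of where these footer statistics come from, the sketch below extracts them from a fitted model with base R. The mtcars dataset is an assumed stand-in, and the code is not taken from the lmforest implementation.

```r
# Illustrative only: mtcars stands in for the user's own data.
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)
s   <- summary(fit)

r2     <- s$r.squared
adj_r2 <- s$adj.r.squared

# F-statistic with numerator and denominator degrees of freedom
f   <- s$fstatistic
df1 <- f[["numdf"]]   # wt + two dummy columns for the three-level factor
df2 <- f[["dendf"]]   # observations - df1 - 1
f_p <- pf(f[["value"]], df1, df2, lower.tail = FALSE)

# Adjusted R^2 recomputed by hand from R^2, n, and df1
n <- nobs(fit)
adj_r2_manual <- 1 - (1 - r2) * (n - 1) / (n - df1 - 1)

fit_stats <- c(n = n, r2 = r2, adj_r2 = adj_r2,
               F = f[["value"]], df1 = df1, df2 = df2,
               p = f_p, AIC = AIC(fit))
print(round(fit_stats, 4))
```

The hand calculation shows why adjusted R^2 cannot exceed R^2: the penalty factor (n - 1) / (n - df1 - 1) is at least 1 whenever the model has at least one predictor.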
Assumptions:
Linear regression assumes:
Linearity of relationships
Independence of observations
Homoscedasticity (constant variance)
Normality of residuals
No severe multicollinearity among predictors
Check assumptions using:
plot(model) for diagnostic plots
car::vif(model) for multicollinearity
lmtest::bptest(model) for heteroscedasticity
shapiro.test(residuals(model)) for normality
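The car and lmtest packages listed above are add-ons that may not be installed. A minimal base-R sketch of comparable checks is given below, using mtcars as an assumed stand-in dataset. The variance inflation factor is computed by hand from its definition rather than with car::vif, and the diagnostic plots are written to a temporary PDF so that the code also runs non-interactively.

```r
# Illustrative only: mtcars stands in for the user's own data.
fit <- lm(mpg ~ wt + hp, data = mtcars)
res <- residuals(fit)

# Normality of residuals (Shapiro-Wilk test)
sw <- shapiro.test(res)

# Multicollinearity without add-on packages:
# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
# predictor j on the remaining predictors
vif_wt <- 1 / (1 - summary(lm(wt ~ hp, data = mtcars))$r.squared)

# Standard diagnostic plots: residuals vs fitted, Q-Q,
# scale-location, and residuals vs leverage
diag_file <- file.path(tempdir(), "lm_diagnostics.pdf")
pdf(diag_file)
par(mfrow = c(2, 2))
plot(fit)
dev.off()

print(sw$p.value)
print(vif_wt)
```

With only two predictors the VIF is the same for both, so a single value is enough here; models with more predictors need one auxiliary regression per predictor.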
Reference Categories:
For factor variables:
First level is the reference (coefficient = 0) under R's default treatment contrasts
Other levels show difference from reference
Reference displayed with ref_label
Relevel factors before modeling if needed:
factor(x, levels = c("desired_ref", ...))
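A short sketch of the two usual ways to change the reference level follows. The vector and its level names are hypothetical and chosen only for illustration; they are not taken from the clintrial data.

```r
# Hypothetical factor; by default, levels are sorted alphabetically
status <- factor(c("never", "former", "current", "never", "current"))
levels(status)   # "current" comes first, so it would be the reference

# Option 1: spell out the full level order, desired reference first
status_a <- factor(status, levels = c("never", "former", "current"))

# Option 2: move one level to the front and keep the others in place
status_b <- relevel(status, ref = "never")

levels(status_a)
levels(status_b)
```

Both options must be applied to the data before calling lm, since the reference category is fixed when the model is fitted.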
Sample Size Reporting:
The "n" column shows:
For continuous variables: Total observations with non-missing data
For factor variables: Number of observations in each category
Footer shows total observations analyzed and percentage of original data (accounting for missing values)
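The counts described above can be approximated with base R as sketched below. The dataset (mtcars with two values set to missing) is an assumption for illustration, and counting factor levels among the analyzed rows is one plausible reading of the "n" column rather than a confirmed description of the lmforest internals.

```r
# Illustrative only: mtcars with artificially introduced missing values
dat <- mtcars
dat$cyl <- factor(dat$cyl)
dat$wt[c(1, 5)] <- NA

# With the default na.action, lm drops rows with missing values
fit <- lm(mpg ~ wt + cyl, data = dat)

n_used   <- nobs(fit)    # observations actually analyzed
n_total  <- nrow(dat)    # rows in the original data
pct_used <- 100 * n_used / n_total

# Per-level counts for a factor, restricted to the analyzed rows
n_by_level <- table(model.frame(fit)$cyl)

print(c(n_used = n_used, n_total = n_total, pct_used = pct_used))
print(n_by_level)
```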
A ggplot object containing the complete forest plot. The plot
can be:
Displayed directly: print(plot)
Saved to file: ggsave("forest.pdf", plot, width = 12, height = 8)
Further customized with ggplot2 functions
The returned object includes an attribute "rec_dims"
accessible via attr(plot, "rec_dims"), which is a list
containing:
width: Numeric. Recommended plot width in specified units
height: Numeric. Recommended plot height in specified units
These recommendations are automatically calculated based on the number of
variables, text sizes, and layout parameters, and are printed to console
if plot_width or plot_height is not specified.
autoforest for automatic model detection,
glmforest for logistic/GLM forest plots,
coxforest for Cox model forest plots,
uniforest for univariable screening forest plots,
multiforest for multi-outcome forest plots,
lm for fitting linear models,
fit for regression modeling
Other visualization functions:
autoforest(),
coxforest(),
glmforest(),
multiforest(),
uniforest()
data(clintrial)
data(clintrial_labels)

# Create example model
model1 <- lm(bmi ~ age + sex + smoking, data = clintrial)

# Example 1: Basic linear model forest plot
p <- lmforest(model1, data = clintrial)

old_width <- options(width = 180)

# Example 2: With custom labels and title
plot2 <- lmforest(
  x = model1,
  data = clintrial,
  title = "Predictors of Body Mass Index",
  effect_label = "Change in BMI (kg/m^2)",
  labels = clintrial_labels
)

# Example 3: Comprehensive model with indented layout
model3 <- lm(
  bmi ~ age + sex + smoking + hypertension + diabetes + creatinine,
  data = clintrial
)
plot3 <- lmforest(
  x = model3,
  data = clintrial,
  labels = clintrial_labels,
  indent_groups = TRUE,
  zebra_stripes = TRUE
)

# Example 4: Condensed layout
plot4 <- lmforest(
  x = model3,
  data = clintrial,
  condense_table = TRUE,
  labels = clintrial_labels
)

# Example 5: Different outcome (hemoglobin)
model5 <- lm(
  hemoglobin ~ age + sex + bmi + smoking + creatinine,
  data = clintrial
)
plot5 <- lmforest(
  x = model5,
  data = clintrial,
  title = "Predictors of Baseline Hemoglobin",
  effect_label = "Change in Hemoglobin (g/dL)",
  labels = clintrial_labels
)

# Example 6: Save with recommended dimensions
dims <- attr(plot5, "rec_dims")
ggplot2::ggsave(
  file.path(tempdir(), "linear_forest.pdf"),
  plot5,
  width = dims$width,
  height = dims$height
)

# Restore the console width changed above
options(old_width)