knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 8.5, fig.height = 5.5 )
library(ggforestplotR)
library(ggplot2)
This article covers utilities and more advanced customization for forest plots and their accompanying tables.
facet creates section panels, and facet_strip_position controls which
side gets the strip labels.
coefs <- data.frame(
  term = c("Age", "BMI", "Smoking", "Stage II", "Stage III"),
  estimate = c(0.12, -0.10, 0.18, 0.30, 0.46),
  conf.low = c(0.03, -0.18, 0.04, 0.10, 0.18),
  conf.high = c(0.21, 0.02, 0.32, 0.50, 0.74),
  sample_size = c(120, 115, 98, 87, 83),
  p_value = c(0.04, 0.15, 0.29, 0.001, 0.075),
  section = c("Clinical", "Clinical", "Clinical", "Tumor", "Tumor")
)

ggforestplot(
  coefs,
  facet = "section",
  facet_strip_position = "right",
  striped_rows = TRUE
)
Use separate_groups and separate_lines when you want a more distinct visual separation between variables. This is especially useful for categorical variables with many levels. separate_groups automatically prefixes each level with its variable name (for example, "Race: White").
block_coefs <- data.frame(
  term = c("race_black", "race_white", "race_other", "age", "bmi"),
  label = c("Black", "White", "Other", "Age", "BMI"),
  estimate = c(0.24, 0.08, -0.04, 0.12, -0.09),
  conf.low = c(0.10, -0.04, -0.18, 0.03, -0.17),
  conf.high = c(0.38, 0.20, 0.10, 0.21, -0.01),
  variable_block = c("Race", "Race", "Race", "Age", "BMI")
)

ggforestplot(
  block_coefs,
  label = "label",
  separate_groups = "variable_block",
  separate_lines = TRUE,
  striped_rows = TRUE
) +
  scale_y_discrete(
    limits = rev(c("BMI", "Age", "Race: White", "Race: Black", "Race: Other"))
  )
add_forest_table() attaches model information to the coefficient plot as a table. The table can go on either the left or the right side and supports some customization. Always add the table last, after styling the plot, because the function uses patchwork internally. Once the plot is a patchwork composition, further changes need patchwork-specific syntax and are generally harder to get working correctly.
You can choose which columns from your data frame to include in the table with the columns argument, and change their headers with column_labels. If some of the term labels need to be changed, use term_labels to assign them new values. Some column labels are assigned automatically if no value is provided.
Notice that the n and p.value columns are named explicitly. This is necessary in most cases because column aliases are not yet supported (they will be; I promise I'm getting to it).
ggforestplot(
  coefs,
  facet = "section",
  facet_strip_position = "right",
  n = "sample_size",
  p.value = "p_value",
  striped_rows = TRUE,
  term_labels = c("Smoking" = "Smoking status")
) +
  add_forest_table(
    columns = c("term", "sample_size", "estimate", "p_value"),
    column_labels = c(
      "term" = "Variable",
      "sample_size" = "N",
      "estimate" = "Beta (95% CI)",
      "p_value" = "P-value"
    )
  )
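On the earlier point about adding the table last, the difference in patchwork syntax can be sketched with ggplot2 and patchwork alone. This example does not use ggforestplotR, and it assumes that the object returned by add_forest_table() behaves like an ordinary patchwork composition, which the text implies but this sketch does not verify. The plots p1 and p2 are placeholders.

```r
library(ggplot2)
library(patchwork)

# Two placeholder plots standing in for a forest plot and its table.
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(hp, mpg)) + geom_point()

combined <- p1 + p2

# `+` on a patchwork modifies only the most recently added plot (p2 here).
last_only <- combined + theme_minimal()

# `&` applies the change to every plot in the composition.
all_plots <- combined & theme_minimal()
```

Styling the plot before the table is attached avoids having to choose between these operators.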
add_forest_table() also lets you change some minor styling elements of the forest table.
ggforestplot(
  coefs,
  n = "sample_size",
  p.value = "p_value",
  striped_rows = TRUE
) +
  add_forest_table(
    position = "left",
    grid_lines = TRUE,
    grid_line_linetype = 2,
    grid_line_colour = "red"
  )
add_split_table() can be used to create more traditional-looking forest plots. You can choose which summary information goes on which side. Like add_forest_table(), it should be added after any plot-level styling.
Use the estimate_fmt argument to change how your estimates are displayed. You can also control digits via estimate_digits and interval_digits.
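To illustrate what a template such as "{estimate} [{conf.low}, {conf.high}]" combined with separate digit settings produces, here is a minimal base R sketch. The helper fill_template() is hypothetical and is not part of ggforestplotR; how the package fills the template internally is not shown in this article, so treat this only as a picture of the expected result.

```r
# Hypothetical helper: replace each "{name}" placeholder with a formatted value.
fill_template <- function(template, values) {
  out <- template
  for (name in names(values)) {
    out <- gsub(paste0("{", name, "}"), values[[name]], out, fixed = TRUE)
  }
  out
}

estimate_digits <- 2  # digits for the point estimate
interval_digits <- 3  # digits for the interval bounds

# Values taken from the "Age" row of coefs.
values <- list(
  estimate  = formatC(0.12, format = "f", digits = estimate_digits),
  conf.low  = formatC(0.03, format = "f", digits = interval_digits),
  conf.high = formatC(0.21, format = "f", digits = interval_digits)
)

label <- fill_template("{estimate} [{conf.low}, {conf.high}]", values)
label
#> [1] "0.12 [0.030, 0.210]"
```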
ggforestplot(
  coefs,
  n = "sample_size",
  p.value = "p_value",
  striped_rows = TRUE
) +
  scale_x_continuous(limits = c(-0.8, 0.8)) +
  add_split_table(
    left_columns = c("term", "n"),
    right_columns = c("estimate", "p"),
    column_labels = c("estimate" = "Beta [95% CI]"),
    estimate_fmt = "{estimate} [{conf.low}, {conf.high}]",
    estimate_digits = 2,
    interval_digits = 3
  )
Set exponentiate = TRUE for models fit on the log-odds scale (or similar), such as logistic regression, so that estimates are shown as ratios rather than log-scale coefficients.
data(CO2)
l1 <- glm(
  Treatment ~ conc + uptake + Type,
  family = binomial(link = "logit"),
  data = CO2
)

ggforestplot(
  l1,
  exponentiate = TRUE,
  striped_rows = TRUE,
  term_labels = c("TypeMississippi" = "Mississippi")
) +
  add_forest_table(position = "left", show_p = FALSE)
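For a sense of what exponentiation does to the numbers, the following base R sketch converts the log-odds coefficients of the same model to odds ratios. It uses Wald intervals from confint.default() as an assumption; the interval method ggforestplot() uses for model objects is not stated here and may differ. The intercept is dropped only to keep the output short.

```r
data(CO2)
fit <- glm(
  Treatment ~ conc + uptake + Type,
  family = binomial(link = "logit"),
  data = CO2
)

log_odds <- coef(fit)             # coefficients on the log-odds scale
wald_ci  <- confint.default(fit)  # Wald intervals (assumed for illustration)

# Exponentiate the estimate and both bounds to get odds ratios.
or_table <- exp(cbind(estimate = log_odds, wald_ci))
or_table <- or_table[rownames(or_table) != "(Intercept)", ]
round(or_table, 3)
```

An odds ratio of 1 corresponds to a log-odds coefficient of 0, so the reference line moves from 0 to 1 on the exponentiated scale.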
We can do this for survival models as well.
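lung <- survival::lung
# Recode status from 1/2 (censored/dead) to 0/1.
lung <- lung |>
  dplyr::mutate(
    status = dplyr::recode(status, `1` = 0, `2` = 1)
  )
# Surv() is namespaced so the call works without attaching survival.
s1 <- survival::coxph(
  survival::Surv(time, status) ~ sex + age + ph.karno + pat.karno,
  data = lung
)

ggforestplot(s1, exponentiate = TRUE, striped_rows = TRUE) +
  add_forest_table()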
The group argument is handy when comparing estimates from several models.
comparison_coefs <- data.frame(
  term = rep(c("Age", "BMI", "Smoking", "Stage II", "Stage III"), 2),
  estimate = c(0.12, -0.10, 0.18, 0.30, 0.46, 0.08, -0.05, 0.24, 0.40, 0.58),
  conf.low = c(0.03, -0.18, 0.04, 0.10, 0.18, 0.00, -0.13, 0.10, 0.20, 0.30),
  conf.high = c(0.21, -0.02, 0.32, 0.50, 0.74, 0.16, 0.03, 0.38, 0.60, 0.86),
  model = rep(c("Model A", "Model B"), each = 5),
  section = rep(c("Clinical", "Clinical", "Clinical", "Tumor", "Tumor"), 2)
)

ggforestplot(
  comparison_coefs,
  group = "model",
  facet = "section",
  striped_rows = TRUE,
  dodge_width = 0.5,
  facet_strip_position = "right"
) +
  theme(legend.position = "top") +
  scale_color_manual(values = c("#1F968BFF", "#453781FF")) +
  add_forest_table()
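In practice the comparison data usually comes from fitted models rather than hand-typed numbers. The sketch below shows one way to stack coefficients from two models into the same column layout as comparison_coefs, using base R only. The helper tidy_model() is hypothetical, and the mtcars models are placeholders chosen so the example runs anywhere.

```r
# Hypothetical helper: one row per coefficient with estimate and 95% interval.
tidy_model <- function(fit, model_name) {
  ci <- confint(fit)
  data.frame(
    term      = names(coef(fit)),
    estimate  = unname(coef(fit)),
    conf.low  = unname(ci[, 1]),
    conf.high = unname(ci[, 2]),
    model     = model_name,
    row.names = NULL
  )
}

fit_a <- lm(mpg ~ wt + hp, data = mtcars)
fit_b <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Stack both models and drop the intercept rows.
comparison <- rbind(
  tidy_model(fit_a, "Model A"),
  tidy_model(fit_b, "Model B")
)
comparison <- comparison[comparison$term != "(Intercept)", ]
```

A data frame shaped like this could then be passed to ggforestplot() with group = "model", as in the example above.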