forest: Forest plot
In raredd/forest: Forest plots

View source: R/plot.R

forest

R Documentation

Forest plot

Description

A forest plot.

The typical workflow is to create an object (e.g., glm, coxph, etc.), clean, and prepare for plot. The final steps may be performed with forest or forest2. However, these are pre-defined workflows and may not offer the same level of customization as running each step individually.

For the latter case, the following steps should be followed: cleanfp, add_reference, prepare_forest, and plot.forest; see help pages for each step. User-defined may be substituted for any step as long as the object returned has the proper elements and class.

Alternatively, forest(..., plot = FALSE) will run the preparation steps and return an object ready for plotting and further customization with the arguments in plot.forest.

Usage

forest(
  x,
  ...,
  header = FALSE,
  total = NULL,
  exclude_rows = NULL,
  space = NULL,
  plotArgs = list(),
  plot = TRUE
)

forest2(
  x,
  formula,
  data,
  header = FALSE,
  total = NULL,
  exclude_rows = NULL,
  space = NULL,
  col.group = c("grey95", "none"),
  groups = names(x),
  panel.last = NULL,
  FUN = NULL,
  ...
)

## S3 method for class 'forest'
plot(
  x,
  panel_size = c(1, 1.5, 0.8),
  col.rows = NULL,
  at.text = NULL,
  left_panel = NULL,
  center_panel = NULL,
  header = FALSE,
  font.header = 1L,
  font.labels = 1L,
  type = c("ci", "box", "tplot"),
  space = NULL,
  show_percent = TRUE,
  show_columns = TRUE,
  exclude_rows = NULL,
  ref_label = "Reference",
  ref_label_pvalue = "",
  col.ref = "darkgrey",
  col.header = "black",
  col.labels = "black",
  sig.limit = 0.05,
  col.sig = 1:2,
  names = NULL,
  col.names = "black",
  font.names = 2L,
  show_conf = FALSE,
  conf_format = "(%s, %s)",
  labels = NULL,
  xlim = NULL,
  axes = TRUE,
  logx = FALSE,
  inner.mar = c(0, 0, 0, 0),
  reset_par = TRUE,
  panel.first = NULL,
  panel.last = NULL,
  layout = c("split", "unified"),
  order = NULL,
  ...
)

Arguments

`x`	an object returned by `prepare_forest` or an object of a supported class for `forest2`, a (named) list of models to be passed to `forest` and aggregated for a single plot
`...`	additional arguments passed to other methods or graphical parameters passed to `par`
`header`	logical or a vector of character strings used as labels for each variable in the model for `forest2`, a list of header vectors for each model
`total`	optional argument to give total number of observations, useful for models which have already removed missing observations; by default, the percentages will be relative to fitted models rather than total observations; see examples for `forest2`, a vector of total observations for each model
`exclude_rows`	optional pattern to match row labels where any rows matching will be excluded from the plot
`space`	optional indices to insert blank rows
`plotArgs`	a named list of additional arguments passed to `plot.forest`
`plot`	logical; if `TRUE`, a forest plot is generated; otherwise, a list of data to plot is returned (see `plot.forest`)
`formula, data`	for `forest2`, optional arguments to specify a formula and data set for each model; note that this is only required for some objects; see `cleanfp`
`col.group`	a vector of colors for each model in `x`, recycled as needed
`groups`	labels for each model in `x`
`panel.last`	an expression to be evaluated after plotting has taken place but before exiting the function
`FUN`	(experimental) apply a function to object before plotting
`panel_size`	proportional size of `c(left, middle, right)` panels
`col.rows`	optional vector of colors for each row background
`at.text`	optional x-axis locations for the text columns in normalized device coordinates; see `grconvertX`
`left_panel`	(dev) a `emph` named list of additional text columns added to the left panel
`center_panel`	(dev) optional function used to draw the plot
`font.header, font.labels`	font style for headers and labels
`type`	type of plot for middle panel (currently only `"point"`) is supported
`show_percent`	logical; if `TRUE`, percents are shown for each row
`show_columns`	logical or a vector of logicals for each text column
`ref_label`	label for reference groups; default is `"Reference"`
`ref_label_pvalue`	label for p-value for reference groups; default is `""`
`col.ref, col.header, col.labels`	vectors of color(s) for the reference, headers, and row labels
`sig.limit`	significance limit to highlight plot rows and p-values
`col.sig`	a vector of colors for `>= sig.limit` and `< sig.limit`, respectively, recycled as needed
`names`	optional vector of length 4 giving the labels for each column
`col.names, font.names`	color and font vectors for `names`, recycled as needed
`show_conf`	logical; if `TRUE`, the confidence interval is show with the estimate
`conf_format`	the format for the confidence intervals; default is `"(%s, %s)"` with placeholders for the lower and upper limits, respectively
`labels`	optional vector of labels for each row term; alternatively, a function modifying the default generated row labels; see examples
`xlim`	the x-axis limits of the plot
`axes`	logical; if `TRUE`, the x-axis is plotted (default); to show a custom axis, set `reset_par = FALSE` and use `axis`
`logx`	logical; if `TRUE`, the x-axis will be logarithmic
`inner.mar`	margins for the plot panel
`reset_par`	logical; if `TRUE`, the graphical parameters are reset to their state before the function call; use `FALSE` to continue adding to the middle panel figure
`panel.first`	an expression to be evaluated after the plotting window has been set up but before any plotting takes place
`layout`	layout of figure, either `"split"` where plot splits the text columns or `"unified"` where the text columns are adjacent
`order`	(experimental) manually set order for rows

Examples

forest(lm(mpg ~ ., mtcars), plotArgs = list(xlim = c(-5, 5), vline = 0))

library('survival')
lung2 <- within(lung, {
  sex <- factor(sex, 1:2, c('Male', 'Female'))
  ph.ecog <- factor(ph.ecog, 0:2)
  age2 <- replace(age, sample(length(age), 50), NA)
})

cx <- coxph(Surv(time, status) ~ age + sex + ph.ecog +
              I(meal.cal > 325.5) +
              I(wt.loss < 9.5) +
              I(ph.karno > 85) +
              I(pat.karno > 65), lung2)
             
x <- forest(cx, plot = FALSE)
plot(x, show_conf = TRUE)

## highlighting rows/p-values
plot(x, sig.limit = 0.2, col.sig = c('darkgrey', 'red3'))
plot(x, col.sig = 'black')

## change the p-value format
x <- forest(cx, plot = FALSE, format_pval = format.pval)
plot(x, show_conf = TRUE)

x <- forest(cx, plot = FALSE,
  format_pval = function(x) forest:::pvalr(x, digits = 3)
)
plot(x, show_conf = TRUE)


## Not run: 
## note that this works but better to use sig.limit/col.sig arguments
## instead to change the color palette
palette(c('grey70', 'green4'))
plot(x, show_conf = TRUE, cex = 3)

palette(c('black', 'black'))
plot(x, show_conf = TRUE, cex = 3)
palette('default')

## End(Not run)


## use a function to modify default row labels
plot(x, labels = function(x) gsub('^(sex|I)|[()]|(TRUE|FALSE)', '', x))
## compare to
plot(x)


## experimental options
## Not run: 
## exclude rows without affecting the model
plot(x, exclude_rows = 'ecog2')

## change the order of rows
plot(x, order = c(1:3, 5, 4, 6))

## use diamonds instead of pch
## - width of diamond is the confidence interval
plot(x, cex = 1, diamond = 5:6) ## manual
plot(x, cex = 1, diamond = function(x) grepl('ecog', x$Term)) ## programmatic

## add spacing between rows
## - this must be done during the "add_reference" step
forest(cx, space = c(2, 4, 7, 9, 11, 13, 15))
forest(cx, space = function(x) {
  print(names(x)); print(x[, 7]); c(which(!duplicated(x[, 7]))[-1], nrow(x) + 1)
})

## End(Not run)

## a useful case for exclude_rows
## note that this no longer applies -- cluster/strata are already excluded
## from the model output, see keep_strata, keep_cluster args from
## forest::add_reference
cl <- coxph(Surv(time, status) ~ ph.ecog + sex + cluster(inst), data = lung2)
forest(cl)
forest(cl, exclude_rows = 'cluster')


## add extra columns to left panel
plot(
  x, show_conf = TRUE, panel_size = c(1.5, 1.5, 1), layout = 'unified',
  left_panel = list(HR = x$cleanfp_list$numeric$`exp(coef)`)
)

plot(
  x, show_conf = TRUE, panel_size = c(2, 1.5, 1), layout = 'unified',
  left_panel = list(
    HR = x$cleanfp_list$numeric$`exp(coef)`,
    ' ' = ifelse(x$cleanfp_list$numeric$p.value < 0.05, '*', '')
  )
)


## use a custom denominator for percents, eg, if the model sample size
## excludes NAs or missing data
x <- coxph(Surv(time, status) ~ age2 + sex + ph.ecog, lung2)
forest(x, total = nrow(lung2))
## compare
forest(x)


## equivalent ways to make forest plot
x <- coxph(Surv(time, status) ~ age + sex + ph.ecog, lung2)
clean <- cleanfp(x)
clean_ref <- add_reference(clean)
prep_list <- prepare_forest(clean_ref)
plot(prep_list)

## or
forest(x, header = c('Age in years', 'Sex', 'ECOG PS'))


## plot.forest gives more options
plot(
  prep_list,
  xlim = c(0, 2),
  center_panel = {
    panel_box(replicate(6, runif(20, 0, 2), simplify = FALSE))
    axis(1, mgp = c(3, 2, 1))
  }
)

## Not run: 
library('rawr') ## for tplot
hr_ci <- prep_list$cleanfp_list$numeric[1:3]
plot(
  prep_list,
  xlim = c(0, 4),
  center_panel = {
    panel_tplot(rev(asplit(hr_ci, 1)), type = 'd', cex = 3,
                pch = c(16, 1, 1), group.pch = FALSE,
                col = c(1, 2, 2), group.col = FALSE)
    axis(1, mgp = c(3, 2, 1))
  }
)

## End(Not run)


## use forest(..., plot = FALSE) to run steps except plotting which
## can then be customized with plot.forest
fp <- forest(glm(vs ~ factor(gear) + wt + hp, mtcars,
                 family = 'binomial'),
             plot = FALSE)
plot(fp, labels = c(paste('Gear -', 3:5), 'Weight (1k lbs)', 'Horse Power'))


## multiple models on the same plot
models <- list(
  'Model 1' = coxph(Surv(time, status) ~ age + sex + ph.ecog, lung2),
  'Model 2' = coxph(Surv(time, status) ~ age + sex, lung2),
  'Model 3' = coxph(Surv(time, status) ~ age, lung2)
)

prep_lists <- lapply(models, function(x) {
  x <- forest(x, plot = FALSE)
  structure(x[[1L]], class = 'cleanfp_list')
})

group.col <- rep_len(c('grey95', 'none'), length(models))
group.col <- rep(group.col, sapply(prep_lists, function(x) length(x$Term)))

x <- Reduce(merge_forest, prep_lists)
op <- par(no.readonly = TRUE)
plot(x, col.rows = group.col, reset_par = FALSE)
rl <- rev(rle(group.col)$lengths)
yy <- rev(cumsum(head(c(0, rl), -1)) + rl / 2) + 0.5
text(grconvertX(0.02, 'ndc'), yy, names(models),
     xpd = NA, srt = 90, adj = 0.5)
par(op)

## or simply
forest2(models)

## same but with headers/totals for each model
forest2(
  models, total = c(230, 240, 250),
  labels = function(x) gsub('sex|ph\\.ecog', '', x),
  header = list(c('Age', 'Sex', 'ECOG PS'), c('Age', 'Sex'), 'Age')
)


## other supported objects:
## crr, crr2, coxph, coxphf, logistf, formula

## competing risks regressions
library('cmprsk')
x <- with(transplant,
  crr(futime, as.integer(event) - 1,
      cov1 = cbind(age, model.matrix(~ sex + abo)[, -1, drop = FALSE])))
forest(x, ~ age + sex + abo, data = transplant)

library('cmprsk2')
x <- crr2(Surv(futime, event(censored) == death) ~ age + sex + abo, transplant)
forest(x)


## cox models
dat <- within(na.omit(transplant), {
  event_ind <- +(event == 'death')
  futime <- futime + 1e-8
})

## survival::coxph model
x <- coxph(Surv(futime, event_ind) ~ age + sex + abo, dat)
x <- forest(x, plot = FALSE)
plot(x, show_conf = TRUE, xlim = c(0, 5))


## coxphf::coxphf model
library('coxphf')
x <- coxphf(Surv(futime, event_ind) ~ age + sex + abo, dat)
forest(x, data = dat)


## logistf::logistf model
library('logistf')
x <- logistf(event_ind ~ age + sex + abo, dat)
forest(x, data = dat, plotArgs = list(show_conf = TRUE))


## brglm2
library('brglm2')
x <- glm(event_ind ~ age + sex + abo, dat, family = binomial('logit'),
         method = 'brglmFit', type = 'AS_mixed')
forest(x, plotArgs = list(show_conf = TRUE))

x <- glm(event_ind ~ age + sex + abo, dat, family = binomial('logit'),
         method = 'brglmFit', type = 'MPL_Jeffreys')
forest(x, plotArgs = list(show_conf = TRUE))


## odds ratios/fisher tests
dat <- data.frame(
  group = c('tx', 'pbo'),
  mut = replicate(10, sample(0:1, 50, replace = TRUE))
)
names(dat)[-1] <- paste0('gene', 1:10)

forest(
  group ~ ., dat,
  plotArgs = list(xlim = c(0, 20), show_conf = TRUE, cex = 1)
)

## excluding reference level
forest(
  group ~ ., dat, exclude_rows = '0$',
  plotArgs = list(xlim = c(0, 20), show_conf = TRUE, cex = 1)
)

raredd/forest documentation built on Feb. 19, 2024, 9:22 p.m.