forest: Forest plot

View source: R/plot.R

forestR Documentation

Forest plot

Description

A forest plot.

The typical workflow is to create an object (e.g., glm, coxph, etc.), clean, and prepare for plot. The final steps may be performed with forest or forest2. However, these are pre-defined workflows and may not offer the same level of customization as running each step individually.

For the latter case, the following steps should be followed: cleanfp, add_reference, prepare_forest, and plot.forest; see help pages for each step. User-defined may be substituted for any step as long as the object returned has the proper elements and class.

Alternatively, forest(..., plot = FALSE) will run the preparation steps and return an object ready for plotting and further customization with the arguments in plot.forest.

Usage

forest(
  x,
  ...,
  header = FALSE,
  total = NULL,
  exclude_rows = NULL,
  space = NULL,
  plotArgs = list(),
  plot = TRUE
)

forest2(
  x,
  formula,
  data,
  header = FALSE,
  total = NULL,
  exclude_rows = NULL,
  space = NULL,
  col.group = c("grey95", "none"),
  groups = names(x),
  panel.last = NULL,
  FUN = NULL,
  ...
)

## S3 method for class 'forest'
plot(
  x,
  panel_size = c(1, 1.5, 0.8),
  col.rows = NULL,
  at.text = NULL,
  left_panel = NULL,
  center_panel = NULL,
  header = FALSE,
  font.header = 1L,
  font.labels = 1L,
  type = c("ci", "box", "tplot"),
  space = NULL,
  show_percent = TRUE,
  show_columns = TRUE,
  exclude_rows = NULL,
  ref_label = "Reference",
  ref_label_pvalue = "",
  col.ref = "darkgrey",
  col.header = "black",
  col.labels = "black",
  sig.limit = 0.05,
  col.sig = 1:2,
  names = NULL,
  col.names = "black",
  font.names = 2L,
  show_conf = FALSE,
  conf_format = "(%s, %s)",
  labels = NULL,
  xlim = NULL,
  axes = TRUE,
  logx = FALSE,
  inner.mar = c(0, 0, 0, 0),
  reset_par = TRUE,
  panel.first = NULL,
  panel.last = NULL,
  layout = c("split", "unified"),
  order = NULL,
  ...
)

Arguments

x

an object returned by prepare_forest or an object of a supported class

for forest2, a (named) list of models to be passed to forest and aggregated for a single plot

...

additional arguments passed to other methods or graphical parameters passed to par

header

logical or a vector of character strings used as labels for each variable in the model

for forest2, a list of header vectors for each model

total

optional argument to give total number of observations, useful for models which have already removed missing observations; by default, the percentages will be relative to fitted models rather than total observations; see examples

for forest2, a vector of total observations for each model

exclude_rows

optional pattern to match row labels where any rows matching will be excluded from the plot

space

optional indices to insert blank rows

plotArgs

a named list of additional arguments passed to plot.forest

plot

logical; if TRUE, a forest plot is generated; otherwise, a list of data to plot is returned (see plot.forest)

formula, data

for forest2, optional arguments to specify a formula and data set for each model; note that this is only required for some objects; see cleanfp

col.group

a vector of colors for each model in x, recycled as needed

groups

labels for each model in x

panel.last

an expression to be evaluated after plotting has taken place but before exiting the function

FUN

(experimental) apply a function to object before plotting

panel_size

proportional size of c(left, middle, right) panels

col.rows

optional vector of colors for each row background

at.text

optional x-axis locations for the text columns in normalized device coordinates; see grconvertX

left_panel

(dev) a emph named list of additional text columns added to the left panel

center_panel

(dev) optional function used to draw the plot

font.header, font.labels

font style for headers and labels

type

type of plot for middle panel (currently only "point") is supported

show_percent

logical; if TRUE, percents are shown for each row

show_columns

logical or a vector of logicals for each text column

ref_label

label for reference groups; default is "Reference"

ref_label_pvalue

label for p-value for reference groups; default is ""

col.ref, col.header, col.labels

vectors of color(s) for the reference, headers, and row labels

sig.limit

significance limit to highlight plot rows and p-values

col.sig

a vector of colors for >= sig.limit and < sig.limit, respectively, recycled as needed

names

optional vector of length 4 giving the labels for each column

col.names, font.names

color and font vectors for names, recycled as needed

show_conf

logical; if TRUE, the confidence interval is show with the estimate

conf_format

the format for the confidence intervals; default is "(%s, %s)" with placeholders for the lower and upper limits, respectively

labels

optional vector of labels for each row term; alternatively, a function modifying the default generated row labels; see examples

xlim

the x-axis limits of the plot

axes

logical; if TRUE, the x-axis is plotted (default); to show a custom axis, set reset_par = FALSE and use axis

logx

logical; if TRUE, the x-axis will be logarithmic

inner.mar

margins for the plot panel

reset_par

logical; if TRUE, the graphical parameters are reset to their state before the function call; use FALSE to continue adding to the middle panel figure

panel.first

an expression to be evaluated after the plotting window has been set up but before any plotting takes place

layout

layout of figure, either "split" where plot splits the text columns or "unified" where the text columns are adjacent

order

(experimental) manually set order for rows

See Also

summary.forest

Examples

forest(lm(mpg ~ ., mtcars), plotArgs = list(xlim = c(-5, 5), vline = 0))

library('survival')
lung2 <- within(lung, {
  sex <- factor(sex, 1:2, c('Male', 'Female'))
  ph.ecog <- factor(ph.ecog, 0:2)
  age2 <- replace(age, sample(length(age), 50), NA)
})

cx <- coxph(Surv(time, status) ~ age + sex + ph.ecog +
              I(meal.cal > 325.5) +
              I(wt.loss < 9.5) +
              I(ph.karno > 85) +
              I(pat.karno > 65), lung2)
             
x <- forest(cx, plot = FALSE)
plot(x, show_conf = TRUE)

## highlighting rows/p-values
plot(x, sig.limit = 0.2, col.sig = c('darkgrey', 'red3'))
plot(x, col.sig = 'black')

## change the p-value format
x <- forest(cx, plot = FALSE, format_pval = format.pval)
plot(x, show_conf = TRUE)

x <- forest(cx, plot = FALSE,
  format_pval = function(x) forest:::pvalr(x, digits = 3)
)
plot(x, show_conf = TRUE)


## Not run: 
## note that this works but better to use sig.limit/col.sig arguments
## instead to change the color palette
palette(c('grey70', 'green4'))
plot(x, show_conf = TRUE, cex = 3)

palette(c('black', 'black'))
plot(x, show_conf = TRUE, cex = 3)
palette('default')

## End(Not run)


## use a function to modify default row labels
plot(x, labels = function(x) gsub('^(sex|I)|[()]|(TRUE|FALSE)', '', x))
## compare to
plot(x)


## experimental options
## Not run: 
## exclude rows without affecting the model
plot(x, exclude_rows = 'ecog2')

## change the order of rows
plot(x, order = c(1:3, 5, 4, 6))

## use diamonds instead of pch
## - width of diamond is the confidence interval
plot(x, cex = 1, diamond = 5:6) ## manual
plot(x, cex = 1, diamond = function(x) grepl('ecog', x$Term)) ## programmatic

## add spacing between rows
## - this must be done during the "add_reference" step
forest(cx, space = c(2, 4, 7, 9, 11, 13, 15))
forest(cx, space = function(x) {
  print(names(x)); print(x[, 7]); c(which(!duplicated(x[, 7]))[-1], nrow(x) + 1)
})

## End(Not run)

## a useful case for exclude_rows
## note that this no longer applies -- cluster/strata are already excluded
## from the model output, see keep_strata, keep_cluster args from
## forest::add_reference
cl <- coxph(Surv(time, status) ~ ph.ecog + sex + cluster(inst), data = lung2)
forest(cl)
forest(cl, exclude_rows = 'cluster')


## add extra columns to left panel
plot(
  x, show_conf = TRUE, panel_size = c(1.5, 1.5, 1), layout = 'unified',
  left_panel = list(HR = x$cleanfp_list$numeric$`exp(coef)`)
)

plot(
  x, show_conf = TRUE, panel_size = c(2, 1.5, 1), layout = 'unified',
  left_panel = list(
    HR = x$cleanfp_list$numeric$`exp(coef)`,
    ' ' = ifelse(x$cleanfp_list$numeric$p.value < 0.05, '*', '')
  )
)


## use a custom denominator for percents, eg, if the model sample size
## excludes NAs or missing data
x <- coxph(Surv(time, status) ~ age2 + sex + ph.ecog, lung2)
forest(x, total = nrow(lung2))
## compare
forest(x)


## equivalent ways to make forest plot
x <- coxph(Surv(time, status) ~ age + sex + ph.ecog, lung2)
clean <- cleanfp(x)
clean_ref <- add_reference(clean)
prep_list <- prepare_forest(clean_ref)
plot(prep_list)

## or
forest(x, header = c('Age in years', 'Sex', 'ECOG PS'))


## plot.forest gives more options
plot(
  prep_list,
  xlim = c(0, 2),
  center_panel = {
    panel_box(replicate(6, runif(20, 0, 2), simplify = FALSE))
    axis(1, mgp = c(3, 2, 1))
  }
)

## Not run: 
library('rawr') ## for tplot
hr_ci <- prep_list$cleanfp_list$numeric[1:3]
plot(
  prep_list,
  xlim = c(0, 4),
  center_panel = {
    panel_tplot(rev(asplit(hr_ci, 1)), type = 'd', cex = 3,
                pch = c(16, 1, 1), group.pch = FALSE,
                col = c(1, 2, 2), group.col = FALSE)
    axis(1, mgp = c(3, 2, 1))
  }
)

## End(Not run)


## use forest(..., plot = FALSE) to run steps except plotting which
## can then be customized with plot.forest
fp <- forest(glm(vs ~ factor(gear) + wt + hp, mtcars,
                 family = 'binomial'),
             plot = FALSE)
plot(fp, labels = c(paste('Gear -', 3:5), 'Weight (1k lbs)', 'Horse Power'))


## multiple models on the same plot
models <- list(
  'Model 1' = coxph(Surv(time, status) ~ age + sex + ph.ecog, lung2),
  'Model 2' = coxph(Surv(time, status) ~ age + sex, lung2),
  'Model 3' = coxph(Surv(time, status) ~ age, lung2)
)

prep_lists <- lapply(models, function(x) {
  x <- forest(x, plot = FALSE)
  structure(x[[1L]], class = 'cleanfp_list')
})

group.col <- rep_len(c('grey95', 'none'), length(models))
group.col <- rep(group.col, sapply(prep_lists, function(x) length(x$Term)))

x <- Reduce(merge_forest, prep_lists)
op <- par(no.readonly = TRUE)
plot(x, col.rows = group.col, reset_par = FALSE)
rl <- rev(rle(group.col)$lengths)
yy <- rev(cumsum(head(c(0, rl), -1)) + rl / 2) + 0.5
text(grconvertX(0.02, 'ndc'), yy, names(models),
     xpd = NA, srt = 90, adj = 0.5)
par(op)

## or simply
forest2(models)

## same but with headers/totals for each model
forest2(
  models, total = c(230, 240, 250),
  labels = function(x) gsub('sex|ph\\.ecog', '', x),
  header = list(c('Age', 'Sex', 'ECOG PS'), c('Age', 'Sex'), 'Age')
)


## other supported objects:
## crr, crr2, coxph, coxphf, logistf, formula

## competing risks regressions
library('cmprsk')
x <- with(transplant,
  crr(futime, as.integer(event) - 1,
      cov1 = cbind(age, model.matrix(~ sex + abo)[, -1, drop = FALSE])))
forest(x, ~ age + sex + abo, data = transplant)

library('cmprsk2')
x <- crr2(Surv(futime, event(censored) == death) ~ age + sex + abo, transplant)
forest(x)


## cox models
dat <- within(na.omit(transplant), {
  event_ind <- +(event == 'death')
  futime <- futime + 1e-8
})

## survival::coxph model
x <- coxph(Surv(futime, event_ind) ~ age + sex + abo, dat)
x <- forest(x, plot = FALSE)
plot(x, show_conf = TRUE, xlim = c(0, 5))


## coxphf::coxphf model
library('coxphf')
x <- coxphf(Surv(futime, event_ind) ~ age + sex + abo, dat)
forest(x, data = dat)


## logistf::logistf model
library('logistf')
x <- logistf(event_ind ~ age + sex + abo, dat)
forest(x, data = dat, plotArgs = list(show_conf = TRUE))


## brglm2
library('brglm2')
x <- glm(event_ind ~ age + sex + abo, dat, family = binomial('logit'),
         method = 'brglmFit', type = 'AS_mixed')
forest(x, plotArgs = list(show_conf = TRUE))

x <- glm(event_ind ~ age + sex + abo, dat, family = binomial('logit'),
         method = 'brglmFit', type = 'MPL_Jeffreys')
forest(x, plotArgs = list(show_conf = TRUE))


## odds ratios/fisher tests
dat <- data.frame(
  group = c('tx', 'pbo'),
  mut = replicate(10, sample(0:1, 50, replace = TRUE))
)
names(dat)[-1] <- paste0('gene', 1:10)

forest(
  group ~ ., dat,
  plotArgs = list(xlim = c(0, 20), show_conf = TRUE, cex = 1)
)

## excluding reference level
forest(
  group ~ ., dat, exclude_rows = '0$',
  plotArgs = list(xlim = c(0, 20), show_conf = TRUE, cex = 1)
)


raredd/forest documentation built on Feb. 19, 2024, 9:22 p.m.