In dotwhisker: Dot-and-Whisker Plots of Regression Results

View source: R/by_2sd.R

by_2sd

R Documentation

Rescale regression results by multiplying by 2 standard deviations

Description

by_2sd rescales regression results to facilitate making dot-and-whisker plots using dwplot.

Usage

by_2sd(df, dataset)

Arguments

`df`	A data frame including the variables `term` (names of independent variables), `estimate` (corresponding coefficient estimates), `std.error` (corresponding standard errors), and optionally `model` (when multiple models are desired on a single plot), such as those generated by `tidy`.
`dataset`	The data analyzed in the models whose results are recorded in `df`, or (preferably) the model matrix used by the models in `df`; the information required for complex models can more easily be generated from the model matrix than from the original data set. In many cases the model matrix can be extracted from the original model via `model.matrix`.
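To illustrate why the model matrix can carry information that the original data set lacks, the following base R sketch compares the two for a hypothetical model with a transformed predictor and an interaction. The formula and the object names (`fit`, `mm`, `coverage`) are assumptions chosen for illustration and do not come from the package.

```r
# Hypothetical model for illustration: a log-transformed term and an interaction.
fit <- lm(mpg ~ wt * log(hp), data = mtcars)

# Coefficient names, i.e. what would appear in the `term` column after tidying.
model_terms <- names(coef(fit))

# The model matrix has one column per term; the raw data only has `wt`.
mm <- model.matrix(fit)
coverage <- data.frame(
  term   = model_terms,
  in_raw = model_terms %in% names(mtcars),
  in_mm  = model_terms %in% colnames(mm)
)
print(coverage)
```

Every term is found among the columns of `mm`, whereas `log(hp)` and the interaction have no column of that name in `mtcars`, so their standard deviations cannot be read directly from the raw data.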

Details

For each predictor that is not binary, by_2sd multiplies the results from regression models saved as tidy data frames by twice the standard deviation of that variable in the dataset analyzed. Standardizing in this way yields coefficients that are directly comparable to each other and to those for untransformed binary predictors (Gelman 2008) and so facilitates plotting using dwplot. Note that the current version of by_2sd does not subtract the mean (in contrast to Gelman's (2008) formula). However, all estimates and standard errors of the independent variables are the same as if the mean were subtracted. The only difference from Gelman (2008) is in the intercept, which is shifted, for each variable in the model, by the coefficient times the mean of that variable.
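The equivalence described here can be checked by hand in base R. This is a minimal sketch and not the package's implementation: it assumes an ordinary `lm` fit on `mtcars` with two continuous predictors (`wt` and `disp`, chosen for illustration), rescales the slopes and standard errors manually, and compares them with refits on scaled and on centered-and-scaled inputs.

```r
slopes <- c("wt", "disp")
fit <- lm(mpg ~ wt + disp, data = mtcars)

# Rescale by hand: multiply each slope and standard error by 2 * sd(predictor).
two_sd <- 2 * sapply(mtcars[slopes], sd)
est_rescaled <- coef(fit)[slopes] * two_sd
se_rescaled <- summary(fit)$coefficients[slopes, "Std. Error"] * two_sd

# Refit with inputs divided by two standard deviations (no centering).
scaled <- transform(mtcars,
                    wt = wt / (2 * sd(wt)),
                    disp = disp / (2 * sd(disp)))
fit_scaled <- lm(mpg ~ wt + disp, data = scaled)

# Refit with inputs centered and then divided by two standard deviations.
centered <- transform(mtcars,
                      wt = (wt - mean(wt)) / (2 * sd(wt)),
                      disp = (disp - mean(disp)) / (2 * sd(disp)))
fit_centered <- lm(mpg ~ wt + disp, data = centered)

# Slopes and standard errors agree whether or not the mean is subtracted.
same_est_scaled <- all.equal(est_rescaled, coef(fit_scaled)[slopes])
same_est_centered <- all.equal(est_rescaled, coef(fit_centered)[slopes])
same_se_centered <- all.equal(
  se_rescaled, summary(fit_centered)$coefficients[slopes, "Std. Error"]
)

# Only the intercept differs: by the sum of coefficient times mean.
intercept_shift <- coef(fit_centered)[["(Intercept)"]] - coef(fit)[["(Intercept)"]]
expected_shift <- sum(coef(fit)[slopes] * colMeans(mtcars[slopes]))
```

Centering only moves the intercept, so a rescaling that works on the estimated parameters alone can skip it without affecting the slopes or their standard errors.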

An alternative available in some circumstances is to pass a model object to arm::standardize before passing the results to tidy and then on to dwplot. The advantages of by_2sd are that (1) it takes a tidy data frame as its input and so is not restricted to only those model objects that standardize accepts and (2) it is much more efficient because it operates on the parameters rather than refitting the original model with scaled data.

Value

A tidy data frame

References

Gelman, Andrew. 2008. "Scaling Regression Inputs by Dividing by Two Standard Deviations." Statistics in Medicine, 27:2865-2873.

Examples

library(broom)      # tidy()
library(dplyr)      # %>%
library(dotwhisker) # by_2sd()

data(mtcars)
m1 <- lm(mpg ~ wt + cyl + disp, data = mtcars)
m1_df <- tidy(m1) %>% by_2sd(mtcars) # create data frame of rescaled regression results

dotwhisker documentation built on June 8, 2025, 1:08 p.m.