There are several R functions out there that will return \'standardized coefficients\' of one sort or another. The capabilities and assumptions of these functions differ on
They also differ in what kind of data they return: just coefficients,
full lm
objects, or something else.
Reviewed here are the following functions:
Three of these functions work fine for additive effects models - models with no interaction or polynomial terms. However, they fail to produce the expected results where higher order terms are present. Two of these produce valid models, in the sense that they produce the correct predicted values and residuals. However, they are difficult to interpret when the model includes higher order terms, because similar-looking variables are not on the same scales.
Package::functions that return standardized coefficients:
Standardized coefficients are usually described as the coefficients estimated for a model when all of the variables have been standardized. There are some variations on this basic idea, but they all begin by first - in principle - standardizing the variables.
I stipulate \"in principle\" because in some cases you do not actually have to transform your data before estimating a model. Where your model consists of only first-order, continuous variables, all you need are the standard deviations of each of your variables to calculate standardized coefficients from the unstandardized coefficients (coefficients in your data\'s original units).
The shortcut formula for transforming coefficients, usually taught in introductory regression courses, is $$\beta = \sigma_x/\sigma_y \times b$$, where $b$ is an unstandardized coefficient, $\sigma_x$ is the standard deviation of the independent variable in question, and $\sigma_y$ is the standard deviation of the dependent variable.
In quite a bit of statistical software - SAS, Stata, SPSS, and the first three R functions listed above - this formula is blindly applied to models with higher order terms as well, that is, with interaction and polynomial terms. While these coefficients are useful for many purposes, they are not consistent with the basic idea of standardized coefficients, because they are not the coefficients one would estimate were the data transformed first. Just what they do represent is convoluted.
Let\'s start with R\'s common mtcars
data, and estimate a model
where these functions fail.
# First, calculate by "hand" zmpg <- scale(mtcars$mpg) zdisp <- scale(mtcars$disp) zwt <- scale(mtcars$wt) coefficients(lm(zmpg ~ zdisp*zwt)) # Now, fit an unstandardized model fit <- lm(mpg ~ disp*wt, mtcars) # And apply the available functions QuantPsyc::lm.beta(fit) # warning given, wrong results lm.beta::lm.beta(fit) lsr::standardCoefs(fit) # agrees with lm.beta::lm.beta, but no intercept arm::standardize(fit, standardize.y=TRUE) # uses 2 sd reghelper::beta(fit) # Correct! effectsize::standardize(fit) # also correct!
The QuantPsyc
function uses the classic formula, but
is implemented in a way that produces a warning with
higher order terms. The lm.beta
and lsr
functions also apply
the classic shortcut formula, which is at odds with the concept.
The arm
function insists on scaling by 2 standard deviations,
an alternative concept.
So none of these really returns the coefficients we want.
stdBeta::stdBeta(fit)
This gives us the coefficients we would gotten \'by hand\'.
R\'s formula specification does not allow polynomial terms to be formed as tensor products like other statistical software, which leaves three commonly used methods for specifying polynomial regressions. The first is to \"inhibit\" interpretation of the exponentiation operator.
coefficients(lm(zmpg ~ zwt + I(zwt^2))) # by "hand" fitsq <- lm(mpg ~ wt + I(wt^2), mtcars) stdBeta::stdBeta(fitsq) reghelper::beta(fitsq) # Correct! effectsize::standardize(fitsq) # also correct!
The second method would be to use the poly
function.
coefficients(lm(zmpg ~ zwt + I(zwt^2))) # by "hand" fitpoly <- lm(mpg ~ poly(wt,2,raw=TRUE), data=mtcars) stdBeta::stdBeta(fitpoly) reghelper::beta(fitpoly) # Error effectsize::standardize(fitpoly) # also correct!
mtcars$am <- factor(mtcars$am) coefficients(lm(zmpg ~ zwt*am, data=mtcars)) # by "hand" fitfactor <- lm(mpg ~ wt*am, data=mtcars) stdBeta::stdBeta(fitfactor) reghelper::beta(fitfactor) # scales factors by default reghelper::beta(fitfactor, skip="am") # Correct effectsize::standardize(fitfactor) # also correct!
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.