The standardize package provides tools for standardizing variables prior to regression (i.e. placing all of the variables to be used in a regression on similar scales).
When all of the predictors in a regression are on a similar scale, their effect sizes are easier to compare. In the case of gaussian regression, placing the response on unit scale also eases interpretation. Standardizing regression variables also has computational benefits in the case of mixed effects regressions, and makes it simpler to determine reasonable priors in Bayesian regressions. To view the package vignette, call vignette("using-standardize", package = "standardize"). To see the version history, call standardize.news().
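As a rough illustration of the comparability point, the base R sketch below fits the same gaussian regression on raw and on unit-scale variables. The data, the variable names (age_years, income_usd), and the helper z() are invented for this example and are not part of the package.

```r
set.seed(1)
n <- 200

# Hypothetical predictors on very different raw scales
d <- data.frame(
  age_years  = rnorm(n, mean = 40, sd = 10),
  income_usd = rnorm(n, mean = 50000, sd = 15000)
)
d$y <- 0.05 * d$age_years + 0.00003 * d$income_usd + rnorm(n)

# Raw fit: coefficient sizes depend on each predictor's units
raw_fit <- lm(y ~ age_years + income_usd, data = d)

# z() is a hypothetical helper: center to mean 0 and rescale to SD 1
z <- function(x) as.numeric(scale(x))

# Standardized fit: each slope is the change in y (in SDs of y)
# per one-SD change in the predictor, so the slopes share a unit
std_fit <- lm(z(y) ~ z(age_years) + z(income_usd), data = d)

print(coef(raw_fit))
print(coef(std_fit))
```

The two fits describe the same model (the slope t-statistics are unchanged by the rescaling); only the units of the coefficients differ.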
The named_contr_sum function gives named sum contrasts to unordered factors, and allows the absolute value of the non-zero cells in the contrast matrix to be specified through its scale argument. The scaled_contr_poly function gives orthogonal polynomial contrasts to ordered factors, and allows the standard deviation of the columns in the contrast matrix to be specified through its scale argument. The scale_by function allows numeric variables to be scaled conditioning on factors, such that the numeric variable has the same mean and standard deviation within each level of a factor (or the interaction of several factors), with the standard deviation specified through its scale argument.
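The three ideas above can be approximated with base R alone. The sketch below is not the package's implementation: it uses contr.sum, contr.poly, and ave in place of named_contr_sum, scaled_contr_poly, and scale_by, and it assumes that column names come from the factor levels, that the sample standard deviation is used, and that the within-level mean is 0. All data and object names are made up.

```r
set.seed(2)
scale_val <- 0.5  # plays the role of the scale argument

# 1. Named sum contrasts whose non-zero cells have absolute value scale_val
f <- factor(c("a", "b", "c", "a", "b", "c"))
k <- nlevels(f)
sum_contr <- contr.sum(k) * scale_val
dimnames(sum_contr) <- list(levels(f), levels(f)[-k])

# 2. Orthogonal polynomial contrasts whose columns have SD scale_val
o <- factor(c("low", "mid", "high"),
            levels = c("low", "mid", "high"), ordered = TRUE)
p <- contr.poly(nlevels(o))
poly_contr <- sweep(p, 2, apply(p, 2, sd), "/") * scale_val

# 3. Numeric variable scaled within each level of a factor:
#    mean 0 and SD scale_val inside every group
g <- factor(rep(c("s1", "s2", "s3"), each = 5))
x <- rnorm(15, mean = rep(c(10, 50, 200), each = 5), sd = rep(c(1, 5, 20), each = 5))
x_by_g <- ave(x, g, FUN = function(v) scale_val * (v - mean(v)) / sd(v))

print(sum_contr)
print(poly_contr)
print(tapply(x_by_g, g, sd))
```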
The standardize function creates a standardized object whose elements can be used in regression fitting functions, ensuring that all of the predictors are on the same scale. This is done by passing the function's scale argument to named_contr_sum for all unordered factors (and also any predictor with only two unique values regardless of its original class), to scaled_contr_poly for all ordered factors, and to scale_by for numeric variables wrapped in calls to that function. For numeric predictors not contained in a scale_by call, scale is called, ensuring that the result has standard deviation equal to the scale argument to standardize. Gaussian responses are always placed on unit scale, using scale (or scale_by if the function was used on the left-hand side of the regression formula). Offsets for gaussian models are divided by the standard deviation of the raw response (within-factor-level if scale_by is used on the response).
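The per-variable rules described above can be sketched in base R as follows. This is a simplified stand-in, not the package's code: standardize_predictor is a hypothetical helper, the scale_by case is left out, and the data frame and its column names are invented for illustration.

```r
# Hypothetical helper: NOT part of the standardize package
standardize_predictor <- function(x, scale_val = 1) {
  n_unique <- length(unique(x[!is.na(x)]))
  if (is.ordered(x) && n_unique != 2) {
    # Ordered factor: polynomial contrasts with column SD = scale_val
    p <- contr.poly(nlevels(x))
    contrasts(x) <- sweep(p, 2, apply(p, 2, sd), "/") * scale_val
    x
  } else if (is.factor(x) || is.character(x) || n_unique == 2) {
    # Unordered factor, or any two-valued predictor: scaled sum contrasts
    f <- factor(x, ordered = FALSE)
    k <- nlevels(f)
    cm <- contr.sum(k) * scale_val
    dimnames(cm) <- list(levels(f), levels(f)[-k])
    contrasts(f) <- cm
    f
  } else {
    # Remaining numeric predictors: center, then rescale to SD = scale_val
    as.numeric(scale(x)) * scale_val
  }
}

d <- data.frame(
  group   = factor(c("a", "b", "c", "a", "b", "c")),
  dose    = factor(c("low", "mid", "high", "low", "mid", "high"),
                   levels = c("low", "mid", "high"), ordered = TRUE),
  treated = c(0, 1, 0, 1, 0, 1),
  weight  = c(61.2, 70.5, 82.1, 58.4, 77.0, 90.3)
)
std <- lapply(d, standardize_predictor, scale_val = 0.5)

# Gaussian response on unit scale; offset divided by the raw response SD
y <- c(12.1, 15.3, 9.8, 14.0, 11.2, 16.7)
off <- c(1.0, 2.0, 1.5, 0.5, 1.0, 2.5)
y_std <- as.numeric(scale(y))
off_std <- off / sd(y)

print(contrasts(std$treated))
print(sd(std$weight))
```

Note that the two-valued numeric column treated ends up as a factor with sum contrasts rather than as a rescaled number, mirroring the rule stated above for predictors with only two unique values.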
Christopher D. Eager <eager.stats@gmail.com>