gscale | R Documentation |
gscale() standardizes variables by dividing them by 2 standard deviations and mean-centering them by default. It contains options for handling binary variables separately. gscale() is a fork of rescale from the arm package; the key feature difference is that gscale() will perform the same functions for variables in svydesign objects. gscale() is also more user-friendly in that it is more flexible in how it accepts input.
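As a rough illustration of the default behavior described above, the following base R sketch centers a numeric vector and divides it by two standard deviations. The helper two_sd_scale() is hypothetical and is not part of jtools or arm; it only shows the arithmetic, not how gscale() is implemented.

```r
# Hypothetical helper (not part of jtools): mean-center a numeric vector,
# then divide it by n_sd standard deviations.
two_sd_scale <- function(x, n_sd = 2) {
  (x - mean(x, na.rm = TRUE)) / (n_sd * sd(x, na.rm = TRUE))
}

set.seed(1)
x <- rnorm(100, mean = 2, sd = 1)
z <- two_sd_scale(x)

mean(z)  # approximately 0
sd(z)    # 0.5 up to floating-point error, i.e. 1 / n_sd
```

Dividing by two standard deviations rather than one puts continuous inputs on roughly the same footing as a binary predictor with equal proportions, whose standard deviation is 0.5 (Gelman, 2008).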
gscale(
  data = NULL,
  vars = NULL,
  binary.inputs = "center",
  binary.factors = FALSE,
  n.sd = 2,
  center.only = FALSE,
  scale.only = FALSE,
  weights = NULL,
  apply.weighted.contrasts = getOption("jtools-weighted.contrasts", FALSE),
  x = NULL,
  messages = FALSE
)
data |
A data frame or survey design. Only needed if you would like to
rescale multiple variables at once. If |
vars |
If |
binary.inputs |
Options for binary variables. Default is "center". |
binary.factors |
Coerce two-level factors to numeric and apply scaling functions to them? Default is FALSE. |
n.sd |
How many standard deviations should the variables be divided
by? Default for gscale() is 2. |
center.only |
A logical value indicating whether you would like to mean-center the values, but not scale them. |
scale.only |
A logical value indicating whether you would like to scale the values, but not mean-center them. |
weights |
A vector of weights equal in length to |
apply.weighted.contrasts |
Factor variables cannot be scaled, but you
can set the contrasts such that the intercept in a regression model will
reflect the true mean (assuming all other variables are centered). If set
to TRUE, the argument will apply weighted effects coding to all factors.
This is similar to the R default effects coding, but weights according to
how many observations are at each level. An adapted version of
|
x |
Deprecated. Pass numeric vectors to |
messages |
Print messages when variables are not processed due to being non-numeric or all missing? Default is FALSE. |
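The weighted effects coding mentioned under apply.weighted.contrasts can be sketched in base R as follows. The helper weighted_effects() is hypothetical and is not the function jtools uses internally; it starts from contr.sum() and replaces the omitted level's row with -n_j / n_ref, following the idea in Grotenhuis et al. (2017). With this coding, the intercept of a model containing only the factor equals the observed mean of the outcome.

```r
# Hypothetical helper (not the jtools implementation): weighted effects
# coding for a factor, with the LAST level as the omitted level.
weighted_effects <- function(f) {
  n <- as.numeric(table(f))   # observations per level
  k <- nlevels(f)
  cm <- contr.sum(k)          # unweighted effects coding: last row is all -1
  cm[k, ] <- -n[-k] / n[k]    # weight the omitted level's row by -n_j / n_ref
  cm
}

set.seed(2)
# Deliberately unbalanced factor: 30, 20, and 10 observations per level
f <- factor(rep(c("a", "b", "c"), times = c(30, 20, 10)))
y <- rnorm(60) + as.integer(f)

contrasts(f) <- weighted_effects(f)
fit <- lm(y ~ f)

# TRUE: the intercept matches the observed mean of y
all.equal(unname(coef(fit)[1]), mean(y))
```

Under the unweighted contr.sum() coding, the intercept would instead be the unweighted average of the group means, which differs from mean(y) whenever the groups are unbalanced.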
This function is adapted from the rescale function of the arm package. It is named gscale() after the popularizer of this scaling method, Andrew Gelman. By default, it works just like rescale. But it contains many additional options and can also accept multiple types of input without breaking a sweat.
Only numeric variables are altered when in a data.frame or survey design. Character variables, factors, etc. are skipped.
For those dealing with survey data, if you provide a survey.design object you can rest assured that the mean-centering and scaling are performed with help from the svymean() and svyvar() functions, respectively. Supporting survey designs was among the primary motivations for creating this function. gscale() will not center or scale the weights variables defined in the survey design unless the user specifically requests them in the x = argument.
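To make the idea of weighted centering and scaling concrete, here is a minimal base R sketch. The helper weighted_scale() is hypothetical, and the variance formula (weighted squared deviations divided by the sum of the weights) is an assumption made for illustration; svymean() and svyvar() account for the full survey design and may use a different denominator.

```r
# Hypothetical helper (base R only): weighted mean-centering and scaling.
# The variance below is an assumed, simple weighted form and is not
# necessarily what svyvar() computes.
weighted_scale <- function(x, w, n_sd = 2) {
  m <- weighted.mean(x, w)
  v <- sum(w * (x - m)^2) / sum(w)
  (x - m) / (n_sd * sqrt(v))
}

set.seed(3)
x <- rnorm(50, mean = 10, sd = 3)
w <- runif(50, min = 0.5, max = 2)
z <- weighted_scale(x, w)

weighted.mean(z, w)           # approximately 0
sqrt(sum(w * z^2) / sum(w))   # 0.5 up to floating-point error
```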
Jacob Long jacob.long@sc.edu
Gelman, A. (2008). Scaling regression inputs by dividing by two standard deviations. Statistics in Medicine, 27, 2865–2873. http://www.stat.columbia.edu/~gelman/research/published/standardizing7.pdf
Grotenhuis, M. te, Pelzer, B., Eisinga, R., Nieuwenhuis, R., Schmidt-Catran, A., & Konig, R. (2017). When size matters: Advantages of weighted effect coding in observational studies. International Journal of Public Health, 62, 163–167. https://doi.org/10.1007/s00038-016-0901-1 (open access)
j_summ is a replacement for the summary function for regression models. On request, it will center and/or standardize variables before printing its output.
Standardization, scaling, and centering tools: center(), center_mod(), scale_mod(), standardize()
x <- rnorm(10, 2, 1)
x2 <- rbinom(10, 1, .5)
# Basic use
gscale(x)
# Normal standardization
gscale(x, n.sd = 1)
# Scale only
gscale(x, scale.only = TRUE)
# Center only
gscale(x, center.only = TRUE)
# Binary inputs
gscale(x2, binary.inputs = "0/1")
gscale(x2, binary.inputs = "full") # treats it like a continuous var
gscale(x2, binary.inputs = "-0.5/0.5") # keep scale, center at zero
gscale(x2, binary.inputs = "center") # mean center it
# Data frame as input
# loops through each numeric column
gscale(data = mtcars, binary.inputs = "-0.5/0.5")
# Specified vars in data frame
gscale(mtcars, vars = c("hp", "wt", "vs"), binary.inputs = "center")
# Weighted inputs
wts <- runif(10, 0, 1)
gscale(x, weights = wts)
# If using a weights column of data frame, give its name
mtcars$weights <- runif(32, 0, 1)
gscale(mtcars, weights = weights) # will skip over mtcars$weights
# If using a weights column of data frame, can still select variables
gscale(mtcars, vars = c("hp", "wt", "vs"), weights = weights)
# Survey designs
if (requireNamespace("survey")) {
  library(survey)
  data(api)
  ## Create survey design object
  dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw,
                      data = apistrat, fpc = ~fpc)
  # Creating test binary variable
  dstrat$variables$binary <- rbinom(200, 1, 0.5)
  gscale(data = dstrat, binary.inputs = "-0.5/0.5")
  gscale(data = dstrat, vars = c("api00", "meals", "binary"),
         binary.inputs = "-0.5/0.5")
}
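The binary.inputs choices used in the examples can be approximated in base R as shown here. The helper scale_binary() is hypothetical, and its behavior is an assumption based on the comments in the examples ("full" treats the variable like a continuous one, "-0.5/0.5" keeps the scale and centers at zero, "center" mean-centers it); "0/1" is assumed to leave the values unchanged. It also assumes the input is already coded 0/1.

```r
# Hypothetical helper (not part of jtools): one reading of the
# binary.inputs options for a variable coded 0/1.
scale_binary <- function(x, how = c("center", "0/1", "-0.5/0.5", "full")) {
  how <- match.arg(how)
  switch(how,
    "0/1"      = x,                           # assumed: leave as is
    "-0.5/0.5" = x - 0.5,                     # keep scale, center at zero
    "center"   = x - mean(x),                 # mean-center only
    "full"     = (x - mean(x)) / (2 * sd(x))  # treat like a continuous var
  )
}

b <- c(0, 1, 1, 0, 1, 1, 1, 0)

scale_binary(b, "-0.5/0.5")      # every value is either -0.5 or 0.5
mean(scale_binary(b, "center"))  # 0
sd(scale_binary(b, "full"))      # 0.5 up to floating-point error
```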