Function for Standardizing Regression Predictors by Centering and Dividing by 2 sd's

Description

Numeric variables that take on more than two values are each rescaled to have a mean of 0 and a sd of 0.5; Binary variables are rescaled to have a mean of 0 and a difference of 1 between their two categories; Non-numeric variables that take on more than two values are unchanged; Variables that take on only one value are unchanged

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
## S4 method for signature 'lm'
standardize(object, unchanged = NULL, 
    standardize.y = FALSE, binary.inputs = "center")
## S4 method for signature 'glm'
standardize(object, unchanged = NULL, 
    standardize.y = FALSE, binary.inputs = "center")
## S4 method for signature 'merMod'
standardize(object, unchanged = NULL, 
    standardize.y = FALSE, binary.inputs = "center")
## S4 method for signature 'polr'
standardize(object, unchanged = NULL, 
    standardize.y = FALSE, binary.inputs = "center")

Arguments

object

an object of class lm or glm

unchanged

vector of names of parameters to leave unstandardized

standardize.y

if TRUE, the outcome variable is standardized also

binary.inputs

options for standardizing binary variables

Details

"0/1" (rescale so that the lower value is 0 and the upper is 1) "-0.5/0.5" (rescale so that the lower value is -0.5 and upper is 0.5) "center" (rescale so that the mean of the data is 0 and the difference between the two categories is 1) "full" (rescale by subtracting the mean and dividing by 2 sd's) "leave.alone" (do nothing)

Author(s)

Andrew Gelman gelman@stat.columbia.edu Yu-Sung Su suyusung@tsinghua.edu.cn

References

Andrew Gelman. (2008). “Scaling regression inputs by dividing by two standard deviations.” Statistics in Medicine 27: 2865–2873. http://www.stat.columbia.edu/~gelman/research/published/standardizing7.pdf

See Also

rescale

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
  # Set up the fake data
  n <- 100
  x <- rnorm (n, 2, 1)
  x1 <- rnorm (n)
  x1 <- (x1-mean(x1))/(2*sd(x1))   # standardization
  x2 <- rbinom (n, 1, .5)
  b0 <- 1
  b1 <- 1.5
  b2 <- 2
  y <- rbinom (n, 1, invlogit(b0+b1*x1+b2*x2))
  y2 <- sample(1:5, n, replace=TRUE)
  M1 <- glm (y ~ x, family=binomial(link="logit"))
  display(M1)
  M1.1 <- glm (y ~ rescale(x), family=binomial(link="logit"))
  display(M1.1)
  M1.2 <- standardize(M1.1)
  display(M1.2)
  # M1.1 & M1.2 should be the same
  M2 <- polr(ordered(y2) ~ x)
  display(M2)
  M2.1 <- polr(ordered(y2) ~ rescale(x))
  display(M2.1)
  M2.2 <- standardize(M2.1)
  display(M2.2)
  # M2.1 & M2.2 should be the same