standardize: Function for Standardizing Regression Predictors by Centering...


Description

Numeric variables that take on more than two values are each rescaled to have a mean of 0 and a standard deviation of 0.5. Binary variables are rescaled to have a mean of 0 and a difference of 1 between their two categories. Non-numeric variables that take on more than two values are left unchanged, as are variables that take on only one value.
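The rescaling described above can be illustrated in base R. This is a minimal sketch, not the package's own code: the helper name scale_2sd and the seed are assumptions introduced here for illustration.

```r
# Hypothetical helper (not part of arm): center, then divide by 2 sd
scale_2sd <- function(v) (v - mean(v)) / (2 * sd(v))

set.seed(1)                      # assumed seed, only for reproducibility
x <- rnorm(100, mean = 2, sd = 1)
z <- scale_2sd(x)
mean(z)                          # 0, up to floating-point error
sd(z)                            # 0.5, up to floating-point error

# Binary variable: subtracting the mean gives mean 0 and keeps
# a difference of 1 between the two categories
b <- rbinom(100, 1, 0.5)
b_centered <- b - mean(b)
mean(b_centered)                 # 0, up to floating-point error
diff(range(b_centered))          # 1, up to floating-point error
```

Dividing by two standard deviations rather than one puts the coefficient of a continuous predictor on roughly the same scale as that of a balanced binary predictor, which is the motivation given in the Gelman (2008) reference.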

Usage

## S4 method for signature 'lm'
standardize(object, unchanged = NULL, 
    standardize.y = FALSE, binary.inputs = "center")
## S4 method for signature 'glm'
standardize(object, unchanged = NULL, 
    standardize.y = FALSE, binary.inputs = "center")
## S4 method for signature 'merMod'
standardize(object, unchanged = NULL, 
    standardize.y = FALSE, binary.inputs = "center")
## S4 method for signature 'polr'
standardize(object, unchanged = NULL, 
    standardize.y = FALSE, binary.inputs = "center")

Arguments

object

a fitted model object of class lm, glm, merMod, or polr

unchanged

vector of names of parameters to leave unstandardized

standardize.y

if TRUE, the outcome variable is standardized also

binary.inputs

how to rescale binary variables; one of the options listed under Details (default "center")

Details

Options for binary.inputs:

"0/1": rescale so that the lower value is 0 and the upper is 1

"-0.5/0.5": rescale so that the lower value is -0.5 and the upper is 0.5

"center": rescale so that the mean of the data is 0 and the difference between the two categories is 1

"full": rescale by subtracting the mean and dividing by 2 sd's

"leave.alone": do nothing
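The binary.inputs options can be sketched in base R as follows. This is an illustration written from the descriptions above, not the implementation used by arm; the function name rescale_binary_sketch is hypothetical, and the sketch assumes a numeric vector with exactly two distinct values and no missing data.

```r
# Hypothetical illustration of the binary.inputs options (not arm's code).
# Assumes v is numeric, has exactly two distinct values, and has no NAs.
rescale_binary_sketch <- function(v, binary.inputs = "center") {
  lo <- min(v)
  hi <- max(v)
  unit <- (v - lo) / (hi - lo)            # lower value -> 0, upper value -> 1
  switch(binary.inputs,
    "0/1"         = unit,
    "-0.5/0.5"    = unit - 0.5,
    "center"      = unit - mean(unit),    # mean 0, categories 1 apart
    "full"        = (v - mean(v)) / (2 * sd(v)),
    "leave.alone" = v,
    stop("unknown option: ", binary.inputs))
}

v <- c(0, 1, 1, 0, 1)
rescale_binary_sketch(v, "0/1")        # 0 1 1 0 1
rescale_binary_sketch(v, "-0.5/0.5")   # -0.5 0.5 0.5 -0.5 0.5
rescale_binary_sketch(v, "center")     # -0.6 0.4 0.4 -0.6 0.4
```

With "center", the coefficient of the binary predictor still compares the two categories directly, while the intercept refers to the average of the data.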

Author(s)

Andrew Gelman, Yu-Sung Su

References

Andrew Gelman. (2008). “Scaling regression inputs by dividing by two standard deviations.” Statistics in Medicine 27: 2865–2873. http://www.stat.columbia.edu/~gelman/research/published/standardizing7.pdf

See Also

rescale

Examples

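  # Load arm for standardize(), rescale(), display(), and invlogit();
  # it also loads MASS, which provides polr()
  library(arm)

  # Set up the fake data
  n <- 100
  x <- rnorm(n, 2, 1)
  x1 <- rnorm(n)
  x1 <- (x1 - mean(x1)) / (2 * sd(x1))   # standardize by hand: center, divide by 2 sd
  x2 <- rbinom(n, 1, .5)
  b0 <- 1
  b1 <- 1.5
  b2 <- 2
  y <- rbinom(n, 1, invlogit(b0 + b1*x1 + b2*x2))   # binary outcome
  y2 <- sample(1:5, n, replace=TRUE)                # ordered outcome with 5 levels

  # Logistic regression: raw x, rescaled x, then standardize() applied to the fit
  M1 <- glm(y ~ x, family=binomial(link="logit"))
  display(M1)
  M1.1 <- glm(y ~ rescale(x), family=binomial(link="logit"))
  display(M1.1)
  M1.2 <- standardize(M1.1)
  display(M1.2)
  # M1.1 and M1.2 should give the same estimates

  # Ordered logistic regression: same three steps with polr()
  M2 <- polr(ordered(y2) ~ x)
  display(M2)
  M2.1 <- polr(ordered(y2) ~ rescale(x))
  display(M2.1)
  M2.2 <- standardize(M2.1)
  display(M2.2)
  # M2.1 and M2.2 should give the same estimates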

Example output

Loading required package: MASS
Loading required package: Matrix
Loading required package: lme4

arm (Version 1.9-3, built: 2016-11-21)

Working directory is /work/tmp

glm(formula = y ~ x, family = binomial(link = "logit"))
            coef.est coef.se
(Intercept) 1.53     0.68   
x           0.29     0.33   
---
  n = 100, k = 2
  residual deviance = 68.5, null deviance = 69.3 (difference = 0.8)
glm(formula = y ~ rescale(x), family = binomial(link = "logit"))
            coef.est coef.se
(Intercept) 2.12     0.33   
rescale(x)  0.57     0.66   
---
  n = 100, k = 2
  residual deviance = 68.5, null deviance = 69.3 (difference = 0.8)
glm(formula = y ~ rescale(z.x), family = binomial(link = "logit"))
             coef.est coef.se
(Intercept)  2.12     0.33   
rescale(z.x) 0.57     0.66   
---
  n = 100, k = 2
  residual deviance = 68.5, null deviance = 69.3 (difference = 0.8)

Re-fitting to get Hessian

polr(formula = ordered(y2) ~ x)
    coef.est coef.se
x    0.12     0.18  
1|2 -1.35     0.45  
2|3 -0.43     0.42  
3|4  0.60     0.42  
4|5  1.45     0.45  
---
n = 100, k = 5 (including 4 intercepts)
residual deviance = 318.7, null deviance is not computed by polr

Re-fitting to get Hessian

polr(formula = ordered(y2) ~ rescale(x))
           coef.est coef.se
rescale(x)  0.23     0.36  
1|2        -1.59     0.27  
2|3        -0.67     0.21  
3|4         0.36     0.20  
4|5         1.21     0.24  
---
n = 100, k = 5 (including 4 intercepts)
residual deviance = 318.7, null deviance is not computed by polr

Re-fitting to get Hessian

polr(formula = ordered(y2) ~ rescale(z.x))
             coef.est coef.se
rescale(z.x)  0.23     0.36  
1|2          -1.59     0.27  
2|3          -0.67     0.21  
3|4           0.36     0.20  
4|5           1.21     0.24  
---
n = 100, k = 5 (including 4 intercepts)
residual deviance = 318.7, null deviance is not computed by polr

arm documentation built on May 31, 2017, 3:34 a.m.
