relimp: Relative Importance of Predictors in a Regression Model

Description

Produces a summary of the relative importance of two predictors or two sets of predictors in a fitted model object.

Usage

relimp(object, set1=NULL,  set2=NULL, label1="set1", label2="set2", 
          subset=TRUE, 
          response.cat=NULL, ...)
## S3 method for class 'relimp'
print(x, digits=3, ...)

Arguments

object

A model object of class lm, glm, coxph, survreg, multinom, polr or gls

set1

An index or vector of indices for the effects to be included in the numerator of the comparison

set2

An index or vector of indices for the effects to be included in the denominator of the comparison

label1

A character string; mnemonic name for the variables in set1

label2

A character string; mnemonic name for the variables in set2

subset

Either a vector of numeric indices for the cases to be included in the standardization of effects, or a vector of logicals (TRUE for inclusion) whose length is the same as the number of rows in the model frame, object$model. The default choice is to include all cases in the model frame.

response.cat

If object is of class multinom, this is a character string specifying which regression is of interest (i.e., the regression that predicts the log odds of response.cat versus the model's reference category). The response.cat argument should be an element of object$lab, or NULL if object is not of class multinom.

...

For models of class glm, one may additionally set the dispersion parameter for the family (for example, dispersion=1.69). By default it is obtained from object. Supplying it here permits explicit allowance for over-dispersion, for example.
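As an illustration of the dispersion argument, the following sketch fits a Poisson model to simulated data, computes a Pearson-based dispersion estimate, and passes it to relimp. The simulated data, the object names (fit, phi) and the choice of a Pearson estimate are assumptions made for this example, not part of the package; the relimp call is skipped if the package is not installed.

```r
## Simulated count data (hypothetical, for illustration only)
set.seed(1)
n <- 200
x <- rnorm(n)
z <- rnorm(n)
y <- rpois(n, exp(0.5 + 0.4 * x + 0.2 * z))

fit <- glm(y ~ x + z, family = poisson)

## Pearson chi-squared statistic divided by the residual degrees of freedom:
## one common estimate of the dispersion (an assumption of this sketch)
phi <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)

## Pass the estimate through '...' as described above
if (requireNamespace("relimp", quietly = TRUE)) {
  print(relimp::relimp(fit, set1 = 2, set2 = 3, dispersion = phi))
}
```

Indices 2 and 3 refer to the coefficients of x and z here, since the intercept is coefficient 1.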

x

An object of class relimp

digits

The number of decimal places to be used in the printed summary. Default is 3.

Details

If set1 and set2 both have length 1, relative importance is measured by the ratio of the two standardized coefficients. Equivalently, this is the ratio of the standard deviations of the two contributions to the linear predictor, and it is this form that generalizes to comparing two sets of predictors rather than just a pair.
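The definition above can be sketched by hand in base R, using the simulated data from the Examples section. The helper effect_sd is a hypothetical name introduced for this illustration; it is not a function of the package, and the code shows only the point estimate, not the standard error.

```r
set.seed(182)
x <- rnorm(100)
z <- rnorm(100)
w <- rnorm(100)
y <- 3 + (2 * x) + z + w + rnorm(100)
fit <- lm(y ~ x + z + w)

X <- model.matrix(fit)   # columns: (Intercept), x, z, w
b <- coef(fit)

## Standard deviation of the contribution of a set of coefficients
## to the linear predictor (hypothetical helper, not part of relimp)
effect_sd <- function(idx) {
  sd(drop(X[, idx, drop = FALSE] %*% b[idx]))
}

## Single predictors: ratio of effect standard deviations ...
ratio_pair <- effect_sd(2) / effect_sd(3)
## ... which equals the ratio of standardized coefficients
ratio_std <- abs(b[["x"]]) * sd(x) / (abs(b[["z"]]) * sd(z))

## Sets of predictors: x versus (z, w) combined
ratio_sets <- effect_sd(2) / effect_sd(3:4)
```

If the description above is read literally, ratio_pair and ratio_sets should correspond to the "Ratio of effect standard deviations" reported by relimp(test, 2, 3) and relimp(test, 2, 3:4) in the Examples.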

The computed ratio is the square root of the variance-ratio quantity denoted ‘omega’ in Silber, Rosenbaum and Ross (1995). Estimated standard errors are calculated by the delta method, as described in that paper.

If set1 and set2 are unspecified, and if the tcltk package has been loaded, a dialog box is provided (by a call to pickFrom) for the choice of set1 and set2 from the available model coefficients.

Value

An object of class relimp, with at least the following components:

model

The call used to construct the model object summarized

sets

The two sets of indices specified as arguments

log.ratio

The natural logarithm of the ratio of effect standard deviations corresponding to the two sets specified

se.log.ratio

An estimated standard error for log.ratio

If dispersion was supplied as an argument, its value is stored as the dispersion component of the resultant object.
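The log.ratio and se.log.ratio components are enough to rebuild the approximate 95% confidence intervals shown in the printed summary. The sketch below assumes a normal approximation on the log scale with multiplier 1.96, and uses the rounded values printed for the x versus z comparison in the example output, so the last digit can differ slightly from the printed interval.

```r
## Rounded values taken from the printed summary for relimp(test, 2, 3);
## with a fitted object these would be the log.ratio and se.log.ratio components
log_ratio    <- 0.568
se_log_ratio <- 0.106

## Interval on the log scale, then back-transformed to the ratio scale
ci_log   <- log_ratio + 1.96 * se_log_ratio * c(-1, 1)
ci_ratio <- exp(ci_log)

round(ci_log, 3)
round(ci_ratio, 3)
```

Working on the log scale keeps the interval for the ratio strictly positive, which is why the interval for the sd ratio is not symmetric about the point estimate.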

Author(s)

David Firth d.firth@warwick.ac.uk

References

Silber, J. H., Rosenbaum, P. R. and Ross, R. N. (1995) Comparing the Contributions of Groups of Predictors: Which Outcomes Vary with Hospital Rather than Patient Characteristics? Journal of the American Statistical Association 90, 7–18.

See Also

relrelimp

Examples

set.seed(182)  ## an arbitrary number, just for reproducibility
x <- rnorm(100)
z <- rnorm(100)
w <- rnorm(100)
y <- 3 + (2 * x) + z + w + rnorm(100)
test <- lm(y ~ x + z + w)
print(test)
relimp(test, 2, 3)    #  compares effects of x and z
relimp(test, 2, 3:4)  #  compares effect of x with that of (z,w) combined
##
##  Data on housing and satisfaction, from Venables and Ripley
##  -- multinomial logit model
library(MASS)
library(nnet)
data(housing)
house.mult <- multinom(Sat ~ Infl + Type + Cont, weights = Freq,
  data = housing)
relimp(house.mult, set1 = 2:3, set2 = 7, response.cat = "High")

Example output

Call:
lm(formula = y ~ x + z + w)

Coefficients:
(Intercept)            x            z            w  
     3.0923       1.8040       0.9531       1.0088  


Relative importance summary for model
    lm(formula = y ~ x + z + w)

       Numerator effects ("set1")      Denominator effects ("set2") 
1                               x                                 z 

Ratio of effect standard deviations: 1.765
Log(sd ratio):                 0.568   (se 0.106)

Approximate 95% confidence interval for log(sd ratio): (0.36,0.776)
Approximate 95% confidence interval for sd ratio:      (1.434,2.173)
Warning message:
In 1.96 * (object$se.log.ratio) * c(-1, 1) :
  Recycling array of length 1 in array-vector arithmetic is deprecated.
  Use c() or as.vector() instead.


Relative importance summary for model
    lm(formula = y ~ x + z + w)

       Numerator effects ("set1")      Denominator effects ("set2") 
1                               x                                 z 
2                                                                 w 

Ratio of effect standard deviations: 1.349
Log(sd ratio):                 0.299   (se 0.085)

Approximate 95% confidence interval for log(sd ratio): (0.132,0.466)
Approximate 95% confidence interval for sd ratio:      (1.141,1.594)
Warning message:
In 1.96 * (object$se.log.ratio) * c(-1, 1) :
  Recycling array of length 1 in array-vector arithmetic is deprecated.
  Use c() or as.vector() instead.

# weights:  24 (14 variable)
initial  value 1846.767257 
iter  10 value 1747.045232
final  value 1735.041933 
converged

Relative importance summary for model
    multinom(formula = Sat ~ Infl + Type + Cont, data = housing, 
    weights = Freq)
response category High 


       Numerator effects ("set1")      Denominator effects ("set2") 
1                      InflMedium                          ContHigh 
2                        InflHigh                                   

Ratio of effect standard deviations: 2.736
Log(sd ratio):                 1.007   (se 0.264)

Approximate 95% confidence interval for log(sd ratio): (0.489,1.524)
Approximate 95% confidence interval for sd ratio:      (1.631,4.591)
Warning message:
In 1.96 * (object$se.log.ratio) * c(-1, 1) :
  Recycling array of length 1 in array-vector arithmetic is deprecated.
  Use c() or as.vector() instead.

relimp documentation built on May 2, 2019, 2:02 p.m.