Package to calculate relative importance metrics for linear models
relaimpo calculates several relative importance metrics for the linear model.
The recommended metrics are
lmg (R^2 partitioned by averaging over orders, as in Lindeman, Merenda and Gold (1980, p. 119ff)) and
pmvd (a metric newly proposed by Feldman (2005); available in the non-US version only).
For completeness, several other metrics are also on offer. Other packages with related topics:
relaimpo calculates the metrics and also offers the possibility of bootstrapping them and of displaying results in print and graphically.
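As a minimal sketch of this workflow, the following uses the package functions calc.relimp, boot.relimp and booteval.relimp, which are not named in the text above. The swiss data set (shipped with R) and the choice of regressors are assumptions made purely for illustration.

```r
library(relaimpo)

# Illustrative model only: the data set and regressors are arbitrary choices
fit <- lm(Fertility ~ Agriculture + Education + Catholic + Infant.Mortality,
          data = swiss)

# Calculate several metrics at once; rela = FALSE keeps them on the R^2 scale
metrics <- calc.relimp(fit, type = c("lmg", "first", "last"), rela = FALSE)
print(metrics)   # printed summary
plot(metrics)    # bar plots of the metrics

# Bootstrap the lmg metric (b is kept small here to limit run time)
set.seed(1)
boot <- boot.relimp(fit, b = 200, type = "lmg", rank = TRUE, diff = TRUE)
booteval.relimp(boot, level = 0.95)   # bootstrap confidence intervals
```

With rela = FALSE, the lmg shares should add up to the model's R^2, which is a convenient sanity check.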
It is possible to designate a subset of variables as adjustment variables that always stay in the model so that relative importance is only assessed among the remaining variables.
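A hedged sketch of adjustment variables, assuming the always argument of calc.relimp is the mechanism meant here; the model, the swiss data set and the choice of Infant.Mortality as adjustment variable are illustrative assumptions only.

```r
library(relaimpo)

# Illustrative model only: the data set and regressors are arbitrary choices
fit <- lm(Fertility ~ Agriculture + Education + Catholic + Infant.Mortality,
          data = swiss)

# Infant.Mortality stays in every sub-model; relative importance is
# assessed only among the three remaining regressors
adj <- calc.relimp(fit, type = "lmg", always = "Infant.Mortality")
print(adj)
```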
Models can have up to 2-way interactions that are treated hierarchically - i.e. an interaction is only allowed in a model that also contains all its main effects.
If the model contains interactions, only the metric lmg can be used.
Observations with missing values are by default excluded from the analysis for most functions.
mianalyze.relimp makes it possible to draw conclusions from a set of multiply imputed data sets.
This function is currently more restrictive than the rest of the package in terms of the data types and models
that can be used (when summarizing the multiply imputed samples without calculating confidence intervals,
all possibilities available elsewhere are also applicable in mianalyze.relimp).
relaimpo accommodates complex survey designs by making use of the facilities in package survey. Currently, interactions and calculated variables cannot be combined with a complex survey design in the bootstrapping functions.
Internally, this package uses the function
nchoosek from package vsn, authored by Wolfgang Huber and available under the LGPL.
Furthermore, it uses a modified version of the function carscore from care by Verena Zuber and Korbinian Strimmer.
lmg and pmvd are computationally intensive. Although they are calculated from the
covariance matrix, which saves substantial computing time compared with carrying out actual regressions,
these methods still take a long time for problems with many regressors.
This is particularly relevant in combination with bootstrapping.
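To illustrate why the covariance matrix is sufficient and why the effort grows quickly with the number of regressors, here is a base-R sketch of the lmg idea. It is not the package's internal code: the helper names r2_from_cov and lmg_from_cov are hypothetical, and the swiss data set is an arbitrary example.

```r
# R^2 of the response (row/column 1 of covariance matrix S) regressed on
# the variables in idx, computed from S alone (no lm() call needed)
r2_from_cov <- function(S, idx) {
  if (length(idx) == 0) return(0)
  Sxy <- S[idx, 1, drop = FALSE]
  Sxx <- S[idx, idx, drop = FALSE]
  drop(t(Sxy) %*% solve(Sxx, Sxy)) / S[1, 1]
}

# lmg share of each regressor: the R^2 increase from adding it, averaged
# over all orderings (written as a weighted sum over subsets of the others)
lmg_from_cov <- function(S) {
  p <- nrow(S) - 1
  regs <- 2:(p + 1)
  shares <- setNames(numeric(p), colnames(S)[regs])
  for (k in seq_len(p)) {
    others <- regs[-k]
    for (m in 0:(p - 1)) {
      # fraction of orderings in which exactly m given regressors precede k
      w <- factorial(m) * factorial(p - 1 - m) / factorial(p)
      subsets <- if (m == 0) list(integer(0)) else
        combn(length(others), m, function(i) others[i], simplify = FALSE)
      for (s in subsets) {
        shares[k] <- shares[k] +
          w * (r2_from_cov(S, c(s, regs[k])) - r2_from_cov(S, s))
      }
    }
  }
  shares
}

vars <- c("Fertility", "Agriculture", "Education", "Catholic", "Infant.Mortality")
S <- cov(swiss[, vars])      # response in the first row/column
lmg <- lmg_from_cov(S)
print(lmg)
r2_full <- summary(lm(Fertility ~ ., data = swiss[, vars]))$r.squared
```

The shares sum to the full-model R^2 (r2_full). Each regressor requires a pass over all 2^(p-1) subsets of the other regressors, so the work roughly doubles with every added regressor, and bootstrapping repeats it for every resample.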
There are two versions of this package. The version on CRAN is globally licensed under GPL version 2 (or later).
There is an extended version with the interesting additional metric
pmvd that is licensed according to GPL version 2
under the geographical restriction "outside of the US" because of potential issues with US patent 6,640,204. This version can be obtained
from Ulrike Groemping's website (cf. references section). Whenever you load the package, a startup message tells you which version you are loading.
Ulrike Groemping, BHT Berlin
Chevan, A. and Sutherland, M. (1991) Hierarchical Partitioning. The American Statistician 45, 90–96.
Darlington, R.B. (1968) Multiple regression in psychological research and practice. Psychological Bulletin 69, 161–182.
Feldman, B. (2005) Relative Importance and Value. Manuscript (Version 1.1, March 19 2005), downloadable at http://www.prismanalytics.com/docs/RelativeImportance050319.pdf
Genizi, A. (1993) Decomposition of R^2 in multiple regression with correlated regressors. Statistica Sinica 3, 407–420. Downloadable at http://www3.stat.sinica.edu.tw/statistica/password.asp?vol=3&num=2&art=10
Groemping, U. (2006) Relative Importance for Linear Regression in R: The Package relaimpo. Journal of Statistical Software 17, Issue 1. Downloadable at http://www.jstatsoft.org/v17/i01
Lindeman, R.H., Merenda, P.F. and Gold, R.Z. (1980) Introduction to Bivariate and Multivariate Analysis, Glenview IL: Scott, Foresman.
Zuber, V. and Strimmer, K. (2010) Variable importance and model selection by decorrelation. Preprint, downloadable at http://www.uni-leipzig.de/strimmer/lab/publications/preprints/carscore2010.pdf
Go to http://prof.beuth-hochschule.de/groemping/relaimpo/ for further information and references.
classesmethods.relaimpo, package hier.part, package survey