Uses flexible parametric additive models (see areg and its use of regression splines) to determine how well each variable can be predicted from the remaining variables. Variables are dropped in a stepwise fashion, removing the most predictable variable at each step. The remaining variables are used to predict. The process continues until no variable still in the list of predictors can be predicted with an R^2 or adjusted R^2 of at least r2, or until dropping the variable with the highest R^2 (adjusted or ordinary) would cause a variable that was dropped earlier to no longer be predicted at least at the r2 level from the now smaller list of predictors.
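The stepwise rule described above can be sketched in base R. This is a simplified illustration only, not the package's implementation: it assumes ordinary linear models (lm) in place of the flexible additive models fitted by areg, it handles numeric variables only, and simple_redun is a hypothetical helper name introduced here.

```r
# Simplified sketch: ordinary least squares stands in for areg's additive models.
# simple_redun is a hypothetical helper, not part of Hmisc.
simple_redun <- function(d, r2 = 0.9) {
  # R^2 for predicting column `target` from the columns named in `preds`
  rsq <- function(target, preds) {
    summary(lm(reformulate(preds, response = target), data = d))$r.squared
  }
  kept <- names(d)
  dropped <- character(0)
  while (length(kept) > 1) {
    # How well can each remaining variable be predicted from the others?
    r <- sapply(kept, function(v) rsq(v, setdiff(kept, v)))
    if (max(r) < r2) break                 # nothing left reaches the cutoff
    candidate <- names(r)[which.max(r)]    # most predictable variable
    remaining <- setdiff(kept, candidate)
    # Variables dropped earlier must still be predictable at the r2 level
    # from the now smaller list of predictors
    still_ok <- all(vapply(dropped,
                           function(v) rsq(v, remaining) >= r2,
                           logical(1)))
    if (!still_ok) break
    dropped <- c(dropped, candidate)
    kept <- remaining
  }
  list(dropped = dropped, kept = kept)
}

# Simulated data: x3 and x4 are near-linear combinations of x1 and x2
set.seed(1)
n <- 100
d <- data.frame(x1 = runif(n), x2 = runif(n))
d$x3 <- d$x1 + d$x2 + runif(n) / 10
d$x4 <- d$x1 + d$x2 + d$x3 + runif(n) / 10
res <- simple_redun(d, r2 = 0.8)
print(res)
```

Because the sketch fits only linear relationships, it can miss redundancy that the spline-based models would detect, so its results need not match those of redun.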
formula 
a formula. Enclose a variable in 
data 
a data frame 
subset 
usual subsetting expression 
r2 
ordinary or adjusted R^2 cutoff for redundancy 
type 
specify 
nk 
number of knots to use for continuous variables. Use

tlinear 
set to 
allcat 
set to 
minfreq 
For a binary or categorical variable, there must be at
least two categories with at least 
iterms 
set to 
pc 
if 
pr 
set to 
... 
arguments to pass to 
x 
an object created by 
digits 
number of digits to which to round R^2 values when printing 
long 
set to 
A categorical variable is deemed redundant if a linear combination of dummy variables representing it can be predicted from a linear combination of other variables. For example, if there were 4 cities in the data and each city's rainfall was also present as a variable, with virtually the same rainfall reported for all observations for a city, city would be redundant given rainfall (or vice versa; the one declared redundant would be the first one in the formula). If two cities had the same rainfall, city might be declared redundant even though tied cities might be deemed nonredundant in another setting. To ensure that all categories may be predicted well from other variables, use the allcat option. To ignore categories that are too infrequent or too frequent, set minfreq to a nonzero integer. When the number of observations in the category is below this number or the number of observations not in the category is below this number, no attempt is made to predict observations being in that category individually for the purpose of redundancy detection.
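The frequency rule for individual categories can be illustrated with a short base R sketch. It is an assumption-laden illustration rather than the package's code: eligible_levels is a hypothetical helper, and the factor f is made-up data.

```r
# Hypothetical helper: which levels of a factor would be considered individually
# under a minimum-frequency rule like the one described for minfreq?
eligible_levels <- function(f, minfreq) {
  counts <- table(f)
  n <- sum(counts)
  # A level qualifies only if both the category and its complement
  # contain at least `minfreq` observations
  ok <- as.vector(counts >= minfreq & (n - counts) >= minfreq)
  names(counts)[ok]
}

# Made-up factor: "c" is rare, so it fails a cutoff of 10
f <- factor(c(rep("a", 50), rep("b", 45), rep("c", 5)))
lev10 <- eligible_levels(f, 10)
lev0  <- eligible_levels(f, 0)
print(lev10)
print(lev0)
```

With a cutoff of 0 every level qualifies, which corresponds to applying no frequency filter at all.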
an object of class "redun"
Frank Harrell
Department of Biostatistics
Vanderbilt University
fh@fharrell.com
areg, dataframeReduce, transcan, varclus, subselect::genetic
library(Hmisc)  # provides redun
set.seed(1)
n <- 100
x1 <- runif(n)
x2 <- runif(n)
x3 <- x1 + x2 + runif(n)/10
x4 <- x1 + x2 + x3 + runif(n)/10
x5 <- factor(sample(c('a','b','c'), n, replace=TRUE))
x6 <- 1*(x5=='a' | x5=='c')
redun(~x1+x2+x3+x4+x5+x6, r2=.8)
redun(~x1+x2+x3+x4+x5+x6, r2=.8, minfreq=40)
redun(~x1+x2+x3+x4+x5+x6, r2=.8, allcat=TRUE)
# x5 is no longer redundant but x6 is

Loading required package: lattice
Loading required package: survival
Loading required package: Formula
Loading required package: ggplot2
Attaching package: 'Hmisc'
The following objects are masked from 'package:base':
format.pval, units
Redundancy Analysis
redun(formula = ~x1 + x2 + x3 + x4 + x5 + x6, r2 = 0.8)
n: 100 p: 6 nk: 3
Number of NAs: 0
Transformation of target variables forced to be linear
Rsquared cutoff: 0.8 Type: ordinary
R^2 with which each variable can be predicted from all other variables:
   x1    x2    x3    x4    x5    x6
0.994 0.995 0.998 0.999 1.000 1.000
Rendundant variables:
x5 x4 x3
Predicted from variables:
x1 x2 x6
  Variable Deleted   R^2 R^2 after later deletions
1               x5 1.000                       1 1
2               x4 0.999                     0.997
3               x3 0.995
Redundancy Analysis
redun(formula = ~x1 + x2 + x3 + x4 + x5 + x6, r2 = 0.8, minfreq = 40)
n: 100 p: 4 nk: 3
Number of NAs: 0
Transformation of target variables forced to be linear
Minimum category frequency required for retention of a binary or
categorical variable: 40
Binary or categorical variables removed because of inadequate frequencies:
x5 x6
Rsquared cutoff: 0.8 Type: ordinary
R^2 with which each variable can be predicted from all other variables:
   x1    x2    x3    x4
0.994 0.994 0.998 0.999
Rendundant variables:
x4 x3
Predicted from variables:
x1 x2
  Variable Deleted   R^2 R^2 after later deletions
1               x4 0.999                     0.997
2               x3 0.995
Redundancy Analysis
redun(formula = ~x1 + x2 + x3 + x4 + x5 + x6, r2 = 0.8, allcat = TRUE)
n: 100 p: 6 nk: 3
Number of NAs: 0
Transformation of target variables forced to be linear
All levels of a categorical variable had to be redundant before the
variable was declared redundant
Rsquared cutoff: 0.8 Type: ordinary
R^2 with which each variable can be predicted from all other variables:
   x1    x2    x3    x4    x5    x6
0.994 0.995 0.998 0.999 0.313 1.000
(For categorical variables the minimum R^2 for any sufficiently
frequent dummy variable is displayed)
Rendundant variables:
x6 x4 x3
Predicted from variables:
x1 x2 x5
  Variable Deleted   R^2 R^2 after later deletions
1               x6 1.000                       1 1
2               x4 0.999                     0.997
3               x3 0.995