All-subsets regression for linear models estimated by ordinary least squares (OLS).
lmSubsets(formula, ...)
## Default S3 method:
lmSubsets(formula, data, subset, weights, na.action,
model = TRUE, x = FALSE, y = FALSE, contrasts = NULL, offset, ...)
lmSubsets_fit(x, y, weights = NULL, offset = NULL,
include = NULL, exclude = NULL, nmin = NULL, nmax = NULL,
tolerance = 0, pradius = NULL, nbest = 1, ..., .algo = "phbba")
formula, data, subset, weights, na.action, model, contrasts, offset |
Standard formula interface. |
x, y |
The model matrix and response. |
include, exclude |
Force regressors in or out. |
nmin, nmax |
Minimum and maximum number of regressors. |
tolerance |
Vector of tolerances. |
pradius |
Preordering radius. |
nbest |
Number of best subsets. |
... |
Ignored. |
.algo |
Internal use. |
The generic lmSubsets computes all-variable-subsets regression for
ordinary linear models. It provides various methods to conveniently
specify the regressor and response variables. The standard formula
interface (see lm) can be used, or the information can be extracted
from an already fitted lm object. The regressor matrix and response
variable can also be passed in directly.
The method computes the nbest best subset models for every subset
size, where the "best" models are the models with the lowest residual
sum of squares (RSS). The scope of the search can be limited to
certain subset sizes by setting nmin and nmax. A tolerance vector
(expanded if necessary) may be specified to speed up the algorithm,
where tolerance[n] is the tolerance applied to subset models of
size n.
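To make the selection criterion concrete, the following base-R sketch enumerates every subset of each size and keeps the nbest lowest-RSS models. It is a brute-force illustration only, not the algorithm used by lmSubsets; the helper name best_subsets_bruteforce and the choice of the built-in mtcars data are assumptions made for this example, and an intercept is always included.

```r
## Brute-force best-subset search by RSS (illustration only, base R).
## best_subsets_bruteforce is a hypothetical helper, not part of lmSubsets.
best_subsets_bruteforce <- function(x, y, nmin = 1, nmax = ncol(x), nbest = 1) {
  out <- list()
  for (size in nmin:nmax) {
    ## all regressor combinations of the current size, one per column
    combos <- combn(ncol(x), size)
    ## RSS of an OLS fit (with intercept) for each combination
    rss <- apply(combos, 2, function(idx) {
      fit <- lm.fit(cbind(1, x[, idx, drop = FALSE]), y)
      sum(fit$residuals^2)
    })
    ## keep the nbest combinations with the lowest RSS
    keep <- order(rss)[seq_len(min(nbest, length(rss)))]
    out[[as.character(size)]] <- data.frame(
      size = size,
      rss  = rss[keep],
      vars = apply(combos[, keep, drop = FALSE], 2,
                   function(idx) paste(colnames(x)[idx], collapse = " + "))
    )
  }
  do.call(rbind, out)
}

## usage on a built-in data set: two best models for sizes 1 to 3
x <- as.matrix(mtcars[, c("cyl", "disp", "hp", "wt", "qsec")])
y <- mtcars$mpg
res <- best_subsets_bruteforce(x, y, nmin = 1, nmax = 3, nbest = 2)
print(res)
```

Exhaustive enumeration grows exponentially with the number of regressors, which is why restricting the search with nmin and nmax, or relaxing it with a tolerance, matters for larger problems.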
By way of include and exclude, variables may be forced into or out of
the regression, respectively.
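The effect of forcing variables in or out can be pictured as a restriction on the candidate subsets. The base-R sketch below lists the subsets of a given size that remain once such constraints are applied. The helper candidate_subsets and the variable names x1 and x2 are hypothetical and serve only as illustration; whether lmSubsets counts forced-in variables toward the subset size in the same way is an assumption here.

```r
## Enumerate candidate subsets of a given size under include/exclude
## constraints (illustration only; candidate_subsets is hypothetical).
candidate_subsets <- function(vars, size, include = NULL, exclude = NULL) {
  stopifnot(all(include %in% vars), all(exclude %in% vars),
            !any(include %in% exclude))
  ## variables that are neither forced in nor forced out
  free <- setdiff(vars, c(include, exclude))
  ## number of free variables needed to reach the requested size
  k <- size - length(include)
  if (k < 0 || k > length(free)) return(list())
  if (k == 0) return(list(include))
  lapply(combn(free, k, simplify = FALSE), function(s) c(include, s))
}

## x1 and x2 are made-up names standing in for further regressors
vars  <- c("noncauc", "whitecollar", "x1", "x2")
cands <- candidate_subsets(vars, size = 2,
                           include = "noncauc", exclude = "whitecollar")
print(cands)
```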
The function will preorder the variables to reduce execution time if
pradius > 0. Good execution times are usually attained for
approximately pradius = n/3 (default value), where n is the number of
regressors after evaluating include and exclude.
A set of standard extractor functions for fitted model objects is
available for objects of class "lmSubsets". See methods for more
details.
An object of class "lmSubsets", i.e. a list with the following
components:
nobs |
Number of observations. |
nvar |
Number of variables. |
weights |
Weights vector. |
offset |
Offset component. |
intercept |
|
include |
Included regressors. |
exclude |
Excluded regressors. |
nmin, nmax |
Minimum and maximum subset sizes. |
tolerance |
Tolerance vector. |
nbest |
Number of best subsets. |
df |
Degrees of freedom. |
rss |
Residual sum of squares. |
which |
Selected variables. |
Hofmann M, Gatu C, Kontoghiorghes EJ (2007). Efficient Algorithms for Computing the Best Subset Regression Models for Large-Scale Problems. Computational Statistics & Data Analysis, 52, 16–29.
Gatu C, Kontoghiorghes EJ (2006). Branch-and-Bound Algorithms for Computing the Best Subset Regression Models. Journal of Computational and Graphical Statistics, 15, 139–156.
## load data (with logs for relative potentials)
data("AirPollution", package = "lmSubsets")
#################
## basic usage ##
#################
## canonical example: fit all subsets
all.AirPoll <- lmSubsets(mortality ~ ., data = AirPollution, nbest = 10)
## visualize RSS
plot(all.AirPoll)
## summarize
summary(all.AirPoll)
## forced inclusion/exclusion of variables
all_2.AirPoll <- lmSubsets(all.AirPoll, include = "noncauc",
exclude = "whitecollar")
summary(all_2.AirPoll)