View source: R/family.basics.R
trim.constraints | R Documentation |
Deletes statistically nonsignificant regression coefficients via their constraint matrices, for future refitting.
trim.constraints(object, sig.level = 0.05, max.num = Inf,
                 intercepts = TRUE, ...)
object |
Some VGAM object, especially having
class |
sig.level |
Significance levels, with values in |
max.num |
Numeric, positive and integer-valued.
Maximum number of regression coefficients allowable for deletion.
This allows one to limit the number of deleted coefficients.
For example,
if |
intercepts |
Logical. Trim the intercept term?
If |
... |
Unused but for provision in the future. |
This utility function is intended to simplify an existing vglm object having variables (terms) that affect unnecessary parameters. Suppose the explanatory variables in the formula include a simple numeric covariate called x2. This variable will affect every linear predictor if zero = NULL in the VGAM family function. The constraint matrices may then have unnecessary columns, namely those whose regression coefficients are statistically nonsignificant. This function attempts to delete those columns and return a possibly simplified list of constraint matrices, which makes refitting a simpler model easy. P-values obtained from summaryvglm (with HDEtest = FALSE for increased speed) are compared to sig.level to test for statistical significance.
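As a rough illustration of the idea (not the VGAM implementation), the base-R sketch below takes M = 2 linear predictors and a covariate x2 whose constraint matrix is diag(2), then drops the column whose p-value is not below sig.level. The helper trim_columns and all p-values are hypothetical. The strict "<" comparison and the dropping of a term with no significant column are assumptions made only for this example.

```r
# Hypothetical illustration only: trim_columns() is NOT part of VGAM.
# Each constraint matrix has M rows; each column carries one regression
# coefficient. pvals[[term]] holds one made-up p-value per column.
trim_columns <- function(clist, pvals, sig.level = 0.05) {
  out <- list()
  for (term in names(clist)) {
    keep <- pvals[[term]] < sig.level   # assumed rule: keep if p < sig.level
    if (any(keep))
      out[[term]] <- clist[[term]][, keep, drop = FALSE]
    # Assumption: a term with no significant column is omitted entirely.
  }
  out
}

M <- 2
clist <- list("(Intercept)" = diag(M), x2 = diag(M))
pvals <- list("(Intercept)" = c(0.001, 0.002), x2 = c(0.01, 0.40))

tclist <- trim_columns(clist, pvals)
tclist$x2   # a 2 x 1 matrix: x2 now affects the first linear predictor only
```

The result keeps the same named-list-of-matrices shape as the input, which is the form a "term"-type constraints list takes.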
For terms that generate more than one column of the "lm" model matrix, such as bs and poly, the column is deleted if all regression coefficients are statistically nonsignificant. Incidentally, users should instead use sm.bs, sm.ns, sm.poly, etc., for smart and safe prediction.
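One reading of the multi-column rule above can be sketched in base R as follows. This is an assumption-laden illustration, not VGAM code: the p-value matrix is made up, and it supposes that each column of the term's constraint matrix is associated with one coefficient per column of the "lm" model matrix that the term generates.

```r
# Hypothetical p-values for a term generating 3 LM columns, with M = 2.
# Rows: columns of the LM model matrix generated by the term.
# Columns: columns of the term's constraint matrix.
sig.level <- 0.05
pmat <- matrix(c(0.30, 0.01, 0.20,    # coefficients tied to constraint column 1
                 0.60, 0.45, 0.90),   # coefficients tied to constraint column 2
               nrow = 3,
               dimnames = list(paste0("bs", 1:3), c("col1", "col2")))

# Keep a constraint column if at least one of its coefficients is significant;
# equivalently, delete it only when all of them are nonsignificant.
keep <- apply(pmat < sig.level, 2, any)

H <- diag(2)                          # constraint matrix before trimming
H.trimmed <- H[, keep, drop = FALSE]  # col2 goes: all its p-values >= sig.level
```

Note that col1 survives even though two of its three coefficients are nonsignificant, so a refitted model can still contain individual nonsignificant coefficients for such terms.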
One can think of this function as facilitating backward elimination for variable selection, especially if max.num = 1 and M = 1; however, more than one regression coefficient is usually deleted here by default.
A list of possibly simpler constraint matrices that can be fed back into the model using the constraints argument (usually zero = NULL is needed to avoid a warning). Consequently, they are required to be of the "term"-type. After the model is refitted, applying summaryvglm should result in regression coefficients that are ‘all’ statistically significant.
This function has not been tested thoroughly.
One extreme is that a term is totally deleted because
none of its regression coefficients are needed,
and that situation has not yet been finalized.
Ideally, object only contains terms where at least one regression coefficient has a p-value less than sig.level. For ordered factors and other situations, deleting certain columns may not make sense and may destroy interpretability.
As stated above, max.num may not work properly when there are terms that generate more than one column of the LM model matrix. However, this limitation may change in the future.
This function is experimental and may be replaced by some other function in the future. It does not use S4 object-oriented programming but may be converted to do so in the future.
T. W. Yee
constraints, vglm, summaryvglm, model.matrixvlm, drop1.vglm, step4vglm, sm.bs, sm.ns, sm.poly.
## Not run: 
library("VGAM")  # provides vglm(), binom2.or(), trim.constraints()
data("xs.nz", package = "VGAMdata")
fit1 <-
  vglm(cbind(worry, worrier) ~ bs(age) + sex + ethnicity + cat + dog,
       binom2.or(zero = NULL), data = xs.nz, trace = TRUE)
summary(fit1, HDEtest = FALSE)  # 'cat' is not significant at all
dim(constraints(fit1, matrix = TRUE))
(tclist1 <- trim.constraints(fit1))  # No 'cat'
fit2 <-  # Delete 'cat' manually from the formula:
  vglm(cbind(worry, worrier) ~ bs(age) + sex + ethnicity + dog,
       binom2.or(zero = NULL), data = xs.nz,
       constraints = tclist1, trace = TRUE)
summary(fit2, HDEtest = FALSE)  # A simplified model
dim(constraints(fit2, matrix = TRUE))  # Fewer regression coefficients

## End(Not run)