Data-dependent parameters in formula terms
can cause problems when predicting.
The smartpred package saves
data-dependent parameters on the object so that the bug is fixed.
The lm and glm functions have
been fixed properly. Note that the VGAM package by T. W. Yee
automatically comes with smart prediction.
R version 1.6.0 introduced a fix for the prediction
problem, but the fix is only partial because it does not work all the time,
e.g., for nested terms such as
ns(scale(x2), df = 5).
See the examples below.
Smart prediction, however, will always work.
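The difference between a simple term and a nested term can be checked numerically in base R. The following is a minimal sketch, assuming only the stats and splines packages; the object names (sub, fit0, fit1, fit2, ctr, sdv) are illustrative. It predicts on a subset of the training rows, where correct predictions must reproduce the fitted values, and then shows a by-hand remedy that reuses the training centre and scale.

```r
library(splines)

set.seed(86)
n <- 20
ldata <- data.frame(x2 = sort(runif(n)), y = sort(runif(n)))
sub <- ldata[1:5, ]   # "new" data: the first five training rows

# Simple term: safe prediction stores the knots with the fit, so
# predictions on the subset reproduce the fitted values.
fit0 <- lm(y ~ ns(x2, df = 5), data = ldata)
ok0 <- isTRUE(all.equal(unname(predict(fit0, sub)),
                        unname(fitted(fit0)[1:5])))

# Nested term: scale() is re-evaluated on the subset with a new mean and
# standard deviation, so the predictions should no longer match.
fit1 <- lm(y ~ ns(scale(x2), df = 5), data = ldata)
ok1 <- isTRUE(all.equal(unname(predict(fit1, sub)),
                        unname(fitted(fit1)[1:5])))

# By-hand remedy: keep the training centre and scale, and apply them
# to the new data before predicting.
ctr <- mean(ldata$x2)
sdv <- sd(ldata$x2)
ldata$z <- (ldata$x2 - ctr) / sdv
sub$z <- (sub$x2 - ctr) / sdv
fit2 <- lm(y ~ ns(z, df = 5), data = ldata)
ok2 <- isTRUE(all.equal(unname(predict(fit2, sub)),
                        unname(fitted(fit2)[1:5])))

c(simple = ok0, nested = ok1, by_hand = ok2)
```

Smart prediction is meant to make the by-hand bookkeeping unnecessary by saving the data-dependent parameters on the fitted object.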
The basic idea is that the functions in the formula are now smart, and the
modelling functions make use of these smart functions. Smart prediction
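works in two ways, one of which uses
smart.expression.
The smart functions return the usual value returned by
the corresponding ordinary functions (for example, ns for sm.ns).
When used with modelling functions such as vglm,
the data-dependent parameters are saved on one slot component called
smart.prediction.
Several temporary variables
are created while the model is being fitted.
They are created in a new environment called
smartpredenv.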
These variables are deleted after the model has been fitted.
However, if there is an error in the model fitting function or the model
fitting is killed (e.g., by typing control-C) then these variables will
be left in
smartpredenv. At the beginning of model fitting,
these variables are deleted if present in
smartpredenv.
During prediction, the variables
are reconstructed and read by the smart functions when the model
frame is re-evaluated.
After prediction, these variables are deleted.
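The save-then-read cycle described above can be imitated in a few lines of base R. This is only a toy sketch of the idea, not the VGAM implementation: toy_env, toy_scale, toy_fit and toy_predict are hypothetical names, and the toy handles a single data-dependent term.

```r
# Toy illustration only: toy_env, toy_scale, toy_fit and toy_predict are
# hypothetical names and are not part of VGAM.
toy_env <- new.env()

toy_scale <- function(x) {
  if (identical(toy_env$mode, "read")) {
    p <- toy_env$params                        # prediction: reuse saved values
  } else {
    p <- list(center = mean(x), scale = sd(x)) # fitting: compute and save
    toy_env$params <- p
  }
  (x - p$center) / p$scale
}

toy_fit <- function(formula, data) {
  toy_env$mode <- "write"
  on.exit(rm(list = ls(toy_env), envir = toy_env))  # delete after fitting
  fit <- lm(formula, data = data)
  fit$toy_params <- toy_env$params   # keep the parameters on the object
  fit
}

toy_predict <- function(fit, newdata) {
  toy_env$mode <- "read"
  toy_env$params <- fit$toy_params   # reconstruct from the fitted object
  on.exit(rm(list = ls(toy_env), envir = toy_env))  # delete after prediction
  predict(fit, newdata)
}

set.seed(1)
d <- data.frame(x = runif(30))
d$y <- 2 * d$x + rnorm(30, sd = 0.1)
fit <- toy_fit(y ~ toy_scale(x), d)
pred <- toy_predict(fit, d[1:3, ])
all.equal(unname(pred), unname(fitted(fit)[1:3]))
ls(toy_env)   # nothing is left behind in the environment
```

Because the parameters travel with the fitted object, predictions on new data use the training centre and scale rather than values recomputed from the new data.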
If the modelling function is used with argument
smart = FALSE
(e.g., vglm(..., smart = FALSE)) then smart prediction will not
be used, and the results should match those of the original R functions.
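The original functions, such as ns and scale,
are now left alone (from 2014-05 onwards) and are no longer smart.
They work via safe prediction.
The smart versions of these functions have been renamed and
they begin with
"sm." (for example, sm.ns and sm.scale).
The predict methods for spline basis objects
are not smart.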
That is because they operate on objects that contain attributes only
and do not have list components or slots.
predict.poly is not smart.
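To see what "attributes only" means in practice, the following minimal sketch (assuming only the splines package; the names x, basis and newb are illustrative) inspects a basis returned by ns. The knots and boundary knots are stored as attributes of a matrix, not as list components or slots, and the predict method re-evaluates the basis from those attributes.

```r
library(splines)

set.seed(1)
x <- runif(20)
basis <- ns(x, df = 5)

# The basis is a numeric matrix; its parameters live in attributes.
names(attributes(basis))
attr(basis, "knots")
attr(basis, "Boundary.knots")
is.list(basis)   # not a list, so there are no list components
isS4(basis)      # not an S4 object, so there are no slots

# predict() on the basis reuses the stored attributes at new x values.
newb <- predict(basis, newx = c(0.25, 0.5))
dim(newb)
```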
T. W. Yee and T. J. Hastie
For vgam in VGAM,
sm.ps is important.
Commonly used data-dependent functions include
ns and scale.
Spline functions such as
ns are in the
splines package, and this package is automatically
loaded because it contains compiled code that
these functions call.
# Create some data first
n <- 20
set.seed(86)  # For reproducibility of the random numbers
ldata <- data.frame(x2 = sort(runif(n)), y = sort(runif(n)))
library("splines")  # To get ns() in R

# This will work for R 1.6.0 and later
fit <- lm(y ~ ns(x2, df = 5), data = ldata)
## Not run: 
plot(y ~ x2, data = ldata)
lines(fitted(fit) ~ x2, data = ldata)
new.ldata <- data.frame(x2 = seq(0, 1, len = n))
points(predict(fit, new.ldata) ~ x2, new.ldata, type = "b", col = 2, err = -1)

## End(Not run)

# The following fails for R 1.6.x and later. It can be
# made to work with smart prediction provided
# ns is changed to sm.ns and scale is changed to sm.scale:
fit1 <- lm(y ~ ns(scale(x2), df = 5), data = ldata)
## Not run: 
plot(y ~ x2, data = ldata, main = "Safe prediction fails")
lines(fitted(fit1) ~ x2, data = ldata)
points(predict(fit1, new.ldata) ~ x2, new.ldata, type = "b", col = 2, err = -1)

## End(Not run)

# Fit the above using smart prediction
## Not run: 
library("VGAM")  # The following requires the VGAM package to be loaded
fit2 <- vglm(y ~ sm.ns(sm.scale(x2), df = 5), uninormal, data = ldata)
fit2@smart.prediction  # The saved data-dependent parameters
plot(y ~ x2, data = ldata, main = "Smart prediction")
lines(fitted(fit2) ~ x2, data = ldata)
points(predict(fit2, new.ldata, type = "response") ~ x2, data = new.ldata,
       type = "b", col = 2, err = -1)

## End(Not run)
Loading required package: stats4
Loading required package: splines
[[1]]
[[1]]$center
[1] 0.5209627

[[1]]$scale
[1] 0.2402085

[[1]]$match.call
sm.scale.default(x = x2)


[[2]]
[[2]]$df
[1] 5

[[2]]$knots
       20%        40%        60%        80% 
-1.0399370 -0.3261813  0.2775136  0.9359183 

[[2]]$intercept
[1] FALSE

[[2]]$Boundary.knots
[1] -1.342555  1.587736

[[2]]$match.call
sm.ns(x = sm.scale(x2), df = 5)