# smartpred: Smart Prediction In VGAM: Vector Generalized Linear and Additive Models

## Description

Data-dependent parameters in formula terms can cause problems when predicting. The smartpred package saves data-dependent parameters on the object so that the bug is fixed. The `lm` and `glm` functions have been fixed properly. Note that the VGAM package by T. W. Yee automatically comes with smart prediction.
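The problem can be seen without fitting any model. The following is a minimal sketch using base R only; the data and the variable names (`train`, `new`, `z.naive`, `z.right`) are made up for illustration. Calling `scale` afresh on new data recomputes the centre and scale from the new data, whereas a correct prediction has to reuse the values computed from the training data.

```r
set.seed(1)
train <- data.frame(x = runif(20))
new   <- data.frame(x = c(0.1, 0.5, 0.9))

# Data-dependent parameters computed from the training data
z.train <- scale(train$x)
ctr <- attr(z.train, "scaled:center")
scl <- attr(z.train, "scaled:scale")

# Naive re-evaluation: centre and scale are recomputed from the new data
z.naive <- scale(new$x)

# What prediction needs: the training centre and scale are reused
z.right <- scale(new$x, center = ctr, scale = scl)

# The two versions of the transformed new data disagree
isTRUE(all.equal(c(z.naive), c(z.right)))
```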

## Usage

```
sm.bs(x, df = NULL, knots = NULL, degree = 3, intercept = FALSE,
      Boundary.knots = range(x))
sm.ns(x, df = NULL, knots = NULL, intercept = FALSE,
      Boundary.knots = range(x))
sm.poly(x, ..., degree = 1, coefs = NULL, raw = FALSE)
sm.scale(x, center = TRUE, scale = TRUE)
```

## Arguments

| Argument | Description |
| --- | --- |
| `x` | The `x` argument is actually common to them all. |
| `df, knots, intercept, Boundary.knots` | See `bs` and/or `ns`. |
| `degree, ..., coefs, raw` | See `poly`. |
| `center, scale` | See `scale`. |

## Details

R version 1.6.0 introduced only a partial fix for the prediction problem: it does not work all the time, e.g., for terms such as `I(poly(x, 3))`, `poly(c(scale(x)), 3)`, `bs(scale(x), 3)`, `scale(scale(x))`. See the examples below. Smart prediction, however, will always work.
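A numerical check of this limitation, as a sketch that assumes only base R and the splines package: predictions on a subset of the training data should reproduce the corresponding fitted values. This holds for the plain term `ns(x2, df = 5)` but not for the nested term `ns(scale(x2), df = 5)`, because the inner `scale` is re-evaluated on the subset. The object names below are illustrative.

```r
library(splines)  # Provides ns()

set.seed(86)
n <- 20
ldata <- data.frame(x2 = sort(runif(n)), y = sort(runif(n)))

fit.plain  <- lm(y ~ ns(x2, df = 5), data = ldata)
fit.nested <- lm(y ~ ns(scale(x2), df = 5), data = ldata)

sub <- ldata[1:10, ]  # A subset of the training data

# Plain term: the knots are remembered, so predictions match fitted values
ok.plain <- isTRUE(all.equal(unname(predict(fit.plain, sub)),
                             unname(fitted(fit.plain)[1:10])))

# Nested term: the inner scale() recomputes its centre and scale from 'sub',
# so the predictions no longer match the fitted values
ok.nested <- isTRUE(all.equal(unname(predict(fit.nested, sub)),
                              unname(fitted(fit.nested)[1:10])))

c(plain = ok.plain, nested = ok.nested)
```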

The basic idea is that the functions in the formula are now smart, and the modelling functions make use of these smart functions. Smart prediction works in two ways: using `smart.expression`, or using a combination of `put.smart` and `get.smart`.
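The put/get idea can be sketched in base R as follows. This is a toy illustration only, not the VGAM implementation: the environment `toy`, its fields and the function `toy.scale` are all hypothetical names invented for this sketch. In "write" mode each call stores its data-dependent parameters in order; in "read" mode each call retrieves its parameters in the same order instead of recomputing them.

```r
# Hypothetical stand-in for the environment used during fitting and prediction
toy <- new.env()
toy$mode    <- "neutral"   # One of "neutral", "write" or "read"
toy$store   <- list()      # Saved data-dependent parameters
toy$counter <- 0           # Index of the current smart call

toy.scale <- function(x) {
  force(x)  # Evaluate nested calls first so that the ordering is consistent
  if (toy$mode == "read") {
    toy$counter <- toy$counter + 1
    p <- toy$store[[toy$counter]]      # "get": reuse the saved parameters
    return((x - p$center) / p$scale)
  }
  p <- list(center = mean(x), scale = sd(x))
  if (toy$mode == "write") {
    toy$counter <- toy$counter + 1
    toy$store[[toy$counter]] <- p      # "put": save the parameters
  }
  (x - p$center) / p$scale
}

# "Fitting": evaluate the term in write mode and keep the parameters
x.train <- c(1, 2, 4, 7, 11)
toy$mode <- "write"; toy$store <- list(); toy$counter <- 0
z.train <- toy.scale(toy.scale(x.train))
saved <- toy$store
toy$mode <- "neutral"

# "Prediction": re-evaluate the same term on new data in read mode
toy$mode <- "read"; toy$store <- saved; toy$counter <- 0
z.new <- toy.scale(toy.scale(x.train[1:2]))
toy$mode <- "neutral"

# The predicted values agree with the corresponding training values
isTRUE(all.equal(z.new, z.train[1:2]))
```

Because the parameters are matched by call order rather than by name, the approach in this sketch works for arbitrarily nested terms, provided the term is evaluated in the same order at fitting and at prediction time.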

## Value

The usual value returned by `bs`, `ns`, `poly` and `scale`. When used with functions such as `vglm` the data-dependent parameters are saved on one slot component called `smart.prediction`.

## Side Effects

The variables `.max.smart`, `.smart.prediction` and `.smart.prediction.counter` are created while the model is being fitted. They are created in a new environment called `smartpredenv`. These variables are deleted after the model has been fitted. However, if there is an error in the model fitting function or the model fitting is interrupted (e.g., by typing control-C) then these variables will be left in `smartpredenv`. At the beginning of model fitting, these variables are deleted if present in `smartpredenv`.

During prediction, the variables `.smart.prediction` and `.smart.prediction.counter` are reconstructed and read by the smart functions when the model frame is re-evaluated. After prediction, these variables are deleted.

If the modelling function is used with argument `smart = FALSE` (e.g., `vglm(..., smart = FALSE)`) then smart prediction will not be used, and the results should match those of the original R functions.

## WARNING

The functions `bs`, `ns`, `poly` and `scale` are now left alone (from 2014-05 onwards) and no longer smart. They work via safe prediction. The smart versions of these functions have been renamed and they begin with `"sm."`.

The functions `predict.bs` and `predict.ns` are not smart. That is because they operate on objects that contain attributes only and do not have list components or slots. The function `predict.poly` is not smart.
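To see what "attributes only" means, the following sketch (assuming only the splines package; the object names are illustrative) inspects a `bs` basis. The knots and related quantities are stored as attributes of the basis matrix, and `predict` uses them to evaluate the same basis at new values.

```r
library(splines)  # Provides bs() and its predict method

x <- seq(0, 1, length.out = 20)
basis <- bs(x, df = 5)

# The data-dependent quantities are held as attributes of the matrix
names(attributes(basis))
attr(basis, "knots")

# predict() re-evaluates the basis at new x values using those attributes
basis.new <- predict(basis, newx = c(0.25, 0.5))
dim(basis.new)
```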

## Author(s)

T. W. Yee and T. J. Hastie

## See Also

`get.smart.prediction`, `get.smart`, `put.smart`, `smart.expression`, `smart.mode.is`, `setup.smart`, `wrapup.smart`. For `vgam` in VGAM, `sm.ps` is important. Commonly used data-dependent functions include `scale`, `poly`, `bs`, `ns`. In R, the functions `bs` and `ns` are in the splines package, which is loaded automatically because it contains compiled code that `bs` and `ns` call.

The functions `vglm`, `vgam`, `rrvglm` and `cqo` in T. W. Yee's VGAM package are examples of modelling functions that employ smart prediction.

## Examples

```
# Create some data first
n <- 20
set.seed(86)  # For reproducibility of the random numbers
ldata <- data.frame(x2 = sort(runif(n)), y = sort(runif(n)))
library("splines")  # To get ns() in R

# This will work for R 1.6.0 and later
fit <- lm(y ~ ns(x2, df = 5), data = ldata)
## Not run:
plot(y ~ x2, data = ldata)
lines(fitted(fit) ~ x2, data = ldata)
new.ldata <- data.frame(x2 = seq(0, 1, length.out = n))
points(predict(fit, new.ldata) ~ x2, new.ldata, type = "b", col = 2, err = -1)
## End(Not run)

# The following fails for R 1.6.x and later. It can be
# made to work with smart prediction provided
# ns is changed to sm.ns and scale is changed to sm.scale:
fit1 <- lm(y ~ ns(scale(x2), df = 5), data = ldata)
## Not run:
plot(y ~ x2, data = ldata, main = "Safe prediction fails")
lines(fitted(fit1) ~ x2, data = ldata)
points(predict(fit1, new.ldata) ~ x2, new.ldata, type = "b", col = 2, err = -1)
## End(Not run)

# Fit the above using smart prediction
## Not run:
library("VGAM")  # The following requires the VGAM package to be loaded
fit2 <- vglm(y ~ sm.ns(sm.scale(x2), df = 5), uninormal, data = ldata)
fit2@smart.prediction
plot(y ~ x2, data = ldata, main = "Smart prediction")
lines(fitted(fit2) ~ x2, data = ldata)
points(predict(fit2, new.ldata, type = "response") ~ x2, data = new.ldata,
       type = "b", col = 2, err = -1)
## End(Not run)
```

### Example output

```
Loading required package: stats4
[[1]]
[[1]]$center
[1] 0.5209627

[[1]]$scale
[1] 0.2402085

[[1]]$match.call
sm.scale.default(x = x2)


[[2]]
[[2]]$df
[1] 5

[[2]]$knots
       20%        40%        60%        80%
-1.0399370 -0.3261813  0.2775136  0.9359183

[[2]]$intercept
[1] FALSE

[[2]]$Boundary.knots
[1] -1.342555  1.587736

[[2]]$match.call
sm.ns(x = sm.scale(x2), df = 5)
```

VGAM documentation built on Jan. 16, 2021, 5:21 p.m.