# penReg: Function to fit penalised regression In gamlss.util: GAMLSS Utilities

## Description

The function `penReg()` can be used to fit a P-spline. It can be used as demonstration of how the penalised B-splines can be fitted to one explanatory variable. For more that one explanatory variables use the function `pb()` in gamlss. The function `penRegQ()` is similar to the function `penReg()` but it estimates the "random effect" sigmas using the Q-function (marginal likelihood). The Q-function estimation takes longer but it has the advantage that standard errors are provided for \log σ_e and \log σ_b, where the sigmas are the standard errors for the response and the random effects respectively. The function `pbq()` is a smoother within GAMLSS and should give identical results to the additive function `pb()`. The function `gamlss.pbq` is not for use.

## Usage

 ```1 2 3 4 5 6 7 8 9``` ```penReg(y, x, weights = rep(1, length(y)), df = NULL, lambda = NULL, start = 10, inter = 20, order = 2, degree = 3, plot = FALSE, method = c("ML", "ML-1", "GAIC", "GCV", "EM"), k = 2, ...) penRegQ(y, x, weights = rep(1, length(y)), order = 2, start = 10, plot = FALSE, lambda = NULL, inter = 20, degree = 3, optim.proc = c("nlminb", "optim"), optim.control = NULL) pbq(x, control = pbq.control(...), ...) gamlss.pbq(x, y, w, xeval = NULL, ...) ```

## Arguments

 `y` the response variable `x` the unique explanatory variable `weights` prior weights `w` weights in the iretation withing GAMLSS `df` effective degrees of freedom `lambda` the smoothing parameter `start` the lambda starting value if the local methods are used `inter` the no of break points (knots) in the x-axis `order` the required difference in the vector of coefficients `degree` the degree of the piecewise polynomial `plot` whether to plot the data and the fitted function `method` The method used in the (local) performance iterations. Available methods are "ML", "ML-1", "EM", "GAIC" and "GCV" `k` the penalty used in "GAIC" and "GCV" `optim.proc` which function to be use to optimise the Q-function, options are `c("nlminb", "optim")` `optim.control` options for the optimisation procedures `control` arguments for the fitting function. It takes one two: i) `order` the order of the B-spline and `plot` whether to plot the data and fit. `xeval` this is use for prediction `...` for extra arguments

## Value

Returns a fitted object of class `penReg`. The object contains 1) the fitted `coefficients` 2) the `fitted.values` 3) the response variable `y`, 4) the label of the response variable `ylabel` 5) the explanatory variable `x`, 6) the lebel of the explanatory variable 7) the smoothing parameter `lambda`, 8) the effective degrees of freedom `df`, 9) the estimete for sigma `sigma`, 10) the residual sum of squares `rss`, 11) the Akaike information criterion `aic`, 12) the Bayesian information criterion `sbc` and 13) the `deviance`

## Author(s)

Mikis Stasinopoulos [email protected], Bob Rigby [email protected] and Paul Eilers

## References

Eilers, P. H. C. and Marx, B. D. (1996). Flexible smoothing with B-splines and penalties (with comments and rejoinder). Statist. Sci, 11, 89-121.

Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.

Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, http://www.jstatsoft.org/v23/i07.

## Examples

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63``` ```set.seed(1234) x <- seq(0,10,length=200); y<-(yt<-1+2*x+.6*x^2-.1*x^3)+rnorm(200, 4) library(gamlss) #------------------ # df fixed g1<-gamlss(y~pb(x, df=4)) m1<-penReg(y,x, df=4) cbind(g1\$mu.coefSmo[[1]]\$lambda, m1\$lambda) cbind(g1\$mu.df, m1\$edf) cbind(g1\$aic, m1\$aic) cbind(fitted(g1), fitted(m1))[1:10,] # identical #------------------ # estimate lambda using ML g2<-gamlss(y~pb(x)) m2<-penReg(y,x) cbind(g2\$mu.df, m2\$edf) cbind(g2\$mu.lambda, m2\$lambda) cbind(g2\$aic, m2\$aic) # different lambda cbind(fitted(g2), fitted(m2))[1:10,] # identical #------------------ # estimate lambda using GCV g3 <- gamlss(y~pb(x, method="GCV")) m3 <- penReg(y,x, method="GCV") cbind(g3\$mu.df, m3\$edf) cbind(g3\$mu.lambda, m3\$lambda) cbind(g3\$aic, m3\$aic) cbind(fitted(g3), fitted(m3))[1:10,] # almost identical #------------------ # estimate lambda using GAIC(#=3) g4<-gamlss(y~pb(x, method="GAIC", k=3)) m4<-penReg(y,x, method="GAIC", k=3) cbind(g4\$mu.df, m4\$edf ) cbind(g4\$mu.lambda, m4\$lambda) cbind(g4\$aic, m4\$aic) cbind(g4\$mu.df, m4\$df) cbind(g4\$mu.lambda, m4\$lambda) cbind(fitted(g4), fitted(m4))[1:10,] #------------------- plot(y~x) lines(fitted(m1)~x, col="green") lines(fitted(m2)~x, col="red") lines(fitted(m3)~x, col="blue") lines(fitted(m4)~x, col="yellow") lines(fitted(m4)~x, col="grey") # using the Q function # the Q-function takes longer system.time(g6<-gamlss(y~pbq(x))) system.time(g61<-gamlss(y~pb(x))) AIC(g6, g61) # system.time(m6<-penRegQ(y,x)) system.time(m61<-penReg(y,x)) AIC(m6, m61) cbind(g6\$mu.df, g61\$mu.df,m6\$edf, m61\$edf) cbind(g6\$mu.lambda,g61\$mu.lambda, m6\$lambda, m61\$lambda) cbind(g6\$aic, AIC(g6), m6\$aic, AIC(m6), m61\$aic, AIC(m61)) cbind(fitted(g6), fitted(m6))[1:10,] ```

### Example output

```Loading required package: gamlss.dist
**********   GAMLSS Version 5.0-2  **********
For more on GAMLSS look at http://www.gamlss.org/
Type gamlssNews() to see new features/changes/bug fixes.

Attaching package: 'zoo'

The following objects are masked from 'package:base':

as.Date, as.Date.numeric

GAMLSS-RS iteration 1: Global Deviance = 588.0129
GAMLSS-RS iteration 2: Global Deviance = 588.0127
[,1]     [,2]
[1,] 29.00139 32.12168
[,1] [,2]
[1,]    6    6
[,1]     [,2]
[1,] 602.0127 602.0127
[,1]     [,2]
[1,] 4.410963 4.410963
[2,] 4.550483 4.550483
[3,] 4.690062 4.690062
[4,] 4.829716 4.829716
[5,] 4.969463 4.969463
[6,] 5.109318 5.109318
[7,] 5.249299 5.249299
[8,] 5.389422 5.389422
[9,] 5.529704 5.529704
[10,] 5.670163 5.670163
GAMLSS-RS iteration 1: Global Deviance = 554.7323
GAMLSS-RS iteration 2: Global Deviance = 554.7323
[,1]    [,2]
[1,] 11.8014 11.8014
[,1]      [,2]
[1,] 0.9241998 0.8667171
[,1]     [,2]
[1,] 580.3351 580.3351
[,1]     [,2]
[1,] 4.522962 4.522961
[2,] 4.656743 4.656743
[3,] 4.791038 4.791038
[4,] 4.925999 4.925999
[5,] 5.061780 5.061780
[6,] 5.198533 5.198533
[7,] 5.336412 5.336412
[8,] 5.475569 5.475569
[9,] 5.616157 5.616157
[10,] 5.758269 5.758269
GAMLSS-RS iteration 1: Global Deviance = 564.2135
GAMLSS-RS iteration 2: Global Deviance = 564.2135
[,1]     [,2]
[1,] 8.316444 10.74549
[,1]     [,2]
[1,] 5.973202 1.473257
[,1]     [,2]
[1,] 582.8464 580.4273
[,1]     [,2]
[1,] 4.548077 4.526789
[2,] 4.680495 4.662723
[3,] 4.812954 4.798888
[4,] 4.945465 4.935355
[5,] 5.078042 5.072192
[6,] 5.210699 5.209470
[7,] 5.343449 5.347257
[8,] 5.476305 5.485623
[9,] 5.609281 5.624638
[10,] 5.742384 5.764334
GAMLSS-RS iteration 1: Global Deviance = 670.3158
GAMLSS-RS iteration 2: Global Deviance = 565.6209
GAMLSS-RS iteration 3: Global Deviance = 560.7827
GAMLSS-RS iteration 4: Global Deviance = 560.5387
GAMLSS-RS iteration 5: Global Deviance = 560.5263
GAMLSS-RS iteration 6: Global Deviance = 560.5259
[,1]     [,2]
[1,] 9.345836 9.345927
[,1]     [,2]
[1,] 3.27168 3.158211
[,1]     [,2]
[1,] 581.2175 581.2175
[,1]
[1,] 9.345836
[,1]     [,2]
[1,] 3.27168 3.158211
[,1]     [,2]
[1,] 4.544514 4.544513
[2,] 4.678921 4.678920
[3,] 4.813387 4.813386
[4,] 4.947932 4.947932
[5,] 5.082576 5.082576
[6,] 5.217339 5.217339
[7,] 5.352240 5.352241
[8,] 5.487299 5.487300
[9,] 5.622536 5.622537
[10,] 5.757956 5.757957
GAMLSS-RS iteration 1: Global Deviance = 554.7323
GAMLSS-RS iteration 2: Global Deviance = 554.7323
user  system elapsed
0.243   0.000   0.244
GAMLSS-RS iteration 1: Global Deviance = 554.7323
GAMLSS-RS iteration 2: Global Deviance = 554.7323
user  system elapsed
0.032   0.000   0.033
df      AIC
g6  12.8014 580.3351
g61 12.8014 580.3351
user  system elapsed
0.008   0.000   0.008
user  system elapsed
0.004   0.000   0.004
df      AIC
m61 12.8014 580.3351
m6  12.8014 580.6976
[,1]    [,2]    [,3]    [,4]
[1,] 11.8014 11.8014 11.8014 11.8014
[,1]      [,2]      [,3]      [,4]
[1,] 0.9241996 0.9241998 0.8667171 0.8667171
[,1]     [,2]     [,3]     [,4]     [,5]     [,6]
[1,] 580.3351 580.3351 580.6976 580.6976 580.3351 580.3351
[,1]     [,2]
[1,] 4.522962 4.522961
[2,] 4.656743 4.656743
[3,] 4.791038 4.791038
[4,] 4.925999 4.925999
[5,] 5.061780 5.061780
[6,] 5.198533 5.198533
[7,] 5.336412 5.336412
[8,] 5.475569 5.475569
[9,] 5.616157 5.616157
[10,] 5.758269 5.758269
```

gamlss.util documentation built on May 2, 2019, 7:10 a.m.