gamlssML: Maximum Likelihood estimation of a simple GAMLSS model
In gamlss: Generalized Additive Models for Location Scale and Shape

gamlssML

R Documentation

Maximum Likelihood estimation of a simple GAMLSS model

Description

The function gamlssML() fits a gamlss.family distribution to single data set using a non linear maximisation algorithm in R. This is relevant only when explanatory variables do not exist.

The function gamlssMLpred() is similar to gamlssML() but it saves the predictive global deviance for the newdata. The new data in gamlssMLpred() can be given with the arguments newdata or defining the factor rand. rand should be a binary factor rand splitting the original data set into a training set (value 1) and a validation/test set (values 2), see also gamlssVGD

Usage

gamlssML(formula, family = NO, weights = NULL, mu.start = NULL, 
 sigma.start = NULL, nu.start = NULL, tau.start = NULL, 
 mu.fix = FALSE, sigma.fix = FALSE, nu.fix = FALSE, 
 tau.fix = FALSE, data, start.from = NULL, ...)

gamlssMLpred(response = NULL, data = NULL, family = NO, 
 rand = NULL, newdata = NULL, ...)

Arguments

`formula`, `response`	a vector of data requiring the fit of a `gamlss.family` distribution or (only for the function `gamlssML`) a formula, for example, `y~1`, with no explanatory variables because they are ignored).
`family`	`gamlss.family` object, which is used to define the distribution and the link functions of the various parameters. The distribution families supported by `gamlssML()` can be found in `gamlss.family`
`weights`	a vector of weights. Here weights can be used to weight out observations (like in `subset`) or for a weighted likelihood analysis where the contribution of the observations to the likelihood differs according to `weights`. The length of `weights` must be the same as the number of observations in the data. By default, the weight is set to one. To set weights to vector say `w` use `weights=w`
`mu.start`	a scalar of initial values for the location parameter `mu` e.g. `mu.start=4`
`sigma.start`	a scalar of initial values for the scale parameter `sigma` e.g. `sigma.start=1`
`nu.start`	scalar of initial values for the parameter `nu` e.g. `nu.start=3`
`tau.start`	scalar of initial values for the parameter `tau` e.g. `tau.start=3`
`mu.fix`	whether the mu parameter should be kept fixed in the fitting processes e.g. `mu.fix=FALSE`
`sigma.fix`	whether the sigma parameter should be kept fixed in the fitting processes e.g. `sigma.fix=FALSE`
`nu.fix`	whether the nu parameter should be kept fixed in the fitting processes e.g. `nu.fix=FALSE`
`tau.fix`	whether the tau parameter should be kept fixed in the fitting processes e.g. `tau.fix=FALSE`
`data`	a data frame containing the variable `y`, e.g. `data=aids`. If this is missing, the variable should be on the search list.
`start.from`	a gamlss object to start from the fitting or vector of length as many parameters in the distribution
`rand`	For `gamlssMLpred()` a factor with values 1 (for fitting) and 2 (for predicting).
`newdata`	The prediction data set (validation or test).
`...`	for extra arguments

Details

The function gamlssML() fits a gamlss.family distribution to a single data set is using a non linear maximisation. in fact it uses the internal function MLE() which is a copy of the mle() function of package stat4. The function gamlssML() could be for large data faster than the equivalent gamlss() function which is designed for regression type of models.

The function gamlssMLpred() uses the function gamlssML() to fit the model but then uses predict.gamlssML() to predict for new data and saves the the prediction i) deviance increments, ii) global deviance iii) residuals.

Value

Returns a gamlssML object which behaves like a gamlss fitted objected

Author(s)

Mikis Stasinopoulos, Bob Rigby, Vlasis Voudouris and Majid Djennad

References

Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.

Rigby, R. A., Stasinopoulos, D. M., Heller, G. Z., and De Bastiani, F. (2019) Distributions for modeling location, scale, and shape: Using GAMLSS in R, Chapman and Hall/CRC. An older version can be found in https://www.gamlss.com/.

Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, https://www.jstatsoft.org/v23/i07/.

Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.

(see also https://www.gamlss.com/).

Examples

#-------- negative binomial 1000 observations
y<- rNBI(1000)
  system.time(m1<-gamlss(y~1, family=NBI))
  system.time(m1a<-gamlss(y~1, family=NBI, trace=FALSE))
system.time(m11<-gamlssML(y, family=NBI))
AIC(m1,m1a,m11, k=0)
# neg. binomial   n=10000
 y<- rNBI(10000)
 system.time(m1<-gamlss(y~1, family=NBI))
 system.time(m1a<-gamlss(y~1, family=NBI, trace=FALSE))
system.time(m11<-gamlssML(y, family=NBI))
AIC(m1,m1a,m11, k=0)
# binomial type data 
data(aep)
m1 <- gamlssML(aep$y, family=BB) # ok
m2 <- gamlssML(y, data=aep, family=BB) # ok
m3 <- gamlssML(y~1, data=aep, family=BB) # ok 
m4 <- gamlssML(aep$y~1, family=BB) # ok
AIC(m1,m2,m3,m4)
## Not run: 
#-----------------------------------------------------------
# neg. binomial   n=10000
y<- rNBI(10000)
rand <- sample(2, length(y), replace=TRUE, prob=c(0.6,0.4))
table(rand)
   Y <- subset(y, rand==1)
YVal <- subset(y, rand==2) 
length(Y)
length(YVal) 
da1 <- data.frame(y=y)
dim(da1)
da2 <- data.frame(y=Y)
dim(da2)
danew <- data.frame(y=YVal)
# using gamlssVGD to fit the models
g1 <- gamlssVGD(y~1, rand=rand, family=NBI, data=da1)
g2 <- gamlssVGD(y~1, family=NBI, data=da2, newdata=dan)
AIC(g1,g2)
VGD(g1,g2)
# using gamlssMLpred to fit the models
p1 <- gamlssMLpred(y, rand=rand, family=NBI)
p2 <- gamlssMLpred(Y, family=NBI, newdata=YVal)
# AIC and VGD should produce identical results
AIC(p1,p2,g1,g2)
VGD(p1,p2, g1,g2)
# the fitted residuals
wp(p1, ylim.all=1)
# the prediction residuals 
wp(resid=p1$residVal, ylim.all=.5)
#-------------------------------------------------------------
# chossing between distributions
p2<-gamlssMLpred(y, rand=rand, family=PO)
p3<-gamlssMLpred(y, rand=rand, family=PIG)
p4<-gamlssMLpred(y, rand=rand, family=BNB)
AIC(p1, p2, p3, p4)
VGD(p1, p2, p3, p4)
#--------------------------------------------------

## End(Not run)

gamlss documentation built on Aug. 21, 2025, 5:56 p.m.