gamlssML: Maximum Likelihood estimation of a simple GAMLSS model

gamlssML    R Documentation

Description

The function gamlssML() fits a gamlss.family distribution to a single data set using a non-linear maximisation algorithm in R. It is relevant only when there are no explanatory variables.

The function gamlssMLpred() is similar to gamlssML(), but it also saves the predictive global deviance for the new data. The new data in gamlssMLpred() can be supplied through the argument newdata or by defining the factor rand. rand should be a binary factor splitting the original data set into a training set (value 1) and a validation/test set (value 2); see also gamlssVGD.

Usage

gamlssML(formula, family = NO, weights = NULL, mu.start = NULL, 
 sigma.start = NULL, nu.start = NULL, tau.start = NULL, 
 mu.fix = FALSE, sigma.fix = FALSE, nu.fix = FALSE, 
 tau.fix = FALSE, data, start.from = NULL, ...)

gamlssMLpred(response = NULL, data = NULL, family = NO, 
 rand = NULL, newdata = NULL, ...) 

Arguments

formula, response

a vector of data to which a gamlss.family distribution is to be fitted or (only for the function gamlssML()) a formula, for example y~1. The formula should have no explanatory variables, because they are ignored.

family

a gamlss.family object, which is used to define the distribution and the link functions of the various parameters. The distribution families supported by gamlssML() can be found in gamlss.family.

weights

a vector of weights. Weights can be used to weight out observations (as with subset) or for a weighted likelihood analysis, where the contribution of each observation to the likelihood differs according to its weight. The length of weights must equal the number of observations in the data. By default, every weight is set to one. To set the weights to a vector, say w, use weights=w.

mu.start

a scalar initial value for the location parameter mu, e.g. mu.start=4

sigma.start

a scalar initial value for the scale parameter sigma, e.g. sigma.start=1

nu.start

a scalar initial value for the parameter nu, e.g. nu.start=3

tau.start

a scalar initial value for the parameter tau, e.g. tau.start=3

mu.fix

whether the mu parameter should be kept fixed during the fitting process, e.g. mu.fix=FALSE

sigma.fix

whether the sigma parameter should be kept fixed during the fitting process, e.g. sigma.fix=FALSE

nu.fix

whether the nu parameter should be kept fixed during the fitting process, e.g. nu.fix=FALSE

tau.fix

whether the tau parameter should be kept fixed during the fitting process, e.g. tau.fix=FALSE

data

a data frame containing the variable y, e.g. data=aids. If this is missing, the variable should be on the search list.

start.from

a gamlss object from which to start the fitting, or a vector of starting values with one element per parameter of the distribution

rand

For gamlssMLpred() a factor with values 1 (for fitting) and 2 (for predicting).

newdata

The prediction data set (validation or test).

...

for extra arguments
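
The two uses of weights described above can be illustrated without gamlss at all. The following is a minimal base R sketch, not the package's implementation: it assumes a normal distribution, a hypothetical 0/1 weight vector w, and a hand-written weighted negative log-likelihood wnll, and shows that giving observations weight zero leads to the same estimates as dropping them.

```r
set.seed(3)
y <- rnorm(200, mean = 5, sd = 1.5)
w <- rep(c(1, 0), each = 100)  # hypothetical weights: ignore the last 100 values

# Weighted negative log-likelihood of a normal distribution.
# par[1] is mu, par[2] is log(sigma) so that sigma stays positive.
wnll <- function(par, x, w) {
  -sum(w * dnorm(x, mean = par[1], sd = exp(par[2]), log = TRUE))
}

start <- c(mean(y), log(sd(y)))

# Fit with zero weights on the second half of the data ...
fit_w <- optim(start, wnll, x = y, w = w)

# ... and fit on the corresponding subset with unit weights.
keep <- w == 1
fit_sub <- optim(start, wnll, x = y[keep], w = rep(1, sum(keep)))

# The two sets of estimates (mu, log sigma) should agree closely.
rbind(weighted = fit_w$par, subset = fit_sub$par)
```

Non-integer weights would instead scale each observation's contribution to the log-likelihood, which is the weighted likelihood analysis mentioned in the weights argument.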

Details

The function gamlssML() fits a gamlss.family distribution to a single data set using non-linear maximisation. Internally it uses the function MLE(), which is a copy of the mle() function of package stats4. For large data sets, gamlssML() can be faster than the equivalent gamlss() function, which is designed for regression-type models.
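
To give a feel for what this kind of fit involves, here is a small sketch that calls stats4::mle() directly on a normal sample. It is only an illustration under stated assumptions, not the code inside gamlssML(): the function negloglik and the choice to optimise sigma on the log scale are introduced here for the example.

```r
library(stats4)  # part of the standard R distribution

set.seed(1)
y <- rnorm(500, mean = 10, sd = 2)

# Negative log-likelihood of a normal distribution.
# sigma is optimised on the log scale so that it remains positive.
negloglik <- function(mu, log_sigma) {
  -sum(dnorm(y, mean = mu, sd = exp(log_sigma), log = TRUE))
}

# Non-linear maximisation of the likelihood from data-based starting values.
fit <- mle(negloglik, start = list(mu = mean(y), log_sigma = log(sd(y))))

est <- coef(fit)
mu_hat <- unname(est["mu"])
sigma_hat <- unname(exp(est["log_sigma"]))

# Global deviance: minus twice the maximised log-likelihood.
global_deviance <- -2 * as.numeric(logLik(fit))

c(mu = mu_hat, sigma = sigma_hat, deviance = global_deviance)
```

Because there are no explanatory variables, the whole problem reduces to optimising over one value per distribution parameter, which is why a direct optimiser can be quicker than an iterative regression fit.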

The function gamlssMLpred() uses gamlssML() to fit the model, then uses predict.gamlssML() to predict for the new data and saves, for the prediction set, (i) the deviance increments, (ii) the global deviance and (iii) the residuals.
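
The quantities saved for the prediction set can be sketched in base R. This is an illustrative approximation rather than the package's code: it assumes a negative binomial model fitted with optim() and dnbinom(), and the names y_train, y_val, dev_incr and vgd are introduced only for this example. Note that dnbinom() uses size, which corresponds to 1/sigma in the NBI parameterisation.

```r
set.seed(2)
y <- rnbinom(2000, mu = 3, size = 2)

# Split into training (1) and validation (2) sets, as the rand factor does.
rand <- sample(2, length(y), replace = TRUE, prob = c(0.6, 0.4))
y_train <- y[rand == 1]
y_val <- y[rand == 2]

# Negative log-likelihood with both parameters on the log scale.
negloglik <- function(par, x) {
  -sum(dnbinom(x, mu = exp(par[1]), size = exp(par[2]), log = TRUE))
}

# Fit on the training set only.
fit <- optim(c(log(mean(y_train)), 0), negloglik, x = y_train)
mu_hat <- exp(fit$par[1])
size_hat <- exp(fit$par[2])

# Deviance increments: -2 * log-likelihood of each validation observation
# evaluated at the parameters estimated from the training set.
dev_incr <- -2 * dnbinom(y_val, mu = mu_hat, size = size_hat, log = TRUE)

# Validation global deviance: the sum of the increments.
vgd <- sum(dev_incr)
vgd
```

A smaller validation global deviance indicates a distribution that predicts the held-out data better, which is how competing families can be compared.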

Value

Returns a gamlssML object, which behaves like a gamlss fitted object.

Author(s)

Mikis Stasinopoulos, Bob Rigby, Vlasis Voudouris and Majid Djennad

References

Rigby, R. A. and Stasinopoulos, D. M. (2005). Generalized additive models for location, scale and shape (with discussion). Appl. Statist., 54, part 3, pp 507-554.

Rigby, R. A., Stasinopoulos, D. M., Heller, G. Z., and De Bastiani, F. (2019). Distributions for modeling location, scale, and shape: Using GAMLSS in R. Chapman and Hall/CRC. An older version can be found at https://www.gamlss.com/.

Stasinopoulos, D. M. and Rigby, R. A. (2007). Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, https://www.jstatsoft.org/v23/i07/.

Stasinopoulos, D. M., Rigby, R. A., Heller, G., Voudouris, V., and De Bastiani, F. (2017). Flexible Regression and Smoothing: Using GAMLSS in R. Chapman and Hall/CRC.

(see also https://www.gamlss.com/).

See Also

gamlss.family, gamlss

Examples

library(gamlss)
#-------- negative binomial 1000 observations
y <- rNBI(1000)
system.time(m1 <- gamlss(y~1, family=NBI))
system.time(m1a <- gamlss(y~1, family=NBI, trace=FALSE))
system.time(m11 <- gamlssML(y, family=NBI))
AIC(m1, m1a, m11, k=0)
# neg. binomial   n=10000
y <- rNBI(10000)
system.time(m1 <- gamlss(y~1, family=NBI))
system.time(m1a <- gamlss(y~1, family=NBI, trace=FALSE))
system.time(m11 <- gamlssML(y, family=NBI))
AIC(m1, m1a, m11, k=0)
# binomial type data 
data(aep)
m1 <- gamlssML(aep$y, family=BB) # ok
m2 <- gamlssML(y, data=aep, family=BB) # ok
m3 <- gamlssML(y~1, data=aep, family=BB) # ok 
m4 <- gamlssML(aep$y~1, family=BB) # ok
AIC(m1,m2,m3,m4)
## Not run: 
#-----------------------------------------------------------
# neg. binomial   n=10000
y<- rNBI(10000)
rand <- sample(2, length(y), replace=TRUE, prob=c(0.6,0.4))
table(rand)
   Y <- subset(y, rand==1)
YVal <- subset(y, rand==2) 
length(Y)
length(YVal) 
da1 <- data.frame(y=y)
dim(da1)
da2 <- data.frame(y=Y)
dim(da2)
danew <- data.frame(y=YVal)
# using gamlssVGD to fit the models
g1 <- gamlssVGD(y~1, rand=rand, family=NBI, data=da1)
g2 <- gamlssVGD(y~1, family=NBI, data=da2, newdata=danew)
AIC(g1,g2)
VGD(g1,g2)
# using gamlssMLpred to fit the models
p1 <- gamlssMLpred(y, rand=rand, family=NBI)
p2 <- gamlssMLpred(Y, family=NBI, newdata=YVal)
# AIC and VGD should produce identical results
AIC(p1,p2,g1,g2)
VGD(p1,p2, g1,g2)
# the fitted residuals
wp(p1, ylim.all=1)
# the prediction residuals 
wp(resid=p1$residVal, ylim.all=.5)
#-------------------------------------------------------------
# choosing between distributions
p2<-gamlssMLpred(y, rand=rand, family=PO)
p3<-gamlssMLpred(y, rand=rand, family=PIG)
p4<-gamlssMLpred(y, rand=rand, family=BNB)
AIC(p1, p2, p3, p4)
VGD(p1, p2, p3, p4)
#--------------------------------------------------

## End(Not run)
 

gamlss documentation built on May 29, 2024, 6:08 a.m.