# loglikelihood: computes the model log likelihood useful for estimation of... In bclust: Bayesian Hierarchical Clustering Using Spike and Slab Models

## Description

The function is useful for deriving the maximum likelihood estimates of the model parameters.

## Usage

 ```1 2``` ```loglikelihood(x.mean,x.css,repno,transformed.par, effect.family="gaussian",var.select=TRUE) ```

## Arguments

 `x.mean` The mean matrix of the clustering types from the meancss function. `x.css` The corrected sum of squares matrix of the clustering types from the meancss function. `repno` The vector containing the number of replications of each clustering type corresponding to the each row of `x.mean` and `x.css`, from the meancss function. `transformed.par` The vector of transformed model parameters that the data likelihood will be evaluated at. The transformation is the log for the variance parameters, the identity for the mean, and the logit for the proportions. The length of the vector depends on the chosen `effect.family` and `var.select`. `effect.family` Distribution family of the disappearing random components. Choices are "gaussian" or "alaplace" allowing Gaussian or asymmetric Laplace family, respectively. `var.select` A logical value, `TRUE` for fitting models that define spike-and-slab in variable level, thus allowing Bayesian variable selection.

## Details

Sometimes estimation of the model parameters is difficult, always check the convergence of the optimisation algorithm. The asymmetric Laplace model, `effect.family="alaplace"`, is often more difficult to optimise than `effect.family="gaussian"`.

If data are standardised (having general mean zero and general variance one) the log likelihood function is usually maximised over values between -5 and 5.

The `transformed.par` is a vector of transformed model parameters having length 5 up to 7 depending on the chosen model.

The `transformed.par` is log s2, log s2_h, log s2_t, m, logit p, logit q a vector of length 6 when using `effect.family = "gaussian"` and `var.select=TRUE`,

and is log s2, log s2_h, log s2_tL, log s2_tR, m, logit p, logit q a vector of length 7

for `effect.family="alaplace"` and `var.select=TRUE`.

When `var.select=FALSE` the q parameter is dropped, yielding a vector of length 5 for

`effect.family="gaussian"` and a vector of length 6

for `effect.family="alaplace"`.

We assumed a Bayesian linear model being

y_{vctr}=m+h_{vct}+d_{v}*g_{vc}*t_{vc}+e_{vctr}

where y_{vctr} is the available data on variable v, cluster(or class) c, type t, and replicate r; h_{vct} is the between-type error, t_{vc} is the disappearing random component controlled by the Bernoulli variables d_{v} with success probability q and g_{vc} with success probability p; and e_{vctr} is the between-replicate error. The types inside a cluster (or class) share the same t_{vc}, but may arise with a different h_{vct}.

The model parameters has natural interpretations, s2 is the between replicate error variance; s2_h is the variance of between-type error; s2_t is the variance of the disappearing random component which is decomposed to s2_tL, s2_tR the left and the right tail variances if the model is asymmetric Laplace; m is the general level; p is the proportion of active variable-cluster (or variable-class) combinations, and q is the proportion of the active variables. For more details see Vahid Partovi Nia and Anthony C. Davison (2012)

## References

Vahid Partovi Nia and Anthony C. Davison (2012). High-Dimensional Bayesian Clustering with Variable Selection: The R Package bclust. Journal of Statistical Software, 47(5), 1-22. URL http://www.jstatsoft.org/v47/i05/

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19``` ```data(gaelle) gaelle.id<-rep(1:14,c(3,rep(4,13))) # first 3 rows replication of ColWT, 4 for the rest mc.gaelle<-meancss(gaelle,gaelle.id) optimfunc<-function(theta) { -loglikelihood(x.mean=mc.gaelle\$mean,x.css=mc.gaelle\$css, repno=mc.gaelle\$repno,transformed.par=theta)#compute - log likelihood } transpar<-optim(rep(0,6),optimfunc,method="BFGS")\$par #gives argmin(-loglikelihood) #put a vector of correct length for the evaluation of the likelihood plot(bclust(gaelle,transformed.par=transpar)) ```