calc.condlogLik: Conditional log-likelihood for a fitted model


View source: R/calclogLfunctions.R

Description

Calculates the conditional log-likelihood for a set of parameter estimates from a fitted model, where everything is treated as "fixed effects" including latent variables, row effects, and so on.

Usage

calc.condlogLik(y, X = NULL, family, trial.size = 1, lv.coefs, 
	X.coefs = NULL, row.coefs = NULL, row.ids = NULL,
	offset = NULL, lv = NULL, cutoffs = NULL, powerparam = NULL)

Arguments

y

The response matrix the model was fitted to.

X

The model matrix used in the model. Defaults to NULL, in which case it is assumed no model matrix was used.

family

Either a single element, or a vector of length equal to the number of columns in y. The former assumes all columns of y come from this distribution. The latter option allows for different distributions for each column of y. Elements can be one of "binomial" (with probit link), "poisson" (with log link), "negative.binomial" (with log link), "normal" (with identity link), "lnormal" for lognormal (with log link), "tweedie" (with log link), "exponential" (with log link), "gamma" (with log link), "beta" (with logit link), "ordinal" (cumulative probit regression).

Please see about.distributions for information on distributions available in boral overall.

trial.size

Either equal to a single element, or a vector of length equal to the number of columns in y. If a single element, then all columns assumed to be binomially distributed will have trial size set to this. If a vector, different trial sizes are allowed in each column of y. The argument is ignored for all columns not assumed to be binomially distributed. Defaults to 1, i.e. Bernoulli distribution.
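As a rough illustration of how per-column trial sizes enter a binomial log-likelihood with a probit link, the base R sketch below uses made-up responses and linear predictors (`y_demo`, `eta`, and `trial_size` are hypothetical and not taken from any fitted model). It is not the package's internal code.

```r
## Hypothetical setup: 3 binomial columns with different numbers of trials
trial_size <- c(1, 5, 10)
eta <- matrix(c(-0.5, 0.2,  1.0,
                 0.3, 0.0, -1.2), nrow = 2, byrow = TRUE)  # linear predictors
y_demo <- matrix(c(0, 3, 9,
                   1, 2, 1), nrow = 2, byrow = TRUE)       # observed successes

## Expand the trial sizes so that column j of size_mat holds trial_size[j]
size_mat <- matrix(trial_size, nrow = nrow(y_demo), ncol = ncol(y_demo),
                   byrow = TRUE)

## Probit link: the success probability is pnorm(eta)
loglik_comp <- matrix(dbinom(y_demo, size = size_mat, prob = pnorm(eta),
                             log = TRUE), nrow = nrow(y_demo))
loglik_comp
```

With `trial_size = 1` in the first column, that column reduces to a Bernoulli log-likelihood.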

lv.coefs

The column-specific intercept, coefficient estimates relating to the latent variables, and dispersion parameters from the fitted model.

X.coefs

The coefficient estimates relating to X from the fitted model. Defaults to NULL, in which case it is assumed there are no covariates in the model.

row.coefs

Row effect estimates for the fitted model. The conditional likelihood is defined conditional on these estimates, i.e., they are also treated as "fixed effects". Defaults to NULL, in which case it is assumed there are no row effects in the model.

row.ids

A matrix with the number of rows equal to the number of rows in y, and the number of columns equal to the number of row effects to be included in the model. Element (i,j) indicates the cluster ID of row i in y for random effect j; please see the boral function for details. Defaults to NULL, so that if row.coefs = NULL then the argument is ignored, otherwise if row.coefs is supplied then row.ids = matrix(1:nrow(y), ncol = 1), i.e., a single row effect unique to each row. An internal check is done to see that row.coefs and row.ids are consistent in terms of the arguments supplied.
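The following base R sketch shows what a row.ids matrix might look like, both for the default case and for a hypothetical design with a second, site-level effect. The names `row_ids_nested` and `row_coefs` and the simulated values are illustrative assumptions. Summing the effects into the linear predictor reflects the usual additive structure and is assumed here, not taken from the package source.

```r
set.seed(42)
n <- 6

## Default when row.coefs is supplied without row.ids: one effect per row
row_ids_default <- matrix(1:n, ncol = 1)

## Hypothetical design: a per-row effect plus a site effect (3 sites, 2 rows each)
row_ids_nested <- cbind(ID1 = 1:n, ID2 = rep(1:3, each = 2))

## Matching row effects: one vector per column of row.ids,
## with length equal to the number of distinct cluster IDs in that column
row_coefs <- list(ID1 = rnorm(n), ID2 = rnorm(3))

## Assumed additive contribution of the row effects to each row's linear predictor
row_contrib <- row_coefs$ID1[row_ids_nested[, "ID1"]] +
    row_coefs$ID2[row_ids_nested[, "ID2"]]
row_contrib
```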

offset

A matrix with the same dimensions as the response matrix y, specifying an a-priori known component to be included in the linear predictor during fitting. Defaults to NULL.
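For instance, an offset matrix could carry the log of a known sampling effort for each row, repeated across all columns. The sketch below builds such a matrix in base R; `effort` and the dimensions are hypothetical values chosen for illustration.

```r
n <- 4
p <- 3
effort <- c(1, 2, 2, 4)   # hypothetical sampling effort for each row of y

## Every column gets the same row-wise offset, log(effort);
## matrix() fills column by column, recycling log(effort) for each column
offset_mat <- matrix(log(effort), nrow = n, ncol = p)
offset_mat
```

The result has the same dimensions as an n x p response matrix, as the argument requires.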

lv

Latent variables "estimates" from the fitted model, which the conditional likelihood is based on. Defaults to NULL, in which case it is assumed no latent variables were included in the model.

cutoffs

Common cutoff estimates from the fitted model when any of the columns of y are ordinal responses. Defaults to NULL.
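To illustrate how common cutoffs turn a linear predictor into category probabilities under cumulative probit regression, the sketch below assumes the parameterization P(y <= k) = pnorm(cutoff_k - eta). That sign convention is an assumption made for this example, and the values of `cutoffs`, `eta`, and `y_obs` are made up.

```r
cutoffs <- c(-1, 0, 1.5)  # K - 1 = 3 ordered cutoffs, so K = 4 response levels
eta <- 0.4                # hypothetical linear predictor for one element of y

## Cumulative probabilities P(y <= k) for k = 1, ..., K (the last is 1)
cum_prob <- c(pnorm(cutoffs - eta), 1)

## Probability of each level is the difference of successive cumulative values
level_prob <- diff(c(0, cum_prob))

## Log-likelihood contribution of an observed level, here level 3
y_obs <- 3
loglik_elem <- log(level_prob[y_obs])
loglik_elem
```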

powerparam

Common power parameter from the fitted model when any of the columns of y are tweedie responses. Defaults to NULL.

Details

For an n x p response matrix y, suppose we fit a model with one or more latent variables. If we denote the latent variables by \bm{z}_i, i = 1, \ldots, n, then the conditional log-likelihood is given by

\log(f) = \sum_{i=1}^n \sum_{j=1}^p \log \{ f(y_{ij} | \bm{z}_i, \bm{\theta}_j, \beta_{0j}, \ldots) \},

where f(y_{ij} | \cdot) is the assumed distribution for column j, \bm{z}_i are the latent variables and \bm{\theta}_j are the coefficients relating to them, \beta_{0j} are column-specific intercepts, and \ldots denotes anything else included in the model, such as row effects, regression coefficients related to X and traits, and so on.
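The double sum can be sketched in base R for the simplest case: Poisson columns with a log link, latent variables, and column intercepts only. All quantities below are simulated for illustration (`z`, `beta0`, `theta`, and `y_sim` are hypothetical names). This is not the package's implementation, which also handles other families, covariates, row effects, and offsets.

```r
set.seed(123)
n <- 10
p <- 4
num_lv <- 2

z <- matrix(rnorm(n * num_lv), nrow = n)      # latent variables z_i (n x num_lv)
beta0 <- rnorm(p)                             # column-specific intercepts
theta <- matrix(rnorm(p * num_lv), nrow = p)  # latent variable coefficients theta_j

## Linear predictor: eta_ij = beta0_j + z_i' theta_j
eta <- matrix(beta0, nrow = n, ncol = p, byrow = TRUE) + z %*% t(theta)

## Simulate Poisson responses so there is something to evaluate
y_sim <- matrix(rpois(n * p, lambda = exp(eta)), nrow = n)

## Element-wise log-density, then sum over all i and j
loglik_comp <- matrix(dpois(y_sim, lambda = exp(eta), log = TRUE), nrow = n)
loglik <- sum(loglik_comp)
loglik
```

Here `loglik_comp` plays the role of the element-wise terms inside the double sum, and `loglik` is their total.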

The key difference between this and the marginal likelihood (see calc.marglogLik) is that the conditional likelihood treats everything as "fixed effects" i.e., conditions on them. These include the latent variables \bm{z}_i and other parameters that were included in the model as random effects e.g., row effects if row.eff = "random", regression coefficients related to X if traits were included in the model, and so on.
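For contrast, a marginal log-likelihood can be sketched as follows, under the simplifying assumption that the latent variables are the only random effects being integrated out and that f(\bm{z}_i) denotes their assumed distribution. This is an illustrative form only; see calc.marglogLik for the definition actually used there.

```latex
\log(f_{\mathrm{marg}}) = \sum_{i=1}^n \log \int \prod_{j=1}^p
    f(y_{ij} | \bm{z}_i, \bm{\theta}_j, \beta_{0j}, \ldots) \, f(\bm{z}_i) \, d\bm{z}_i
```

The conditional version drops the integral and the density f(\bm{z}_i), and instead plugs in a fixed value of \bm{z}_i for each row.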

The conditional DIC, WAIC, EAIC, and EBIC returned from get.measures are based on the conditional likelihood calculated from this function. Additionally, get.measures returns the conditional likelihood evaluated at all MCMC samples of a fitted model.

Value

A list with the following components:

logLik

Value of the conditional log-likelihood.

logLik.comp

A matrix of the log-likelihood values for each element in y,
such that sum(logLik.comp) = logLik.

Author(s)

Francis K.C. Hui, with contributions from Wade Blanchard

Maintainer: Francis Hui

See Also

calc.logLik.lv0 to calculate the conditional/marginal log-likelihood for a model with no latent variables; calc.marglogLik for calculation of the marginal log-likelihood.

Examples

## Not run: 
## NOTE: The values below MUST NOT be used in a real application;
## they are only used here to make the examples run quick!!!
example_mcmc_control <- list(n.burnin = 10, n.iteration = 100, 
     n.thin = 1)

testpath <- file.path(tempdir(), "jagsboralmodel.txt")


library(boral)   ## Provides boral(), get.mcmcsamples() and calc.condlogLik()
library(mvabund) ## Load a dataset from the mvabund package
data(spider)
y <- spider$abun
n <- nrow(y)
p <- ncol(y)

## Example 1 - model with 2 latent variables, site effects, 
## 	and no environmental covariates
spiderfit_nb <- boral(y, family = "negative.binomial", 
    lv.control = list(num.lv = 2), row.eff = "fixed", 
    save.model = TRUE, mcmc.control = example_mcmc_control,
    model.name = testpath)

## Extract all MCMC samples
fit_mcmc <- get.mcmcsamples(spiderfit_nb) 
mcmc_names <- colnames(fit_mcmc)

## Find the posterior medians
coef_mat <- matrix(apply(fit_mcmc[,grep("lv.coefs",mcmc_names)],
    2,median),nrow=p)
site_coef <- list(ID1 = apply(fit_mcmc[,grep("row.coefs.ID1", mcmc_names)],
    2,median))
lvs_mat <- matrix(apply(fit_mcmc[,grep("lvs",mcmc_names)],2,median),nrow=n)

## Calculate the conditional log-likelihood at the posterior median
calc.condlogLik(y, family = "negative.binomial", 
    lv.coefs = coef_mat, row.coefs = site_coef, lv = lvs_mat)


## Example 2 - model with no latent variables and environmental covariates
X <- scale(spider$x)
spiderfit_nb2 <- boral(y, X = X, family = "negative.binomial", 
    save.model = TRUE, mcmc.control = example_mcmc_control, 
    model.name = testpath)

## Extract all MCMC samples
fit_mcmc <- get.mcmcsamples(spiderfit_nb2) 
mcmc_names <- colnames(fit_mcmc)

## Find the posterior medians
coef_mat <- matrix(apply(fit_mcmc[,grep("lv.coefs",mcmc_names)],
    2,median),nrow=p)
X_coef_mat <- matrix(apply(fit_mcmc[,grep("X.coefs",mcmc_names)],
    2,median),nrow=p)

## Calculate the log-likelihood at the posterior median
calc.condlogLik(y, X = X, family = "negative.binomial", 
    lv.coefs =  coef_mat, X.coefs = X_coef_mat)

## End(Not run)

boral documentation built on Jan. 29, 2020, 1:06 a.m.