blca.vb: Bayesian Latent Class Analysis via a variational Bayes...


View source: R/blca.vb.R

Description

Latent class analysis (LCA) attempts to find G hidden classes in binary data X. blca.vb uses a variational EM algorithm to find the distribution which best approximates the parameters' true distribution.

Usage

blca.vb(X, G, alpha = 1, beta = 1, delta = 1,
        start.vals = c("single", "across"), counts.n = NULL,
        iter = 500, restarts = 1, verbose = TRUE, conv = 1e-06,
        small = 1e-100)

Arguments

X

The data matrix. This may take one of several forms, see data.blca.

G

The number of classes to fit in the latent class analysis.

alpha, beta

The prior values for the data conditional on group membership. These may take several forms: a single value, recycled across all groups and columns; a vector of length G or M (the number of columns in the data); or a G x M matrix specifying each prior value separately (see the sketches following this list). Defaults to 1, i.e., a uniform prior, for each value.

delta

Prior values for the mixture components in the model. Defaults to 1, i.e., a uniform prior. May be single or vector valued (of length G).

start.vals

Denotes how class membership is to be assigned during the initial step of the algorithm. Two character values may be chosen: "single", which randomly assigns data points exclusively to one class, or "across", which assigns class membership via runif. Alternatively, class membership may be pre-specified, either as a vector of class memberships or as a matrix of probabilities (see the sketches following this list). Defaults to "single".

counts.n

If data patterns have already been counted, a data matrix consisting of each unique data pattern can be supplied to the function, together with a vector counts.n giving the number of times each pattern occurs in the data (see the sketches following this list).

iter

The maximum number of iterations that the algorithm runs over. Will stop earlier if the algorithm converges.

restarts

restarts determines how many times the algorithm is run with different starting values. Parameter estimates from the run which achieved the highest log-posterior are returned. If starting values are supplied, these are used for the first run, after which random starting points are used. Defaults to 1.

verbose

Logical valued. If TRUE, the log-posterior from each run is printed.

conv

Convergence criterion, i.e., how small the increase in the log-posterior must become before the algorithm is deemed to have converged. Set relative to the size of the data matrix.

small

To ensure numerical stability a small constant is added to certain parameter estimates. Defaults to 1e-100.
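
The following sketches illustrate the alternative formats accepted by alpha and beta, start.vals and counts.n, as referenced above. They assume a binary data matrix x with G = 2 classes (as generated in the Examples below); the particular prior values and random initialisations are arbitrary and purely illustrative, and the helper objects (z, z.prob, key, pattern, pattern.counts) are ad hoc rather than part of the package.

## alpha, beta: single value, vector of length G, or G x M matrix
blca.vb(x, 2, alpha = 2, beta = 2)
blca.vb(x, 2, alpha = c(2, 1), beta = c(1, 2))
blca.vb(x, 2, alpha = matrix(1, 2, ncol(x)), beta = matrix(1, 2, ncol(x)))

## start.vals: pre-specified class membership
z <- sample(1:2, nrow(x), replace = TRUE)             ## vector of class labels
blca.vb(x, 2, start.vals = z)
z.prob <- matrix(runif(nrow(x) * 2), ncol = 2)
z.prob <- z.prob / rowSums(z.prob)                    ## rows sum to one
blca.vb(x, 2, start.vals = z.prob)

## counts.n: supply unique data patterns together with their counts
key <- apply(x, 1, paste, collapse = "")
pattern <- x[!duplicated(key), , drop = FALSE]        ## each unique data pattern
pattern.counts <- as.vector(table(key)[unique(key)])  ## counts, in matching order
blca.vb(pattern, 2, counts.n = pattern.counts)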

Details

The variational Bayes method approximates the posterior as a product of independent distributions. Parameters of this approximating distribution are then estimated using a variational EM algorithm. This method has a tendency to underestimate the parameters' variance; as such, the standard error and density estimates should be interpreted with caution.
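
In symbols (a sketch of the standard mean-field formulation; the notation below is not taken from the package documentation), writing theta for the item probabilities, tau for the class probabilities and Z for the class memberships, the approximating distribution factorises as

\[ p(\theta, \tau, Z \mid X) \approx q(\theta)\, q(\tau)\, q(Z), \]

where q(theta) is a product of beta densities, q(tau) a Dirichlet density and q(Z) a product of discrete distributions over the class labels, matching the beta and Dirichlet forms reported in the parameters component of the returned value.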

While it is worth running the algorithm from multiple starting points, variational algorithms have less of a tendency to converge at saddle points or sub-optimal local maxima.
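
As a brief sketch (with x as in the Examples below), several restarts can be requested directly, and the lower bound of the retained run inspected afterwards:

fit <- blca.vb(x, 2, restarts = 5)    ## best of five runs, by log-posterior, is returned
fit$LB                                ## lower bound estimate for the retained run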

Value

A list of class "blca.vb" is returned, containing:

call

The initial call passed to the function.

itemprob

The item probabilities, conditional on class membership.

classprob

The class probabilities.

itemprob.sd

Posterior standard deviation estimates of the item probabilities.

classprob.sd

Posterior standard deviation estimates of the class probabilities.

parameters

A list containing posterior parameter values for item and class probabilities, which are assumed to follow beta and Dirichlet distributions respectively.

Z

Estimate of class membership for each unique datapoint.

LB

The lower bound estimate of the log-posterior of the estimated model.

lbstore

The value of the lower bound estimate for each iteration.

iter

The number of iterations required before convergence.

eps

The amount that the lower bound increased at the final iteration of the algorithm's run.

counts

The number of times each unique data point occurred.

prior

A list containing the prior values specified for the model.
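
As an illustrative sketch, these components can be inspected directly from a fitted object such as fit <- blca.vb(x, 2) (see the Examples below):

fit$itemprob                     ## item probabilities, conditional on class membership
fit$itemprob.sd                  ## their posterior standard deviations
fit$classprob                    ## class probabilities
fit$parameters                   ## posterior beta and Dirichlet parameters
plot(fit$lbstore, type = "l")    ## lower bound estimate at each iteration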

Note

Variational Bayes approximations are known to often underestimate the standard errors of the parameters under investigation, so caution is advised when checking their values.

Earlier versions of this function erroneously referred to posterior standard deviations as standard errors. This also extended to arguments supplied to and returned by the function, some of which are now returned with the corrected suffix .sd (for standard deviation). For backwards compatibility reasons, the earlier suffix .se has been retained as a returned argument.

Author(s)

Arthur White

References

Ormerod J, Wand M (2010). “Explaining Variational Approximations.” The American Statistician, 64(2), 140-153.

See Also

blca.em, blca.gibbs

Examples

library(BayesLCA)

## Simulate binary data from a two-class model
type1 <- c(0.8, 0.8, 0.2, 0.2)
type2 <- c(0.2, 0.2, 0.8, 0.8)
x <- rlca(1000, rbind(type1, type2), c(0.6, 0.4))

## Fit a two-class model and inspect it
fit <- blca.vb(x, 2)
print(fit)
summary(fit)
par(mfrow = c(3, 2))
plot(fit)
par(mfrow = c(1, 1))

## With G = 10 and a small delta prior, some groups may be left empty
data(Alzheimer)
sj <- blca.vb(Alzheimer, 10, delta = 1/10)
sj$classprob    ## Empty groups
