lda_multinomial: LDA with multinomial entry and Stick-Breaking prior.


Description

This method implements Latent Dirichlet Allocation with a Stick-Breaking prior for multinomial data. rlda.multinomial works with a frequency data.frame.

Usage

  rlda.multinomial(data, n_community, beta, gamma,
                   n_gibbs, ll_prior = TRUE, display_progress = TRUE)

Arguments

data

An abundance data.frame where each row is a sampling unit (e.g. plots, locations, time points) and each column is a categorical type of element (e.g. species, firms, issues).

n_community

Total number of communities to return. It must be less than the number of columns in the data data.frame.

beta

Hyperparameter associated with the Dirichlet Phi matrix.

gamma

Hyperparameter associated with the Stick-Breaking prior.

n_gibbs

Total number of Gibbs Samples.

ll_prior

Logical scalar. TRUE if the log-likelihood should also include the priors, FALSE otherwise.

display_progress

Logical scalar. TRUE if the progress bar should be shown, FALSE otherwise.
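
As a minimal sketch of preparing the data argument, the block below builds an abundance data.frame from long-format records using base R only. The records data.frame and its column names (unit, type) are made up for illustration and are not part of Rlda. The choice of one beta value per column follows the package example and is assumed here to be the expected convention.

```r
# Hypothetical long-format records: one row per observed element.
records <- data.frame(
  unit = c("plot1", "plot1", "plot1", "plot2", "plot2", "plot3"),
  type = c("sp_a", "sp_a", "sp_b", "sp_b", "sp_c", "sp_a"),
  stringsAsFactors = FALSE
)

# Cross-tabulate into counts: rows are sampling units, columns are types.
counts <- table(records$unit, records$type)
abundance <- as.data.frame(unclass(counts))

# n_community must be smaller than the number of columns in the data.
n_community <- 2
stopifnot(is.data.frame(abundance), n_community < ncol(abundance))

# One beta value per column, mirroring the package example (assumed convention).
beta <- rep(1, ncol(abundance))
```

The resulting abundance object has one row per sampling unit and one column per type, which is the layout described for data above.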

Details

rlda.multinomial uses a modified Latent Dirichlet Allocation method to construct Mixed-Membership Clusters using Bayesian Inference. The data must be a non-empty data.frame with the frequencies for each variable (column) in each observation (row).

Value

An R list with three elements:

Theta

The probability that each observation (e.g. location) belongs to each cluster (e.g. community). It is a matrix with n_gibbs rows and nrow(data) * n_community columns.

Phi

The probability that each variable (e.g. species) belongs to each cluster (e.g. community). It is a matrix with n_gibbs rows and ncol(data) * n_community columns.

LogLikelihood

The vector of log-likelihoods computed for each Gibbs sample.
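
One common use of the LogLikelihood vector is to judge how many initial Gibbs samples to discard as burn-in. The sketch below uses a simulated trace, ll, as a stand-in for the LogLikelihood element of a fitted result. The burn-in fraction of one half is an arbitrary choice for illustration, not a package recommendation.

```r
set.seed(1)
n_gibbs <- 1000

# Simulated stand-in for the LogLikelihood vector: rises quickly, then plateaus.
ll <- -5000 + 4000 * (1 - exp(-(1:n_gibbs) / 50)) + rnorm(n_gibbs, sd = 5)

# Discard the first half of the chain as burn-in (arbitrary fraction).
burn_in <- floor(n_gibbs / 2)
keep <- seq(burn_in + 1, n_gibbs)

# Compare the means of the two halves of the retained samples;
# a value near zero suggests the trace has levelled off.
half <- length(keep) %/% 2
first_half <- keep[1:half]
second_half <- keep[(half + 1):length(keep)]
drift <- mean(ll[second_half]) - mean(ll[first_half])

# Visual check (uncomment in an interactive session):
# plot(ll, type = "l", xlab = "Gibbs sample", ylab = "Log-likelihood")
```

A large drift relative to the noise in the trace would suggest running more Gibbs samples or discarding a longer burn-in.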

Note

The Theta and Phi matrices for the i-th Gibbs sample can be obtained using matrix(Theta[i,], nrow = nrow(data), ncol = n_community) and matrix(Phi[i,], nrow = n_community, ncol = ncol(data)), respectively.
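
The reshaping described in this note can be sketched as follows. Because a fitted result is not available here, Theta and Phi are simulated stand-ins with the documented shapes (their entries are uniform random numbers, not valid probabilities), and n_units and n_types play the roles of nrow(data) and ncol(data). The posterior-mean step and the half-chain burn-in are illustrative assumptions, not something the package prescribes.

```r
set.seed(42)
n_units <- 4       # stands in for nrow(data)
n_types <- 5       # stands in for ncol(data)
n_community <- 3
n_gibbs <- 200

# Simulated stand-ins: one row per Gibbs sample, one column per matrix cell.
Theta <- matrix(runif(n_gibbs * n_units * n_community), nrow = n_gibbs)
Phi <- matrix(runif(n_gibbs * n_community * n_types), nrow = n_gibbs)

# Matrices for a single Gibbs sample i, reshaped as described in the note.
i <- n_gibbs
theta_i <- matrix(Theta[i, ], nrow = n_units, ncol = n_community)
phi_i <- matrix(Phi[i, ], nrow = n_community, ncol = n_types)

# Averages over the samples kept after an (arbitrary) half-chain burn-in,
# reshaped with the same layout as the single-sample matrices.
keep <- seq(n_gibbs / 2 + 1, n_gibbs)
theta_mean <- matrix(colMeans(Theta[keep, ]), nrow = n_units, ncol = n_community)
phi_mean <- matrix(colMeans(Phi[keep, ]), nrow = n_community, ncol = n_types)
```

With a real fit, Theta and Phi would be taken from the returned list instead of being simulated.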

Author(s)

References

See Also

rlda.binomial, rlda.bernoulli

Examples

## Not run: 
# Invoke the library
library(Rlda)
# Read the Complaints data
data(complaints)

# Create the abundance matrix
library(reshape2)
mat1 <- dcast(complaints[, c("Company", "Issue")],
              Company ~ Issue, fun.aggregate = length,
              value.var = "Issue")
# Create the row names
rownames(mat1) <- mat1[, 1]
# Remove the ID variable
mat1 <- mat1[, -1]

# Set seed
set.seed(9292)
# Hyperparameters for each prior distribution
beta <- rep(1, ncol(mat1))
gamma <- 0.01

# Execute the LDA for the multinomial entry
res <- rlda.multinomial(data = mat1, n_community = 30,
                        beta = beta, gamma = gamma, n_gibbs = 1000,
                        ll_prior = TRUE, display_progress = TRUE)

## End(Not run)

Rlda documentation built on May 1, 2019, 7:26 p.m.
