This method implements Latent Dirichlet Allocation with a
Stick-Breaking prior for multinomial data.
rlda.multinomial works with a frequency data.frame.
data |
An abundance data.frame where each row is a sampling unit (e.g. plots, locations, times) and each column is a categorical type of element (e.g. species, firms, issues). |
n_community |
Total number of communities to return. It must be less than
the total number of columns inside the |
beta |
Hyperparameter associated with the Dirichlet |
gamma |
Hyperparameter associated with the Stick-Breaking prior. |
n_gibbs |
Total number of Gibbs samples to draw. |
ll_prior |
boolean scalar, |
display_progress |
boolean scalar, |
rlda.multinomial uses a modified Latent Dirichlet Allocation method
to construct mixed-membership clusters using Bayesian inference.
The data must be a non-empty data.frame with the frequencies for each
variable (column) in each observation (row).
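The input layout described above can be sketched with base R alone. The records below are made up for illustration (the names records, unit, type and abundance are hypothetical, not part of the package); the point is only to show one way to arrive at a data.frame of counts with sampling units in rows and element types in columns.

```r
# Hypothetical long-format records: one row per observed element.
records <- data.frame(
  unit = c("plot1", "plot1", "plot2", "plot2", "plot2", "plot3"),
  type = c("sp_a",  "sp_b",  "sp_a",  "sp_a",  "sp_c",  "sp_b"),
  stringsAsFactors = FALSE
)

# Cross-tabulate into counts: rows are sampling units, columns are types.
abundance <- as.data.frame.matrix(table(records$unit, records$type))

# Basic checks matching the stated requirements: a non-empty data.frame
# of non-negative frequencies.
stopifnot(
  is.data.frame(abundance),
  nrow(abundance) > 0,
  ncol(abundance) > 0,
  all(as.matrix(abundance) >= 0)
)

print(abundance)
```

The resulting object has the same shape as the mat1 object built with reshape2 in the package example, so either route should be usable as the data argument.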
An R list with three elements:
Theta |
The probability that each observation
(e.g. location) belongs to each cluster (e.g. community). It is a matrix
with dimension equal |
Phi |
The probability that each variable
(e.g. species) belongs to each cluster (e.g. community). It is a matrix
with dimension equal |
LogLikelihood |
The vector of log-likelihoods computed for each Gibbs sample. |
The Theta and Phi matrices for the i-th Gibbs sample can be obtained
using matrix(Theta[i,], nrow = nrow(data), ncol = n_community)
and matrix(Phi[i,], nrow = n_community, ncol = ncol(data)),
respectively.
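The reshaping described in this note can be tried without running the sampler. The sketch below uses a random stand-in for Theta and assumes, as the note implies, that each row holds one Gibbs sample flattened into nrow(data) * n_community values. The sizes, the burn-in cut-off and the averaging step are illustrative choices, not something the package prescribes.

```r
set.seed(1)

# Hypothetical sizes; in practice these come from the data and the call.
n_obs       <- 4   # plays the role of nrow(data)
n_community <- 3
n_gibbs     <- 10

# Stand-in for res$Theta: random numbers, NOT real sampler output.
Theta <- matrix(runif(n_gibbs * n_obs * n_community), nrow = n_gibbs)

# Recover the observation-by-community matrix for the i-th sample,
# exactly as written in the note.
i <- n_gibbs
theta_i <- matrix(Theta[i, ], nrow = n_obs, ncol = n_community)

# One possible summary: average the retained samples column-wise, then
# reshape. This assumes community labels are stable across samples.
keep <- 6:n_gibbs  # hypothetical burn-in of 5 samples
theta_mean <- matrix(colMeans(Theta[keep, , drop = FALSE]),
                     nrow = n_obs, ncol = n_community)

print(dim(theta_i))
print(dim(theta_mean))
```

The same pattern applies to Phi with nrow = n_community and ncol = ncol(data).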
Pedro Albuquerque.
pedroa@unb.br
http://pedrounb.blogspot.com/
Denis Valle.
drvalle@ufl.edu
http://denisvalle.weebly.com/
Daijiang Li.
daijianglee@gmail.com
Blei, David M., Andrew Y. Ng, and Michael I. Jordan.
"Latent Dirichlet allocation." Journal of Machine Learning Research
3.Jan (2003): 993-1022.
http://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf
Valle, Denis, et al.
"Decomposing biodiversity data using the Latent Dirichlet
Allocation model, a probabilistic multivariate statistical
method." Ecology Letters 17.12 (2014): 1591-1601.
## Not run:
# Invoke the library
library(Rlda)
# Read the Complaints data
data(complaints)
# Create the abundance matrix
library(reshape2)
mat1 <- dcast(complaints[, c("Company", "Issue")],
              Company ~ Issue, fun.aggregate = length,
              value.var = "Issue")
# Use the company names as row names
rownames(mat1) <- mat1[, 1]
# Remove the ID variable
mat1 <- mat1[, -1]
# Set seed
set.seed(9292)
# Hyperparameters for each prior distribution
beta <- rep(1, ncol(mat1))
gamma <- 0.01
# Execute the LDA for the multinomial data
res <- rlda.multinomial(data = mat1, n_community = 30,
                        beta = beta, gamma = gamma, n_gibbs = 1000,
                        ll_prior = TRUE, display_progress = TRUE)
## End(Not run)
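After a fit like the one above, the LogLikelihood element gives one value per Gibbs sample and can serve as a rough convergence check. The sketch below uses a simulated curve in place of res$LogLikelihood so that it runs on its own. The shape of the curve and the quarter-by-quarter comparison are assumptions made for illustration, not a diagnostic provided by the package.

```r
set.seed(2)
n_gibbs <- 200

# Stand-in for res$LogLikelihood: a rising curve plus noise,
# NOT real sampler output.
loglik <- -500 + 300 * (1 - exp(-(1:n_gibbs) / 20)) + rnorm(n_gibbs, sd = 2)

# Compare the mean log-likelihood in the third and fourth quarters of
# the chain; a small difference suggests the trace has flattened out.
q3 <- loglik[101:150]
q4 <- loglik[151:200]
drift <- abs(mean(q4) - mean(q3))

print(drift)
```

A large drift would suggest running more samples or discarding a longer initial stretch before summarising Theta and Phi.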