This method implements Latent Dirichlet Allocation with a
stick-breaking prior for binomial data.
rlda.binomial takes a frequency data.frame together with a
population data.frame.
rlda.binomial(data, pop, n_community, alpha0, alpha1, gamma,
  n_gibbs, ll_prior = TRUE, display_progress = TRUE)
data |
An abundance data.frame where each row is a sampling unit (e.g. plots, locations, times) and each column is a categorical type of element (e.g. species, firms, issues). |
pop |
A population data.frame where each row is a sampling unit
(e.g. plots, locations, times) and each column is a categorical
type of element (e.g. species, firms, issues). The elements inside
this data.frame must all be greater than the elements inside the |
n_community |
Total number of communities to return. It must be less than
the total number of columns inside the |
alpha0 |
Hyperparameter associated with the Beta prior Beta(alpha0, alpha1). |
alpha1 |
Hyperparameter associated with the Beta prior Beta(alpha0, alpha1). |
gamma |
Hyperparameter associated with the Stick-Breaking prior. |
n_gibbs |
Total number of Gibbs Samples. |
ll_prior |
boolean scalar, |
display_progress |
boolean scalar, |
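To give a feel for what the gamma hyperparameter controls, here is a minimal sketch of a truncated stick-breaking construction in base R. It assumes the common form in which each break is drawn from Beta(1, gamma) and the last break is fixed at 1; the package's internal parameterization may differ, and the function name stick_breaking is hypothetical.

```r
# Truncated stick-breaking sketch (illustrative; not the package's internal code).
# Assumption: V_k ~ Beta(1, gamma) for k < K and V_K = 1, so the weights sum to 1.
stick_breaking <- function(n_community, gamma) {
  v <- c(rbeta(n_community - 1, shape1 = 1, shape2 = gamma), 1)
  # Stick length remaining before each break: 1, (1 - v1), (1 - v1)(1 - v2), ...
  remaining <- c(1, cumprod(1 - v[-n_community]))
  v * remaining
}

set.seed(1)
theta_small <- stick_breaking(n_community = 10, gamma = 0.01)
theta_large <- stick_breaking(n_community = 10, gamma = 10)
round(theta_small, 3)
round(theta_large, 3)
```

Under this assumed form, a small gamma tends to put most of the weight on the first few communities, while a larger gamma spreads it more evenly.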
rlda.binomial uses a modified Latent Dirichlet Allocation method
to construct mixed-membership clusters using Bayesian inference.
The data argument must be a non-empty data.frame holding the
frequency of each variable (column) in each observation (row).
The pop argument must be a non-empty data.frame with the same
layout, whose entries are greater than the corresponding entries
of the data data.frame.
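The input requirements above can be checked before fitting. The helper below is a hypothetical sketch in base R, not a function provided by Rlda. It assumes that data and pop must have identical dimensions, and it uses >= on the assumption that a count equal to the population size is acceptable; replace it with > for the strict reading of "greater than".

```r
# Hypothetical helper (not part of Rlda): validate inputs before fitting.
check_binomial_inputs <- function(data, pop) {
  stopifnot(is.data.frame(data), is.data.frame(pop))
  stopifnot(nrow(data) > 0, ncol(data) > 0)
  # Assumption: both data.frames share the same rows and columns.
  stopifnot(identical(dim(data), dim(pop)))
  counts <- as.matrix(data)
  sizes <- as.matrix(pop)
  stopifnot(is.numeric(counts), is.numeric(sizes))
  stopifnot(all(counts >= 0))
  # Assumption: equality is allowed; use `>` for the strict reading.
  stopifnot(all(sizes >= counts))
  invisible(TRUE)
}

toy_data <- data.frame(sp1 = c(3, 0, 7), sp2 = c(1, 5, 2))
toy_pop <- data.frame(sp1 = c(10, 10, 10), sp2 = c(10, 10, 10))
ok <- check_binomial_inputs(toy_data, toy_pop)
```

The call stops with an error as soon as one requirement fails, so it can be placed directly before the call to rlda.binomial.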
An R list with three elements:
Theta |
The probability that each observation
(e.g. a location) belongs to each cluster (e.g. a community). It is a matrix
with dimension equal |
Phi |
The probability that each variable
(e.g. a species) belongs to each cluster (e.g. a community). It is a matrix
with dimension equal |
LogLikelihood |
The vector of log-likelihoods computed for each Gibbs sample. |
The Theta and Phi matrices for the i-th Gibbs sample can be obtained with
matrix(Theta[i,], nrow = nrow(data), ncol = n_community)
and
matrix(Phi[i,], nrow = n_community, ncol = ncol(data)),
respectively.
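The reshaping described in this note can be tried without fitting a model. The sketch below builds a stand-in result list filled with random numbers, assuming (as the Theta[i,] indexing suggests) that Theta and Phi store one Gibbs sample per row. The objects res and toy_data are illustrative only and contain no real model output.

```r
# Stand-in for a fitted result; the values are random, not real output.
n_obs <- 4; n_var <- 3; n_community <- 2; n_gibbs <- 5
toy_data <- as.data.frame(matrix(0, nrow = n_obs, ncol = n_var))
set.seed(42)
res <- list(
  Theta = matrix(runif(n_gibbs * n_obs * n_community), nrow = n_gibbs),
  Phi   = matrix(runif(n_gibbs * n_community * n_var), nrow = n_gibbs)
)

i <- n_gibbs  # pick the last Gibbs sample
theta_i <- matrix(res$Theta[i, ], nrow = nrow(toy_data), ncol = n_community)
phi_i   <- matrix(res$Phi[i, ], nrow = n_community, ncol = ncol(toy_data))
dim(theta_i)  # observations x communities
dim(phi_i)    # communities x variables
```

With a real fit, replace toy_data with the data.frame passed as data and res with the list returned by rlda.binomial.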
Pedro Albuquerque.
pedroa@unb.br
http://pedrounb.blogspot.com/
Denis Valle.
drvalle@ufl.edu
http://denisvalle.weebly.com/
Daijiang Li.
daijianglee@gmail.com
http://daijiang.name
Blei, David M., Andrew Y. Ng, and Michael I. Jordan.
"Latent Dirichlet allocation." Journal of Machine Learning Research
3.Jan (2003): 993-1022.
http://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf
Valle, Denis, et al.
"Decomposing biodiversity data using the Latent Dirichlet
Allocation model, a probabilistic multivariate statistical
method." Ecology Letters 17.12 (2014): 1591-1601.
rlda.multinomial, rlda.bernoulli
## Not run:
library(Rlda)
# Read the SP500 data
data(sp500)
# Create size
spSize <- as.data.frame(matrix(100,
                               ncol = ncol(sp500),
                               nrow = nrow(sp500)))
# Set seed
set.seed(5874)
# Hyperparameters for each prior distribution
gamma <- 0.01
alpha0 <- 0.01
alpha1 <- 0.01
# Execute the LDA for the Binomial entry
res <- rlda.binomial(data = sp500, pop = spSize, n_community = 10,
                     alpha0 = alpha0, alpha1 = alpha1, gamma = gamma,
                     n_gibbs = 500, ll_prior = TRUE, display_progress = TRUE)
## End(Not run)