Description Usage Arguments Details Value
Use a Dirichlet Mixture Model on data to get cluster labels and cluster parameter values.
## S3 method for class 'RModel'
dmm.cluster(model, Xdata, alpha = 1, m_prior = 3,
  m_post = 3, iters = 5000, burnin = 200, shuffled = TRUE)
model |
An object returned by |
Xdata |
A 1D array of length N (univariate case) or a 2D array of size N-by-d (multivariate case), where d is the dimensionality of the data and N is the number of observations. |
alpha |
A float. The concentration parameter. Default is 1.0. |
m_prior |
An integer. Optional parameter used only in the non-conjugate case. Default is 3. |
m_post |
An integer. Optional parameter used only in the non-conjugate case. Default is 3. |
iters |
An integer. Number of iterations. Default is 5000. |
burnin |
An integer. Amount of burn-in. Default is 200. |
shuffled |
A logical. Whether or not to shuffle the data. Default is TRUE. |
Performs iters iterations of Algorithm 2 (in the conjugate case) or Algorithm 8 (in the non-conjugate case) from Neal (2000) to generate possible clusters for the data in Xdata, using the model in model, with concentration parameter alpha. In the 1D case, Xdata is assumed to be a 1D array of floats. In the 2D case, Xdata is assumed to be a d-by-N array of floats, where the data is d-dimensional and N is the number of data points.
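The two data layouts mentioned on this page can be sketched in base R as follows. This is only an illustration of the shapes involved: the sizes and values are arbitrary, the variable names are made up for the example, and which orientation dmm.cluster actually expects in the multivariate case (N-by-d per the Arguments entry, d-by-N per the paragraph above) should be checked against the package itself.

```r
# Illustrative data only; sizes, values, and variable names are arbitrary.
set.seed(1)
N <- 100  # number of observations
d <- 2    # dimensionality of each observation

# Univariate case: a plain numeric vector of length N.
x_uni <- rnorm(N)

# Multivariate case with one observation per row (N-by-d).
x_rows <- matrix(rnorm(N * d), nrow = N, ncol = d)

# The same data with one observation per column (d-by-N).
x_cols <- t(x_rows)

# Inspect the shapes of the two multivariate layouts.
dim(x_rows)
dim(x_cols)
```

Transposing with t() converts between the two layouts without changing the data, so either form can be produced from the other.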
Returns a list of states. The list contains every state sampled after the burn-in period (burnin, 200 by default). By default, this list is shuffled so that it can be used to approximate i.i.d. draws from the posterior.
A single state from the returned list of states has fields data and clusters. data is a data frame consisting of the Xdata points and their cluster labels. clusters is a data.table (if the user has the data.table package loaded) or a list.
If clusters is a data.table, each row refers to a cluster. The columns are the cluster label, the population, and one column per parameter.
If clusters is a list, each element refers to a cluster: clusters[[i]] is a list containing the above information for cluster i, with fields cluster, population, and params. E.g. clusters[[1]]$population is the population of cluster 1. The params field (clusters[[i]]$params) is itself a list of the cluster's parameters.
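As a rough illustration of the list form described above, the following builds a single state by hand and reads its fields. The state here is a mock, not output from dmm.cluster; the data column names (x, cluster) and the parameter names (mu, sigma) are assumptions chosen for the example, and real states may differ.

```r
# Hand-built mock of one state; NOT produced by dmm.cluster().
# Column names x/cluster and parameter names mu/sigma are hypothetical.
state <- list(
  data = data.frame(x = c(-2.1, -1.9, 3.0, 3.2, 2.8),
                    cluster = c(1, 1, 2, 2, 2)),
  clusters = list(
    list(cluster = 1, population = 2, params = list(mu = -2.0, sigma = 0.5)),
    list(cluster = 2, population = 3, params = list(mu = 3.0, sigma = 0.4))
  )
)

# Population of cluster 1.
state$clusters[[1]]$population

# Parameters of cluster 2, as a named list.
state$clusters[[2]]$params

# Populations of all clusters in this state.
pops <- vapply(state$clusters, function(cl) cl$population, numeric(1))
```

If clusters is a data.table instead, the same information would be read from rows and columns rather than from nested list elements.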
To see a formatted summary of all the clusters in a given state, use the dmm.summarize(clusters) function.
To see a plot of the labeled data in a given state, use the dmm.plot(data) function.
A list of states (i.e. state = states[[i]]). A state is itself a list with two fields: data and clusters.
data is a data.frame of the Xdata data points and their cluster labels.
clusters is either a list or a data.table (if the data.table package is loaded by the user). It contains (1) the cluster labels, (2) the number of data points (i.e. population) in each cluster, and (3) all of the parameters for each cluster.
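Because the return value is a list of states, quantities of interest can be summarized across states. The sketch below tabulates the number of clusters per state. It uses mock states built by a hypothetical helper, make_state, rather than real dmm.cluster output, and it assumes the list form of clusters.

```r
# Mock states standing in for dmm.cluster() output; all values are made up.
# make_state() is a hypothetical helper defined only for this example.
make_state <- function(pops) {
  list(
    data = data.frame(x = seq_len(sum(pops)),
                      cluster = rep(seq_along(pops), times = pops)),
    clusters = lapply(seq_along(pops), function(i) {
      list(cluster = i, population = pops[i], params = list())
    })
  )
}
states <- list(make_state(c(3, 2)), make_state(c(2, 2, 1)), make_state(c(4, 1)))

# Number of clusters in each state (valid for the list form of clusters).
n_clusters <- vapply(states, function(s) length(s$clusters), integer(1))

# Relative frequency of each cluster count across the sampled states.
prop.table(table(n_clusters))
```

For the data.table form of clusters, nrow(s$clusters) would take the place of length(s$clusters), since each row refers to one cluster.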