clusterCVR | R Documentation |
Compute cluster assignment probabilities by an EM algorithm. The required inputs are a numeric data matrix and the number of clusters.
clusterCVR(
data,
user_K = 3,
loglik_thresh = 1e-05,
runs = 1,
n_iter = Inf,
fast = FALSE,
IIA = FALSE,
init = "kmeans",
subset = NULL,
ignore_X = FALSE,
recode_key = NULL,
seed = 2138,
verbose = TRUE,
pi = NULL,
mu = NULL,
zeta_hat = NULL
)
.cluster(data, user_K, seed, n_iter, loglik_thresh, fast, IIA, init, verbose)
data |
the dataset, in list form, with the following slots.
|
user_K |
the number of clusters to presume / compute |
loglik_thresh |
the threshold value for convergence. The EM will stop when the relative change in log likelihood is less than the threshold. |
runs |
Number of replications to run, each with different starting values. Default is 1, but more than 1 is highly recommended if computing time is not prohibitive. |
n_iter |
manual limit to iterations |
fast |
Summarize data to unique profiles so that estimation is faster? Currently
only possible if IIA = FALSE. Defaults to FALSE. |
IIA |
Assume that the data$y matrix is generated from a varying choice set
as defined by data$m? Defaults to FALSE. |
init |
method of initialization |
subset |
A vector of row indices or row names to subset all the data by.
Useful when wanting to test a small subset of the data without modifying the
|
ignore_X |
Should X be set to NULL even if it is provided? Useful when
switching between the covariates and non-covariates cases. Defaults to FALSE. |
recode_key |
A named vector to be passed on to |
seed |
seed for initialization |
verbose |
Defaults to TRUE. |
pi , mu , zeta_hat |
initial values of the key parameters, if there
are any good guesses. If left |
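The stopping rule described under loglik_thresh can be illustrated in base R. This is a minimal sketch, not the package's internal code: the helper name has_converged is hypothetical, and the exact form of the relative change (absolute difference divided by the absolute previous value) is an assumption.

```r
# Hypothetical helper, not part of the package.
# Assumes "relative change" means |new - old| / |old|.
has_converged <- function(loglik_prev, loglik_curr, thresh = 1e-05) {
  rel_change <- abs(loglik_curr - loglik_prev) / abs(loglik_prev)
  rel_change < thresh
}

# A change of 0.001 on a log likelihood of -1000 is a relative change of 1e-06
small_step <- has_converged(-1000, -999.999)
# A change of 10 on the same scale is a relative change of 0.01
large_step <- has_converged(-1000, -990)
```

Under these assumptions small_step is TRUE and large_step is FALSE, so the EM loop would stop after the first step but continue after the second.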
See fmt_mu_viz for a quick way to visualize the output.
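The idea behind the fast argument, summarizing the data to unique profiles, can be sketched in base R. The toy matrix and the variable names below are made up for illustration, and how the package actually collapses and weights profiles is an assumption here.

```r
# Toy response matrix: rows are respondents, columns are items (hypothetical data).
y <- matrix(c(1, 0, 1,
              1, 0, 1,
              0, 1, 1,
              1, 0, 1), ncol = 3, byrow = TRUE)

# Build one string key per row so identical response profiles share a key
key <- apply(y, 1, paste, collapse = "-")

# Keep one row per distinct profile, in order of first appearance
profiles <- y[!duplicated(key), , drop = FALSE]

# Count how many respondents share each profile; an estimator could use
# these counts as weights instead of looping over every respondent
counts <- as.vector(table(factor(key, levels = unique(key))))
```

Here four respondents reduce to two profiles, with counts recording how many rows each profile stands for.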
The last iteration
Stored iterations
A list of stored items not specific to iterations. These include the initial values, parameters, total time data, and settings.
A vector of length runs containing the seeds that were used.
A vector of length runs containing the final log-likelihood estimates, one for each run of the model. Only the model with the highest log likelihood is stored.
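The rule that only the run with the highest log likelihood is kept can be sketched as follows. The seed and log-likelihood values are invented for illustration, and the variable names are hypothetical rather than the names of the returned slots.

```r
# Hypothetical per-run results, for illustration only
seeds   <- c(2138, 2139, 2140)
logliks <- c(-1520.4, -1498.7, -1503.2)

# Index of the run with the highest final log likelihood
best_run <- which.max(logliks)

# Seed that would reproduce the retained model
best_seed <- seeds[best_run]
```

With these values the second run is retained, since -1498.7 is the largest of the three log likelihoods.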
em_full <- clusterCVR(simdata_full, init = "kmeans", runs = 2)
summary(em_full)
## Not run:
pars <- summ_params(em_full)
graph_trend(pars, simdata_full)
## End(Not run)
em_miss <- clusterCVR(simdata_miss, IIA = TRUE)