| LNM.clust | R Documentation | 
This function is part of the main clustering function for the proposed algorithm. It fits the proposed LNM_MM model for a specific number of components, G_act. To fit a range of values such as G = 1:5, use the parallelized function or call this function in a loop over G.
LNM.clust(
  data,
  run,
  G_act,
  initial = "kmeans",
  runtime = TRUE,
  threshold,
  verb = FALSE,
  maxiter = NA,
  nrep = NA,
  niter = NA,
  sim = FALSE
)
| data | Input data. If sim == TRUE, data should be a list of multiple datasets (indexed by "run"), where each dataset is a list containing the counts W and true_lab (the true class labels). If the true labels are unknown, set true_lab to NA. If sim == FALSE, data should be a single dataset in the same format as one dataset of the simulation list. | 
| run | Run number, used to keep track of datasets. For simulations this can be the index of the simulated dataset; otherwise, the function can be run several times with random initialization and the run with the highest BIC/ICL picked. To run once on a single dataset, specify run = 1. | 
| G_act | The number of components to fit in the current run. | 
| initial | Method for initializing z_ig. Possible values are "kmeans", "random", and "small_EM". Default is "kmeans". | 
| runtime | Logical; whether to output the running time of the whole procedure. | 
| threshold | Threshold for Aitken's stopping criterion for convergence. | 
| verb | Logical; whether to print the key steps of the algorithm and the approximated log-likelihood at each iteration. | 
| maxiter | Maximum number of iterations. If specified, the algorithm stops when either the convergence threshold is met or maxiter is reached. If not specified, the algorithm is monitored by the convergence criterion only. | 
| nrep | Default is NA. Only needed if "small_EM" is specified for initial. Number of random starts for the small EM initialization. | 
| niter | Default is NA. Only needed if "small_EM" is specified for initial. Number of iterations for each random start in the small EM initialization. | 
| sim | Logical; whether the input is simulated data. Simulated data must be a list of multiple datasets (indexed by "run"), where each dataset is a list of W and true_lab. Default is FALSE. | 
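The threshold argument refers to Aitken's stopping criterion. As a rough illustration only, the sketch below shows one common form of that rule, computed from a vector of log-likelihood values. The function name aitken_converged and its arguments are hypothetical, and the check used inside LNM.clust may differ in its details.

```r
# Sketch of an Aitken-acceleration stopping rule (assumed form; not taken
# from the package source). `loglik` holds one log-likelihood per iteration.
aitken_converged <- function(loglik, threshold = 1e-4) {
  k <- length(loglik)
  if (k < 3) return(FALSE)                  # three values are needed for the ratio
  d_new <- loglik[k] - loglik[k - 1]        # most recent increase
  d_old <- loglik[k - 1] - loglik[k - 2]    # previous increase
  if (d_old == 0) return(d_new == 0)        # flat sequence: converged only if still flat
  a <- d_new / d_old                        # Aitken acceleration factor
  if (a == 1) return(FALSE)                 # limit estimate undefined
  l_inf <- loglik[k - 1] + d_new / (1 - a)  # estimated limiting log-likelihood
  gap <- l_inf - loglik[k - 1]
  is.finite(gap) && gap >= 0 && gap < threshold
}

# A sequence whose increments halve at every step
ll <- -10 - 0.5^(1:30)
aitken_converged(ll, threshold = 1e-4)
```

A smaller threshold demands that the estimated limit be closer to the current log-likelihood before stopping, which usually means more iterations.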
A list containing the parameter estimates at convergence: pi_g = estimated class size composition; z = soft class membership posterior probability; mu = estimated mean parameter for the latent variable; Sigma = estimated variance parameter for the latent variable. The remaining elements are internal parameters that can be used to check model fit and are passed to the overall algorithm for model selection.
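Since z holds soft memberships, hard cluster labels are often obtained by assigning each observation to its most probable component. The sketch below assumes z is an n x G matrix with one row per observation; the matrix shown is a made-up stand-in, not output from LNM.clust.

```r
# Hypothetical stand-in for the `z` element of the returned list: assumed to
# be an n x G matrix of posterior membership probabilities (rows sum to 1).
z <- rbind(c(0.90, 0.10),
           c(0.20, 0.80),
           c(0.55, 0.45))

# Maximum a posteriori (hard) cluster label for each observation
hard_lab <- apply(z, 1, which.max)

# Cluster sizes implied by the hard labels
table(hard_lab)
```

If true labels are available, table(hard_lab, true_lab) gives a cross-tabulation for checking agreement.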
# generate data using generate_data()
Data.temp <- generate_data(G = 2, num_observation = c(50, 50), K = 2,
  true_mu = list(c(0, 1, 0), c(-2, -5, 0)),
  true_Sig = list(rbind(cbind(diag(1, 2), 0), 0), rbind(cbind(diag(1, 2), 0), 0)),
  seed.no = 1234, M = 10000, truelab = TRUE)
# fit a two-component model with small-EM initialization
LNM.clust(data = Data.temp, run = 1, G_act = 2, initial = "small_EM", runtime = TRUE,
  threshold = 1e-4, verb = TRUE, nrep = 30, niter = 50, sim = FALSE)
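The description suggests calling the function in a loop when several values of G are of interest. A minimal sketch of such a loop follows; fit_over_G is a hypothetical helper, not part of the package, and it takes the fitting function as an argument so that it does not depend on the package being loaded.

```r
# Sketch: fit one model per candidate number of components.
# `fit_fun` is expected to be LNM.clust (or any function with the same
# arguments); further arguments such as threshold are passed through `...`.
fit_over_G <- function(fit_fun, data, G_range = 1:5, ...) {
  fits <- lapply(G_range, function(g) {
    fit_fun(data = data, run = 1, G_act = g, ...)
  })
  names(fits) <- paste0("G", G_range)   # label each fit by its G
  fits
}

# With the package loaded and Data.temp generated as in the example,
# a call might look like this (left commented out here):
# fits <- fit_over_G(LNM.clust, Data.temp, G_range = 1:5,
#                    initial = "kmeans", threshold = 1e-4)
```

The result is a named list with one fitted object per value of G, which can then be compared with whatever model selection criterion is preferred.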