View source: R/multinomialLogitMix.R
| multinomialLogitMix | R Documentation | 
The main function of the package.
multinomialLogitMix(response, design_matrix, method, 
	Kmax = 10, mcmc_parameters = NULL, em_parameters = NULL, 
	nCores, splitSmallEM = TRUE)
| response | matrix of counts. | 
| design_matrix | design matrix (including constant term). | 
| method | character with two possible values: "EM" or "MCMC" indicating the desired method in order to estimate the model. | 
| Kmax | number of components of the (overfitting) mixture model. | 
| nCores | Total number of CPU cores for parallel processing. | 
| mcmc_parameters | List with the parameter set-up of the MCMC sampler. See details for changing the defaults. | 
| em_parameters | List with the parameter set-up of the EM algorithm. See details for changing the defaults. | 
| splitSmallEM | Boolean value, indicating whether the split-small EM scheme should be used to initialize the  | 
The details of the parameter setup of the EM algorithm and MCMC sampler. The following specification correspond to the minimal default settings. Larger values of tsplit will result to better performance. 
em_parameters <- list(maxIter = 100, emthreshold = 1e-08, maxNR = 10, tsplit = 16, msplit = 10, split = TRUE, R0 = 0.1, plotting = TRUE)
mcmc_parameters <- list(tau = 0.00035, nu2 = 100, mcmc_cycles = 2600, iter_per_cycle = 20, nChains = 8, dirPriorAlphas = c(1, 1 + 5 * exp((seq(2, 14, length = nChains - 1)))/100)/(200), warm_up = 48000, checkAR = 500, probsSave = FALSE, showGraph = 100, ar_low = 0.15, ar_up = 0.25, burn = 100, thin = 1, withRandom = TRUE)
| EM | List with the results of the EM algorithm. | 
| MCMC_raw | List with the raw output of the MCMC sampler - not identifiable MCMC output. | 
| MCMC_post_processed | Post-processed MCMC, used for the inference. | 
Panagiotis Papastamoulis
Papastamoulis, P. Model based clustering of multinomial count data. Advances in Data Analysis and Classification (2023). https://doi.org/10.1007/s11634-023-00547-5
#	Generate synthetic data
	K <- 2	#number of clusters
	p <- 2	#number of covariates (constant incl)
	D <- 5	#number of categories
	n <- 20 #generated number of observations
	set.seed(1)
	simData <- simulate_multinomial_data(K = K, p = p, D = D, n = n, size = 20, prob = 0.025)   
	# EM parameters
em_parameters <- list(maxIter = 100, emthreshold = 1e-08, 
    maxNR = 10, tsplit = 16, msplit = 10, split = TRUE, 
    R0 = 0.1, plotting = TRUE)
	#  MCMC parameters - just for illustration
	#	typically, set `mcmc_cycles` and `warm_up`to a larger values
	#	such as` mcmc_cycles = 2500` or more 
	#	and `warm_up = 40000` or more.
	nChains <- 2 #(set this to a larger value, such as 8 or more)
	mcmc_parameters <- list(tau = 0.00035, nu2 = 100, mcmc_cycles = 260, 
	    iter_per_cycle = 20, nChains = nChains, dirPriorAlphas = c(1, 
		1 + 5 * exp((seq(2, 14, length = nChains - 1)))/100)/(200), 
	    warm_up = 4800, checkAR = 500, probsSave = FALSE, 
	    showGraph = 100, ar_low = 0.15, ar_up = 0.25, burn = 100, 
	    thin = 1, withRandom = TRUE)
	# run EM with split-small-EM initialization, and then use the output to 
	#	initialize MCMC algorithm for an overfitting mixture with 
	#	Kmax = 5 components (max number of clusters - usually this is 
	#	set to a larger value, e.g. 10 or 20).
	#	Note: 
	#		1. the MCMC output is based on the non-empty components
	#		2. the EM algorithm clustering corresponds to the selected 
	#			number of clusters according to ICL.
	#		3. `nCores` should by adjusted according to your available cores.
	
	mlm <- multinomialLogitMix(response = simData$count_data, 
		design_matrix = simData$design_matrix, method = "MCMC", 
             Kmax = 5, nCores = 2, splitSmallEM = TRUE, 
             mcmc_parameters = mcmc_parameters, em_parameters = em_parameters)
	# retrieve clustering according to EM
	mlm$EM$estimated_clustering
	# retrieve clustering according to MCMC
	mlm$MCMC_post_processed$cluster	
	
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.