MedSTC is a novel classification algorithm by Prof. Jun Zhu (http://www.mlthu.net/~jun/).
documents 
A list whose length is equal to the number of documents, D. Each element of documents is an integer matrix with two rows. Each column of documents[[i]] (i.e., document i) represents a word occurring in the document. documents[[i]][1, j] is a 0-indexed word identifier for the jth word in document i. documents[[i]][2, j] is an integer specifying the number of times that word appears in the document. 
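As a rough illustration of this format, the sketch below builds a two-document list from tokenized text. The vocabulary and the helper tokens_to_doc are hypothetical names introduced here for illustration only; they are not part of the package.

```r
# Hypothetical toy vocabulary (an assumption, not shipped with the package).
vocab <- c("topic", "model", "sparse", "coding")

# Hypothetical helper: convert a character vector of tokens into a 2-row
# integer matrix whose columns are (0-indexed word id, count) pairs.
tokens_to_doc <- function(tokens, vocab) {
  ids <- match(tokens, vocab) - 1L   # 0-indexed word identifiers
  ids <- ids[!is.na(ids)]            # drop out-of-vocabulary tokens
  counts <- table(ids)               # occurrences per distinct word id
  rbind(as.integer(names(counts)),   # row 1: word identifiers
        as.integer(counts))          # row 2: word counts
}

documents <- list(
  tokens_to_doc(c("sparse", "coding", "sparse"), vocab),
  tokens_to_doc(c("topic", "model", "topic", "topic"), vocab)
)

# Each element is an integer matrix with two rows and one column per
# distinct word in that document.
str(documents)
```

Each distinct word gets a single column, so a word that appears several times is recorded once with its count in the second row.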
mlabels 
The training labels for the documents. 
ntopics 
Number of topics to be used in modeling the corpus. 
initial_c, lambda, rho 
These are positive-valued regularization constants. Default values are initial_c=0.5, lambda=0.1, rho=0.01. 
delta_ell 
The parameter for the SVM cost function, i.e., the 0/delta_ell loss. Only positive values are allowed. Default value is 3600. 
supervised 
If the value is TRUE, the model is a supervised MedSTC; if FALSE, the model is the unsupervised STC. 
primal_svm 
Only takes effect when "supervised" is TRUE. If the value is 1, the loss-augmented prediction (i.e., subgradient) is used to update document codes; otherwise, the gradient with Lagrangian multipliers is used to update document codes. 
var_max_iter 
The maximum number of iterations of coordinate descent for a single document. 
convergence 
The convergence criterion for coordinate descent. Stop if (objective_old - objective) / abs(objective_old) is less than this value (or after the maximum number of iterations). Note that "objective" is the objective value for a single document. 
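The stopping rule described above can be sketched in a few lines of R. This is only an illustration of the relative-improvement test, not the package's internal code: has_converged, run_until_converged, and the toy step function are hypothetical names, and step merely stands in for one coordinate-descent pass.

```r
# Hypothetical helper: TRUE when the relative improvement of the objective
# falls below the `convergence` threshold.
has_converged <- function(objective_old, objective, convergence) {
  (objective_old - objective) / abs(objective_old) < convergence
}

# Hypothetical driver: repeat `step` until the test passes or
# `var_max_iter` iterations have been used.
run_until_converged <- function(step, objective0, convergence, var_max_iter) {
  objective_old <- objective0
  objective <- objective0
  iter <- 0L
  for (i in seq_len(var_max_iter)) {
    iter <- i
    objective <- step(objective_old)   # stand-in for one descent pass
    if (has_converged(objective_old, objective, convergence)) break
    objective_old <- objective
  }
  list(objective = objective, iterations = iter)
}

# Toy objective that halves its distance to 1 on every pass.
toy_step <- function(obj) 1 + (obj - 1) / 2

result <- run_until_converged(toy_step, objective0 = 2,
                              convergence = 0.01, var_max_iter = 100)
print(result)
```

Because the test is relative to abs(objective_old), the same threshold behaves consistently whether the objective is large or small in magnitude.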
em_max_iter 
The maximum number of iterations of hierarchical sparse coding, dictionary learning, and SVM training (for supervised MedSTC). 
em_convergence 
The convergence criterion for the outer loop bounded by em_max_iter. Stop if (objective_old - objective) / abs(objective_old) is less than this value (or after the maximum number of iterations). Note that "objective" is the objective value for the whole corpus. 
svm_alg_type 
If set to 0, the n-slack multiclass SVM is used. If set to 2, the 1-slack multiclass SVM is used. In our testing, the 1-slack SVM is faster. 
output_dir 
A directory for writing intermediate results. The directory is removed after the calculation is done, but it is needed during the run. 
model 
A model object of the medSTC class, which has a state list with five elements. The first two list elements store the model parameter state after the model has completed training. The third list element is LogProbabilityOfWordsForTopics, which can be used for word assignments to topics. The fourth and fifth list elements are Eta and Mu (refer to the paper). The model also stores the original parameter values. 
Jun Zhu, Aykut Firat
Jun Zhu and Eric P. Xing. Sparse Topical Coding. In Proc. of the 27th Conference on Uncertainty in Artificial Intelligence (UAI), Barcelona, Spain, 2011.
## Not run: demo(medSTC)
