This function implements variable selection in model-based clustering using a lasso ranking of the variables, as described in Sedki et al. (2014). The variable ranking step uses the penalized EM algorithm of Zhou et al. (2009).
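For orientation, the penalized criterion behind the ranking step can be sketched as follows. This is a paraphrase of the general form used in Zhou et al. (2009), not a transcription of it: the symbols n, K, p, \pi_k, \mu_k, \Sigma_k and \phi (the Gaussian density) are notation assumed here, and \lambda and \rho are taken to play the roles of the lambda and rho arguments. Whether the diagonal entries of the precision matrices are penalized is not stated on this page.

```latex
\ell_{\lambda,\rho}(\theta)
  = \sum_{i=1}^{n} \log\!\Bigl( \sum_{k=1}^{K} \pi_k \, \phi(x_i \mid \mu_k, \Sigma_k) \Bigr)
  - \lambda \sum_{k=1}^{K} \sum_{j=1}^{p} \lvert \mu_{kj} \rvert
  - \rho \sum_{k=1}^{K} \sum_{j,l} \bigl\lvert (\Sigma_k^{-1})_{jl} \bigr\rvert
```

With centered data, a larger \lambda shrinks more of the component means \mu_{kj} to zero, which is what makes a ranking of the variables possible; the exact ranking rule is described in Sedki et al. (2014).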
SelvarClustLasso(x, nbcluster, lambda, rho, type, rank, hsize, criterion,
                 models, rmodel, imodel, nbcores)
x |
matrix or data frame containing quantitative data. Rows correspond to observations and columns correspond to variables |
nbcluster |
numeric listing of the number of clusters (must be positive integers) |
lambda |
numeric listing of the tuning parameters for \ell_1 mean penalty |
rho |
numeric listing of the tuning parameters for \ell_1 precision matrix penalty |
type |
character defining the type of ranking procedure, must be "lasso" or "likelihood". Default is "lasso" |
rank |
integer vector giving the rank of the variables (the length of this vector must be equal to the number of variables in the dataset) |
hsize |
optional parameter that makes the forward and backward algorithms used to select the S and W sets less strict |
criterion |
list of character defining the criterion to select the best model. The best model is the one with the highest criterion value. Possible values: "BIC", "ICL", c("BIC", "ICL"). Default is "BIC" |
models |
a Rmixmod [ |
rmodel |
list of character defining the covariance matrix form for the linear regression of U on the R set of variables: "LI" for spherical form, "LB" for diagonal form and "LC" for general form. Possible values: "LI", "LB", "LC", c("LI", "LB"), c("LI", "LC"), c("LB", "LC") and c("LI", "LB", "LC"). Default is c("LI", "LB", "LC") |
imodel |
list of character defining the covariance matrix form for the independent variables W: "LI" for spherical form and "LB" for diagonal form. Possible values: "LI", "LB", c("LI", "LB"). Default is c("LI", "LB") |
nbcores |
number of CPUs to be used when parallel computing is used (default is 2) |
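A call might look like the sketch below. The simulated data, the lambda and rho grids and the range of cluster numbers are illustrative assumptions, not values recommended by the package authors. The sketch also assumes that the arguments left out (rank, hsize, models) have usable defaults in the installed version, and it only runs the fit when SelvarMix is available.

```r
# Sketch only: data and tuning grids are made up for illustration.
set.seed(1)
n <- 150
z <- rep(1:3, each = n / 3)                       # hidden 3-cluster labels

# Two variables that carry the cluster structure ...
informative <- cbind(rnorm(n, mean = 3 * z), rnorm(n, mean = -3 * z))
# ... and three pure-noise variables.
noise <- matrix(rnorm(n * 3), nrow = n)

x <- cbind(informative, noise)                    # rows = observations
colnames(x) <- paste0("V", 1:5)                   # columns = variables

lambda <- seq(20, 100, by = 20)  # assumed grid for the l1 mean penalty
rho    <- c(1, 2)                # assumed grid for the l1 precision penalty

if (requireNamespace("SelvarMix", quietly = TRUE)) {
  library(SelvarMix)
  fit <- SelvarClustLasso(
    x = x, nbcluster = 2:4,
    lambda = lambda, rho = rho,
    type = "lasso", criterion = "BIC",
    rmodel = c("LI", "LB", "LC"), imodel = c("LI", "LB"),
    nbcores = 2
  )
  str(fit, max.level = 1)        # inspect the returned components
}
```

Passing several values in nbcluster, lambda and rho lets the procedure explore a grid; wider grids cost more computation, which is where nbcores helps.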
For each criterion (BIC or ICL), the following components are returned:
S |
The selected set of relevant clustering variables |
R |
The selected subset of regressors |
U |
The selected set of redundant variables |
W |
The selected set of independent variables |
criterionValue |
The criterion value for the selected model |
nbcluster |
The selected number of clusters |
model |
The selected Gaussian mixture form |
rmodel |
The selected covariance form for the regression |
imodel |
The selected covariance form for the independent Gaussian distribution |
parameters |
Rmixmod [ |
regparameters |
Matrix containing all regression coefficients; each column holds the regression coefficients of one redundant variable on the selected R set |
proba |
Matrix containing the conditional probabilities of belonging to each cluster for all observations |
partition |
Vector of length n containing the cluster assignments of the n observations according to the Maximum-a-Posteriori rule |
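The relationship between proba and partition can be illustrated in base R. The sketch assumes that proba has one row per observation and one column per cluster, as the descriptions above suggest; the numbers are invented, and the tie-breaking behaviour of which.max (first maximum wins) is an assumption that may differ from what the package does internally.

```r
# Toy matrix of conditional probabilities: 4 observations (rows), 3 clusters (columns).
proba <- matrix(c(0.7, 0.2, 0.1,
                  0.1, 0.8, 0.1,
                  0.3, 0.3, 0.4,
                  0.5, 0.4, 0.1),
                ncol = 3, byrow = TRUE)

# Maximum-a-Posteriori rule: each observation goes to its most probable cluster.
partition <- apply(proba, 1, which.max)
print(partition)
```

Each row of proba sums to one, so the row maximum measures how confidently the observation is assigned; rows with a maximum close to 1/K (K being the number of clusters) correspond to uncertain assignments.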
Mohammed Sedki <mohammed.sedki@u-psud.fr>
Zhou, H., Pan, W., and Shen, X., 2009. "Penalized model-based clustering with unconstrained covariance matrices". Electronic Journal of Statistics, vol. 3, pp.1473-1496.
Maugis, C., Celeux, G., and Martin-Magniette, M. L., 2009. "Variable selection in model-based clustering: A general variable role modeling". Computational Statistics and Data Analysis, vol. 53/11, pp. 3872-3882.
Sedki, M., Celeux, G., Maugis-Rabusseau, C., 2014. "SelvarMix: A R package for variable selection in model-based clustering and discriminant analysis with a regularization approach". Inria Research Report available at http://hal.inria.fr/hal-01053784
SelvarLearnLasso, SortvarClust, SortvarLearn, wine