This function implements variable selection in model-based clustering using a lasso ranking of the variables, as described in Sedki et al. (2014). The variable ranking step uses the penalized EM algorithm of Zhou et al. (2009).

```
SelvarClustLasso(x, nbcluster, lambda, rho, type, rank, hsize, criterion,
                 models, rmodel, imodel, nbcores)
```

`x` |
matrix or data frame containing quantitative data. Rows correspond to observations and columns correspond to variables |

`nbcluster` |
numeric listing of the number of clusters (must be positive integers) |

`lambda` |
numeric listing of the tuning parameters for |

`rho` |
numeric listing of the tuning parameters for |

`type` |
character defining the type of ranking procedure, must be "lasso" or "likelihood". Default is "lasso" |

`rank` |
integer listing the rank of the variables (the length of this vector must be equal to the number of variables in the dataset) |

`hsize` |
optional parameter that makes the forward and backward algorithms less strict when selecting |

`criterion` |
list of character defining the criterion to select the best model. The best model is the one with the highest criterion value. Possible values: "BIC", "ICL", c("BIC", "ICL"). Default is "BIC" |

`models` |
a Rmixmod [ |

`rmodel` |
list of character defining the covariance matrix form for
the linear regression of |

`imodel` |
list of character defining the covariance matrix form for
independent variables |

`nbcores` |
number of CPUs to be used when parallel computing is used (default is 2) |
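
The arguments above can be put together as in the following sketch. It assumes that the SelvarMix package is installed, that its `wine` data set holds the quantitative variables in its first 27 columns, and that the `lambda` and `rho` grids shown are reasonable illustrative values; none of these are stated on this page. Because the exact layout of the returned object is not fully described here, the sketch inspects it with `str()` rather than assuming field access.

```r
# Candidate settings, built in base R (illustrative values, not documented defaults).
nbcluster <- 2:4                          # positive integers: numbers of clusters to try
lambda    <- seq(20, 100, by = 10)        # tuning-parameter grid passed as `lambda`
rho       <- seq(1, 2, length.out = 2)    # tuning-parameter grid passed as `rho`

# Only run the selection when SelvarMix is available.
if (requireNamespace("SelvarMix", quietly = TRUE)) {
  library(SelvarMix)
  data(wine)                # data set shipped with SelvarMix
  x <- wine[, 1:27]         # assumption: columns 1 to 27 are the quantitative variables

  set.seed(123)
  fit <- SelvarClustLasso(
    x         = x,
    nbcluster = nbcluster,
    lambda    = lambda,
    rho       = rho,
    type      = "lasso",    # ranking procedure; "lasso" is the documented default
    criterion = "BIC",      # documented default
    nbcores   = 2           # documented default
  )

  # Inspect the top-level structure of the result.
  str(fit, max.level = 1)
}
```

Arguments left out of the call (`rank`, `hsize`, `models`, `rmodel`, `imodel`) fall back to the package defaults.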

For each criterion (BIC or ICL), the following components are returned:

`S` |
The selected set of relevant clustering variables |

`R` |
The selected subset of regressors |

`U` |
The selected set of redundant variables |

`W` |
The selected set of independent variables |

`criterionValue` |
The criterion value for the selected model |

`nbcluster` |
The selected number of clusters |

`model` |
The selected Gaussian mixture form |

`rmodel` |
The selected covariance form for the regression |

`imodel` |
The selected covariance form for the independent Gaussian distribution |

`parameters` |
Rmixmod [ |

`regparameters` |
Matrix containing all regression coefficients, each column is the regression coefficients of one redundant variable on the selected R set |

`proba` |
Matrix containing the conditional probabilities of belonging to each cluster for all observations |

`partition` |
Vector of length |
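
The link between `proba` and `partition` can be illustrated in base R. The sketch below assumes that the partition is the maximum a posteriori assignment, meaning each observation is placed in the cluster with the largest conditional probability. This page does not state that rule, and the `proba` matrix used here is made-up toy data, not output of `SelvarClustLasso`.

```r
# Toy conditional probabilities: 4 observations (rows), 3 clusters (columns).
proba <- matrix(
  c(0.7, 0.2, 0.1,
    0.1, 0.8, 0.1,
    0.3, 0.3, 0.4,
    0.5, 0.4, 0.1),
  ncol = 3, byrow = TRUE
)

# Sanity check: each row is a probability distribution over clusters.
stopifnot(all(abs(rowSums(proba) - 1) < 1e-12))

# Assumed rule: assign each observation to its most probable cluster.
partition <- max.col(proba, ties.method = "first")
print(partition)
```

Under this assumption, `partition` has one entry per row of `proba`.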

Mohammed Sedki <mohammed.sedki@u-psud.fr>

Zhou, H., Pan, W., and Shen, X., 2009. "Penalized model-based clustering with unconstrained covariance matrices". Electronic Journal of Statistics, vol. 3, pp. 1473-1496.

Maugis, C., Celeux, G., and Martin-Magniette, M. L., 2009. "Variable selection in model-based clustering: A general variable role modeling". Computational Statistics and Data Analysis, vol. 53/11, pp. 3872-3882.

Sedki, M., Celeux, G., Maugis-Rabusseau, C., 2014. "SelvarMix: A R package for variable selection in model-based clustering and discriminant analysis with a regularization approach". Inria Research Report available at http://hal.inria.fr/hal-01053784

SelvarLearnLasso, SortvarClust, SortvarLearn, wine

