This function performs variable selection in discriminant analysis using a lasso ranking of the variables, as described in Sedki et al. (2014). The variable ranking step uses the penalized EM algorithm of Zhou et al. (2009), adapted in Sedki et al. (2014) to the discriminant analysis setting. A testing sample can be used to compute the average classification error rate.

```
SelvarLearnLasso(x, z, lambda, rho, type, rank, hsize, models,
                 rmodel, imodel, xtest, ztest, nbcores)
```

`x` |
matrix containing quantitative data. Rows correspond to observations and columns correspond to variables |

`z` |
an integer vector or a factor giving the class labels of the observations |

`lambda` |
numeric listing of tuning parameters for |

`rho` |
numeric listing of tuning parameters for |

`type` |
character string defining the type of ranking procedure; must be "lasso" or "likelihood". Default is "lasso" |

`rank` |
integer listing the rank of variables with (the length of this vector must be equal to the number of variables in the dataset) |

`hsize` |
optional parameter that makes the forward and backward
algorithms less stringent when selecting |

`models` |
a Rmixmod [ |

`rmodel` |
list of character defining the covariance matrix form for
the linear regression of |

`imodel` |
list of character strings defining the covariance matrix form for
the independent variables |

`xtest` |
matrix containing quantitative testing data. Rows correspond to observations and columns correspond to variables |

`ztest` |
an integer vector or a factor whose length equals the number of testing observations. Each entry gives the class assignment of one testing observation |

`nbcores` |
number of CPU cores to use for parallel computing (default is 2) |

`S` |
The selected set of relevant clustering variables |

`R` |
The selected subset of regressors |

`U` |
The selected set of redundant variables |

`W` |
The selected set of independent variables |

`criterionValue` |
The criterion value for the selected model |

`model` |
The selected covariance model |

`rmodel` |
The selected covariance form for the regression |

`imodel` |
The selected covariance form for the independent variables |

`parameters` |
Rmixmod [ |

`regparameters` |
Matrix containing all regression coefficients; each column holds the regression coefficients of one redundant variable on the selected R set |

`proba` |
Optional: matrix containing the conditional probabilities of belonging to each class for the testing observations |

`partition` |
Optional: vector containing the class assignments of the testing observations according to the Maximum-a-Posteriori rule. When no testing dataset is supplied, the training dataset is used as the testing set |

`error` |
Optional: error rate of the predicted partition (obtained using the Maximum-a-Posteriori rule). When no testing dataset is supplied, the training dataset is used as the testing set |
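
The relationship between `proba`, `partition` and `error` can be illustrated in base R. This is only a sketch of the Maximum-a-Posteriori rule applied to made-up numbers, not the package's internal code; the probability matrix and the labels below are hypothetical.

```r
# Hypothetical conditional probabilities for 4 testing observations
# (rows) and 3 classes (columns); each row sums to 1.
proba <- matrix(c(0.7, 0.2, 0.1,
                  0.1, 0.8, 0.1,
                  0.3, 0.3, 0.4,
                  0.6, 0.3, 0.1),
                ncol = 3, byrow = TRUE)

# Hypothetical true labels of the testing observations.
ztest <- c(1, 2, 3, 2)

# Maximum-a-Posteriori rule: assign each observation to the class
# with the largest conditional probability.
partition <- max.col(proba, ties.method = "first")

# Error rate: proportion of testing observations whose predicted
# class differs from the true label.
error <- mean(partition != ztest)

print(partition)
print(error)
```

Here the last observation is assigned to class 1 while its label is 2, so one assignment out of four is wrong.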

Mohammed Sedki <mohammed.sedki@u-psud.fr>

Zhou, H., Pan, W., and Shen, X., 2009. "Penalized model-based clustering with unconstrained covariance matrices". Electronic Journal of Statistics, vol. 3, pp. 1473-1496.

Maugis, C., Celeux, G., and Martin-Magniette, M. L., 2009. "Variable selection in model-based clustering: A general variable role modeling". Computational Statistics and Data Analysis, vol. 53/11, pp. 3872-3882.

Sedki, M., Celeux, G., Maugis-Rabusseau, C., 2014. "SelvarMix: A R package for variable selection in model-based clustering and discriminant analysis with a regularization approach". Inria Research Report available at http://hal.inria.fr/hal-01053784

SelvarClustLasso SortvarLearn SortvarClust wine
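
A call might look like the following minimal sketch. It assumes that the SelvarMix package is installed, that the `wine` dataset listed above stores the class labels in its last column, and that the arguments not passed here (`lambda`, `rho`, `rank`, `models` and so on) have usable defaults; none of these points is confirmed by the text on this page. `split_indices` is a hypothetical helper introduced only for this illustration.

```r
# Hypothetical helper: hold out every `every`-th row for testing.
split_indices <- function(n, every = 10) {
  test <- seq(1, n, by = every)
  list(train = setdiff(seq_len(n), test), test = test)
}

if (requireNamespace("SelvarMix", quietly = TRUE)) {
  library(SelvarMix)
  data("wine", package = "SelvarMix")

  # Assumption: the last column of `wine` holds the class labels and
  # all other columns are quantitative variables.
  p   <- ncol(wine) - 1
  idx <- split_indices(nrow(wine))

  fit <- SelvarLearnLasso(
    x       = as.matrix(wine[idx$train, 1:p]),
    z       = wine[idx$train, p + 1],
    xtest   = as.matrix(wine[idx$test, 1:p]),
    ztest   = wine[idx$test, p + 1],
    nbcores = 2
  )

  # Components documented in the list of returned values.
  print(fit$S)      # selected relevant variables
  print(fit$error)  # error rate on the testing sample
} else {
  message("SelvarMix is not installed; skipping the example.")
}
```

Omitting `xtest` and `ztest` would, according to the description of `partition` and `error`, make the function evaluate the error rate on the training data instead.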
