Home

/

GitHub

/

SelvarClustLasso: Regularization for variable selection in model-based...

SelvarClustLasso: Regularization for variable selection in model-based...
In masedki/SelvarMix: Regularization for variable selection in model-based clustering and discriminant analysis.

Description Usage Arguments Value Author(s) References See Also Examples

This function implements the variable selection in model-based clustering using a lasso ranking on the variables as described in Sedki et al (2014). The variable ranking step uses the penalized EM algorithm of Zhou et al (2009).

1 2	SelvarClustLasso(data, nbCluster, lambda, rho, hybrid.size, criterion, models, regModel, indepModel, nbCores)

`data`	matrix containing quantitative data. Rows correspond to observations and columns correspond to variables
`nbCluster`	numeric listing of the number of clusters (must be positive integers)
`lambda`	numeric listing of the tuning parameter for \ell_1 mean penalty
`rho`	numeric listing of the tuning parameter for \ell_1 precision matrix penalty
`hybrid.size`	optional parameter make less strength the hybrid forward and backward algorithms to select S and W sets
`criterion`	list of character defining the criterion to select the best model. The best model is the one with the highest criterion value. Possible values: "BIC", "ICL", c("BIC", "ICL"). Default is "BIC"
`models`	a Rmixmod [`Model`] object defining the list of models to run. The models Gaussian_pk_L_C, Gaussian_pk_Lk_C, Gaussian_pk_L_Ck, and Gaussian_pk_Lk_Ck are called by default (see mixmodGaussianModel() in Rmixmod package to specify other models)
`regModel`	list of character defining the covariance matrix form for the linear regression of U on the R set of variables. Possible values: "LI" for spherical form, "LB" for diagonal form and "LC" for general form. Possible values: "LI", "LB", "LC", c("LI", "LB"), c("LI", "LC"), c("LB", "LC") and c("LI", "LB", "LC"). Default is c("LI", "LB", "LC")
`indepModel`	list of character defining the covariance matrix form for independent variables W. Possible values: "LI" for spherical form and "LB" for diagonal form. Possible values: "LI", "LB", c("LI", "LB"). Default is c("LI", LB")
`nbCores`	number of CPUs to be used when parallel computing is utilized (default is 2)

for each criterion BIC or ICL

`S`	The selected set of relevant clustering variables
`R`	The selected subset of regressors
`U`	The selected set of redundant variables
`W`	The selected set of independent variables
`criterionValue`	The criterion value for the selected model
`nbCluster`	The selected number of clusters
`model`	The selected Gaussian mixture form

`regModel`	The selected covariance form for the regression
`indepModel`	The selected covariance form for the independent gaussian distribution
`proba`	Matrix containing the conditional probabilities of belonging to each cluster for all observations
`partition`	Vector of length n containing the cluster assignments of the n observations according to the Maximum-a-Posteriori rule

Mohammed Sedki <mohammed.sedki@u-psud.fr>

Zhou, H., Pan, W., and Shen, X., 2009. "Penalized model-based clustering with unconstrained covariance matrices". Electronic Journal of Statistics, vol. 3, pp.1473-1496.

Maugis, C., Celeux, G., and Martin-Magniette, M. L., 2009. "Variable selection in model-based clustering: A general variable role modeling". Computational Statistics and Data Analysis, vol. 53/11, pp. 3872-3882.

Sedki, M., Celeux, G., Maugis-Rabusseau, C., 2014. "SelvarMix: A R package for variable selection in model-based clustering and discriminant analysis with a regularization approach". Inria Research Report available at http://hal.inria.fr/hal-01053784

SelvarLearnLasso SortvarClust SortvarLearn scenarioCor

## Not run: 
## Simulated data  example as shown in Maugis et al. (2009) 
## n = 2000 observations, p = 14 variables 
require(Rmixmod)
require(glasso)
data(scenarioCor)
data.cor <- scenarioCor[,1:14]

lambda <- seq(20,  100, by = 10)
rho <- seq(1, 2, length=2)
hybrid.size <- 3
nbCluster <-  c(3,4)
criterion <- "BIC"
models <- mixmodGaussianModel(family = "spherical", equal.proportions = TRUE)
regModel <- c("LI","LB","LC")
indepModel <- c("LI","LB")


simulate.cl <- SelvarClustLasso(data.cor, nbCluster, lambda, rho, hybrid.size, 
                                criterion, models, regModel, indepModel)
 

## End(Not run)

masedki/SelvarMix documentation built on May 21, 2019, 12:42 p.m.

masedki/SelvarMix index

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

masedki/SelvarMix
Regularization for variable selection in model-based clustering and discriminant analysis.

SelvarClustLasso: Regularization for variable selection in model-based...
In masedki/SelvarMix: Regularization for variable selection in model-based clustering and discriminant analysis.

Description

Usage

Arguments

Value

Author(s)

References

See Also

Examples

Related to SelvarClustLasso in masedki/SelvarMix...

R Package Documentation

Browse R Packages

We want your feedback!

masedki/SelvarMix Regularization for variable selection in model-based clustering and discriminant analysis.

SelvarClustLasso: Regularization for variable selection in model-based... In masedki/SelvarMix: Regularization for variable selection in model-based clustering and discriminant analysis.

Description

Usage

Arguments

Value

Author(s)

References

See Also

Examples

Related to SelvarClustLasso in masedki/SelvarMix...

R Package Documentation

Browse R Packages

We want your feedback!

masedki/SelvarMix
Regularization for variable selection in model-based clustering and discriminant analysis.

SelvarClustLasso: Regularization for variable selection in model-based...
In masedki/SelvarMix: Regularization for variable selection in model-based clustering and discriminant analysis.