vscc | R Documentation |
Performs variable selection under a clustering or classification framework. Automated implementation using model-based clustering is based on teigen version 2.0 and mclust version 4.0; issues *may* arise when using different versions.
vscc(x, G=1:9, automate = "mclust", initial = NULL, initunc=NULL, train = NULL,
forcereduction = FALSE)
x |
Data frame or matrix to perform variable selection on |
G |
Vector giving the numbers of groups to consider during initialization and/or post-selection analysis. The default is 1:9. |
automate |
Character string ( |
initial |
Optional vector giving the initial clustering. |
initunc |
Optional scalar indicating the total uncertainty of the initial clustering solution. Only used when |
train |
Optional vector of training data (for classification framework). |
forcereduction |
Logical indicating if the full data set should be considered (FALSE) when selecting the ‘best’ variable subset via total model uncertainty. Not used if |
selected |
A list containing the subsets of variables selected for each relation. Each set is numbered according to the number in the exponential of the relationship. For instance, |
family |
The family used as initialization and/or post selection. (Same as user input |
wss |
The within-group variance associated with each variable from the full data set. |
The remaining values are provided as long as automate is not NULL:
topselected |
The best variable subset according to the total model uncertainty. |
initialrun |
Results from the initialization; an object of class |
bestmodel |
Results from the best model on the selected variable subset; an object of class |
chosenrelation |
Numeric indication of the relationship chosen according to total model uncertainty. The number corresponds to the exponent in the relationship: for instance, a value of '4' indicates the quartic relationship. If the value
uncertainty |
Total model uncertainty associated with the best relationship. |
allmodelfit |
List containing the results ( |
Jeffrey L. Andrews, Paul D. McNicholas
See citation("vscc") for the variable selection references. See also citation("teigen") and citation("mclust") if using those families of models via the automate argument.
teigen, Mclust
library(mclust)  # provides the banknote data and the default "mclust" family
library(vscc)
data(banknote)
head(banknote)
# Drop the first column (the known labels) before running variable selection
bankrun <- vscc(banknote[, -1])
head(bankrun$topselected)  # Show preview of selected variables
table(banknote[, 1], bankrun$initialrun$classification)  # Clustering results on full data set
table(banknote[, 1], bankrun$bestmodel$classification)  # Clustering results on reduced data set
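The other documented return values can be inspected in the same way. The following is a minimal sketch, assuming the vscc and mclust packages are installed and that the default automate = "mclust" run completes. It uses only the components listed in the value section above; the exact printed structure may differ between package versions.

```r
library(mclust)
library(vscc)

data(banknote)
bankrun <- vscc(banknote[, -1])

# Candidate variable subsets, one per relationship
str(bankrun$selected, max.level = 1)

# Within-group variance of each variable in the full data set
print(sort(bankrun$wss))

# Relationship chosen by total model uncertainty, and that uncertainty
print(bankrun$chosenrelation)
print(bankrun$uncertainty)

# Names of the variables kept in the best subset
print(colnames(bankrun$topselected))
```

Variables with a small within-group variance are the ones most likely to appear in the selected subsets, so comparing the sorted wss values against the column names of topselected is a quick sanity check on the result.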