FeatureImpCluster: Feature importance for k-means clustering


View source: R/featureimpcluster.R

Description

This function calls PermMisClassRate for each variable of the data and repeats this for biter iterations. The mean misclassification rate over all iterations is interpreted as variable importance.
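The idea behind a permutation misclassification rate can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: it uses stats::kmeans instead of a flexclust object, and the helpers predict_kmeans and perm_rate are hypothetical names introduced only for this example.

```r
# Hypothetical helper: assign each row to its nearest k-means centre.
predict_kmeans <- function(fit, newdata) {
  x <- as.matrix(newdata)
  # Squared Euclidean distance from every row to every centre
  d <- sapply(seq_len(nrow(fit$centers)), function(j) {
    rowSums(sweep(x, 2, fit$centers[j, ])^2)
  })
  max.col(-d, ties.method = "first")
}

set.seed(1)
x <- data.frame(
  signal = c(rnorm(100, mean = 0), rnorm(100, mean = 10)),  # separates the clusters
  noise  = rnorm(200)                                       # carries no cluster information
)
fit  <- kmeans(x, centers = 2)
base <- predict_kmeans(fit, x)  # assignments on the unpermuted data

# Permute one column, re-predict, and return the share of changed assignments.
perm_rate <- function(var) {
  xp <- x
  xp[[var]] <- sample(xp[[var]])
  mean(predict_kmeans(fit, xp) != base)
}

# Average the rate over 10 permutations per variable.
rates <- sapply(names(x), function(v) mean(replicate(10, perm_rate(v))))
rates
```

A variable whose permutation changes many assignments (here signal) matters for the partition, while a variable whose permutation changes almost none (here noise) does not.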

Usage

FeatureImpCluster(
  clusterObj,
  data,
  basePred = NULL,
  predFUN = NULL,
  sub = 1,
  biter = 10
)

Arguments

clusterObj

a "typical" cluster object. The only requirement is that there must be a prediction function which maps each row of the data to an integer cluster label

data

data.table with the same features as the data set used for clustering (or simply the same data)

basePred

should be equal to results of predFUN(clusterObj,newdata=data); this option saves time when data is a very large data set

predFUN

predFUN(clusterObj,newdata=data) should provide the cluster assignment as a numeric vector; typically this is a wrapper around a built-in prediction function

sub

numeric value between 0 and 1 (default 1); if < 1, only a subset of the data is used

biter

the permutation is iterated biter times (default 10, as in the usage above)
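As an illustration of the predFUN and basePred arguments, the following sketch builds a prediction wrapper with the documented signature predFUN(clusterObj, newdata = data) for a stats::kmeans fit. The wrapper name pred_kmeans and the toy data are assumptions made for this example; they are not part of the package.

```r
# Hypothetical wrapper: nearest-centre assignment for a stats::kmeans fit,
# returned as a numeric vector of cluster labels.
pred_kmeans <- function(clusterObj, newdata) {
  x <- as.matrix(newdata)
  # Squared Euclidean distance from every row to every centre
  d <- sapply(seq_len(nrow(clusterObj$centers)), function(j) {
    rowSums(sweep(x, 2, clusterObj$centers[j, ])^2)
  })
  as.numeric(max.col(-d, ties.method = "first"))
}

set.seed(42)
x <- data.frame(a = rnorm(300), b = rnorm(300), c = rnorm(300))
fit <- kmeans(x, centers = 3)

# Computed once up front so it could be reused as basePred
base_pred <- pred_kmeans(fit, newdata = x)
table(base_pred)
```

These objects could then be passed as predFUN = pred_kmeans and basePred = base_pred, with the data converted to a data.table first. Whether FeatureImpCluster accepts a plain kmeans object this way has not been verified here, so treat the combination as an assumption to test.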

Value

A list of

misClassRate

A matrix of the permutation misclassification rate for each variable and each iteration

featureImp

For each variable, the mean misclassification rate over all iterations, interpreted as the feature importance

Examples

library(FeatureImpCluster)
library(flexclust)

set.seed(123)
dat <- create_random_data(n = 1e3)$data  # random data

res <- kcca(dat, k = 4)            # cluster the data into 4 groups
f <- FeatureImpCluster(res, dat)   # permutation feature importance
plot(f)

FeatureImpCluster documentation built on Oct. 20, 2021, 5:06 p.m.