View source: R/semisupervised.R
computeSemiSupervised | R Documentation |
Perform semi-supervised clustering based on pairwise constraints. The number of clusters K is either supplied by the user or estimated automatically.
computeSemiSupervised(
  data.sample,
  ML,
  CNL,
  K = 0,
  kmax = 20,
  method.name = "Constrained_KM",
  maxIter = 2,
  pca = FALSE,
  pca.nb.dims = 0,
  spec = FALSE,
  use.sampling = FALSE,
  sampling.size.max = 0,
  scaling = FALSE,
  RclusTool.env = initParameters(),
  echo = TRUE
)
data.sample |
list containing features, profiles and clustering results. |
ML |
list of ML (must-link) constrained pairs (as row.names of features). |
CNL |
list of CNL (cannot-link) constrained pairs (as row.names of features). |
K |
number of clusters. If K=0 (default), this number is computed automatically using the Elbow method. |
kmax |
maximum number of clusters. |
method.name |
character vector specifying the constrained algorithm to use. Must be 'Constrained_KM' (default) or 'Constrained_SC' (Constrained Spectral Clustering). |
maxIter |
number of iterations for the semi-supervised algorithm. |
pca |
boolean: if TRUE, Principal Components Analysis is applied to reduce the data space. |
pca.nb.dims |
number of principal components kept. If pca.nb.dims=0, this number is computed automatically. |
spec |
boolean: if TRUE, spectral embedding is applied to reduce the data space. |
use.sampling |
boolean: if FALSE (default), data sampling is not used. |
sampling.size.max |
numeric: maximal size of the sampling set. |
scaling |
boolean: if TRUE, scaling is applied. |
RclusTool.env |
environment in which data and intermediate results are stored. |
echo |
boolean: if FALSE, no description is printed in the console. The usage signature sets echo = TRUE by default. |
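The ML and CNL arguments are described as lists of constrained pairs given as row names of the features. A minimal sketch of what such lists might look like, and of a sanity check before clustering, is shown below. The pair format (one length-2 character vector per pair), the toy `features` table, and the helper `checkPairs` are assumptions made for illustration; they are not part of RclusTool.

```r
# Hypothetical toy feature table; row names identify the observations.
features <- data.frame(x = c(0.1, 0.2, 3.9, 4.1),
                       y = c(0.0, 0.3, 4.2, 3.8),
                       row.names = c("a", "b", "c", "d"))

# Assumed format: one length-2 character vector of row names per pair.
ML  <- list(c("a", "b"), c("c", "d"))   # pairs that should share a cluster
CNL <- list(c("a", "c"))                # pairs that should be separated

# Hypothetical helper (not an RclusTool function): checks that every pair
# holds two distinct, known row names and that no pair is both ML and CNL.
checkPairs <- function(ML, CNL, ids) {
  all.pairs <- c(ML, CNL)
  well.formed <- vapply(all.pairs, function(p) {
    length(p) == 2 && p[1] != p[2] && all(p %in% ids)
  }, logical(1))
  # Order-independent key, so c("a", "b") and c("b", "a") compare equal.
  key <- function(p) paste(sort(p), collapse = "|")
  ml.keys  <- vapply(ML, key, character(1))
  cnl.keys <- vapply(CNL, key, character(1))
  all(well.formed) && length(intersect(ml.keys, cnl.keys)) == 0
}

pairs.ok <- checkPairs(ML, CNL, row.names(features))
```

A check of this kind catches contradictory constraints (the same pair marked both must-link and cannot-link) before they reach the clustering step.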
computeSemiSupervised performs semi-supervised clustering based on pairwise constraints. The number of clusters K is either given explicitly or, when K=0, estimated automatically.
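When K=0, the number of clusters is estimated with the Elbow method. The sketch below illustrates the general idea with plain, unconstrained `kmeans` from base R; it is not RclusTool's implementation. The rule used to locate the elbow (largest gap below the diagonal of the rescaled curve) is one common heuristic chosen here as an assumption.

```r
set.seed(1)
# Three well-separated Gaussian clusters, as in the package example.
dat <- rbind(matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(100, mean = 2, sd = 0.3), ncol = 2),
             matrix(rnorm(100, mean = 4, sd = 0.3), ncol = 2))

kmax <- 10
# Total within-cluster sum of squares for k = 1..kmax.
wss <- vapply(seq_len(kmax), function(k) {
  kmeans(dat, centers = k, iter.max = 50, nstart = 10)$tot.withinss
}, numeric(1))

# Rescale both axes to [0, 1] and measure how far each point of the
# curve lies below the diagonal running from (0, 1) to (1, 0).
k.scaled   <- (seq_len(kmax) - 1) / (kmax - 1)
wss.scaled <- (wss - min(wss)) / (max(wss) - min(wss))
gap <- (1 - k.scaled) - wss.scaled

# The elbow is taken as the k with the largest gap.
K.elbow <- which.max(gap)
```

Plotting `wss` against `seq_len(kmax)` shows the curve flattening after the elbow, which is the value a user would otherwise pass as K.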
The function returns a list containing:
label |
vector of labels. |
summary |
data.frame containing cluster summaries (min, max, sum, average, sd). |
nbItems |
number of observations. |
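To give a feel for the kind of information the summary element carries, the sketch below computes per-cluster min, max, sum, average and sd with base R from a label vector. The toy data, the labels, and the resulting column layout are assumptions; the data.frame actually returned by the function may be organised differently.

```r
set.seed(2)
# Hypothetical features and cluster labels for six observations.
features <- data.frame(x = rnorm(6), y = rnorm(6))
label <- c(1, 1, 2, 2, 2, 3)

stats <- list(min = min, max = max, sum = sum, average = mean, sd = sd)

# One row per cluster, one column per (statistic, feature) combination.
cluster.summary <- do.call(cbind, lapply(names(stats), function(s) {
  out <- aggregate(features, by = list(cluster = label), FUN = stats[[s]])
  vals <- out[, -1, drop = FALSE]          # drop the grouping column
  names(vals) <- paste(s, names(vals), sep = ".")
  vals
}))
cluster.summary <- cbind(cluster = sort(unique(label)), cluster.summary)

# Number of observations, analogous to the nbItems element.
nbItems <- length(label)
```

Note that sd is NA for a cluster containing a single observation, as for cluster 3 here.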
computeCKmeans, computeCSC, KwaySSSC
# Simulate three well-separated Gaussian clusters in two dimensions
dat <- rbind(matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(100, mean = 2, sd = 0.3), ncol = 2),
             matrix(rnorm(100, mean = 4, sd = 0.3), ncol = 2))

# Write the features to a temporary file and import them as a sample
tf <- tempfile()
write.table(dat, tf, sep=",", dec=".")
x <- importSample(file.features=tf)

# Select must-link and cannot-link pairs interactively
pairs.abs <- visualizeSampleClustering(x, selection.mode = "pairs",
                                       profile.mode="whole sample",
                                       wait.close=TRUE)

# Constrained clustering; K=0 lets the function estimate the number of clusters
res.ckm <- computeSemiSupervised(x, ML=pairs.abs$ML, CNL=pairs.abs$CNL, K=0)

plot(dat[,1], dat[,2], type = "p", xlab = "x", ylab = "y",
     col = res.ckm$label, main = "Constrained K-means clustering")