View source: R/sampleCompute.R
computeSampling | R Documentation |
computeSampling samples a raw data matrix to reduce the number of observations, and includes a generalization step.
computeSampling(
x,
label = NULL,
K = 0,
toKeep = NULL,
sampling.size.max = 3000,
K.max = 20,
kmeans.variance.min = 0.95
)
x |
matrix of raw data (one observation per row). |
label |
vector of (named) labels. |
K |
number of clusters. If K = 0 (default), this number is determined automatically using the elbow method. |
toKeep |
vector of row.names to keep in the sample (for constrained algorithms). |
sampling.size.max |
maximal number of observations to keep in the sample. |
K.max |
maximal number of clusters (K.max = 20 by default). |
kmeans.variance.min |
threshold on the cumulative explained variance used by the elbow method: the search for K stops once the explained variance exceeds this value (0.95 by default). |
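The interplay of K, K.max and kmeans.variance.min can be illustrated with a minimal sketch in base R. It is only one plausible reading of the elbow criterion described above, not the package's actual implementation: the helper `elbowK` and its argument `variance.min` are hypothetical names, and explained variance is assumed here to be the ratio of between-cluster to total sum of squares returned by `stats::kmeans`.

```r
# Hypothetical helper (not part of the package): returns the smallest K whose
# k-means partition explains more than `variance.min` of the total variance.
elbowK <- function(x, K.max = 20, variance.min = 0.95) {
  for (k in seq_len(K.max)) {
    km <- kmeans(x, centers = k, nstart = 10)
    # Assumption: explained variance = between-cluster SS / total SS
    explained <- km$betweenss / km$totss
    if (explained > variance.min) return(k)
  }
  K.max  # threshold never reached: fall back to the upper bound
}

set.seed(1)
dat <- rbind(matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(100, mean = 2, sd = 0.3), ncol = 2),
             matrix(rnorm(100, mean = 4, sd = 0.3), ncol = 2))
K <- elbowK(dat)
```

Raising `variance.min` makes the search continue to larger K, while `K.max` bounds the cost of the search when the threshold is never reached.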
The function returns a list containing:
selection.ids |
vector of selected row.names in the sample. |
selection.labs |
vector of selected labels in the sample. |
matching |
character vector specifying the matching for all observations; it is used to generalize the clustering result from the sample to the full data set. |
size.max |
maximal number of observations kept in the sample. |
K |
number of clusters. |
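To make the idea of generalization concrete, the following base R sketch clusters a small sample and then propagates the labels to every observation. It is an illustration under stated assumptions, not the package's method: plain random sampling and nearest-centroid assignment are swapped in here, and the names `ids`, `d2` and `all.labels` are hypothetical. The actual structure of `matching` may differ.

```r
set.seed(1)
dat <- rbind(matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(100, mean = 2, sd = 0.3), ncol = 2),
             matrix(rnorm(100, mean = 4, sd = 0.3), ncol = 2))
rownames(dat) <- paste0("obs", seq_len(nrow(dat)))

# Assumption: plain random sampling of 30 row names stands in for the sampler
ids <- sample(rownames(dat), 30)
km <- kmeans(dat[ids, ], centers = 3, nstart = 10)

# Squared Euclidean distance from every observation to each sample centroid
d2 <- sapply(seq_len(nrow(km$centers)), function(k) {
  rowSums(sweep(dat, 2, km$centers[k, ])^2)
})

# Generalization: each observation takes the label of its nearest centroid
all.labels <- max.col(-d2, ties.method = "first")
names(all.labels) <- rownames(dat)
```

Clustering only the sample keeps the cost low, while the final assignment step still yields a label for every row of the original matrix.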
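# Simulate three Gaussian clusters of 50 two-dimensional points each
dat <- rbind(matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(100, mean = 2, sd = 0.3), ncol = 2),
             matrix(rnorm(100, mean = 4, sd = 0.3), ncol = 2))
# Write the data to a temporary comma-separated file and import it
tf <- tempfile()
write.table(dat, tf, sep = ",", dec = ".")
x <- importSample(file.features = tf)
# Sample the imported features with the default settings
res.sampling <- computeSampling(x$features$initial$x)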