DataGenCKM | R Documentation |
This function generates datasets that follow the typical K-means model, with the option of including masking variables (variables that do not contribute to the cluster structure). As a simplified version, the current function restricts the means of all signaling variables to be equal within each cluster, and the variance to be equal across all variables and clusters.
DataGenCKM(n.obs, n.cluster, n.validvar, n.noisevar, mu, var, varsplit = 0)
n.obs | the total number of observations |
n.cluster | the total number of clusters |
n.validvar | the total number of signaling variables |
n.noisevar | the total number of masking variables |
mu | a vector of length |
var | a number indicating the variance of each variable |
varsplit | either 0 or 1 (default is 0); when 1, the variance of half of the variables equals var/2 |
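As a rough illustration of the varsplit option, the sketch below builds a per-variable variance vector in which half of the variables receive var/2. Which half is affected is not stated here, so assigning the reduced variance to the first half of the columns is purely an assumption, and the names n.var, halved and col.var are hypothetical.

```r
n.var <- 6        # hypothetical total number of variables
var <- 1          # common variance, as in the var argument
varsplit <- 1     # 1 switches the variance split on

# Assumption: the first half of the columns gets the reduced variance.
halved <- varsplit == 1 & seq_len(n.var) <= n.var / 2
col.var <- ifelse(halved, var / 2, var)

# Draw one column per variable with the matching standard deviation.
set.seed(1)
x <- sapply(col.var, function(v) rnorm(1000, mean = 0, sd = sqrt(v)))
```

With varsplit set to 0, every entry of col.var stays equal to var.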
A list of two elements. The first is the generated dataset; the second is a vector of length n.obs containing the cluster assignment of each observation.
ncluster <- 3
nobs <- 60
nnoisevar <- 100
nvalidvar <- 20
mu <- 1
var <- 1
sim.data <- DataGenCKM(nobs, ncluster, nvalidvar, nnoisevar, mu, var)
dataset <- sim.data[[1]]
cluster.assign <- sim.data[[2]]
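To make the model in the description concrete, here is a minimal base R sketch of a generator with the same argument names and the same two-element return value. It is not the package's implementation: the function name sketch_gen_ckm is hypothetical, and the even split of observations across clusters, the cluster means of k * mu, and the zero mean of the masking variables are all assumptions made for illustration.

```r
# Hypothetical stand-in, not the package's implementation of DataGenCKM().
sketch_gen_ckm <- function(n.obs, n.cluster, n.validvar, n.noisevar, mu, var) {
  # Assumption: observations are spread as evenly as possible across clusters.
  cluster <- rep_len(seq_len(n.cluster), n.obs)

  # Assumption: cluster k has mean k * mu on every signaling variable,
  # so the means are equal within a cluster and differ between clusters.
  signal.mean <- matrix(cluster * mu, nrow = n.obs, ncol = n.validvar)
  signal <- signal.mean +
    matrix(rnorm(n.obs * n.validvar, sd = sqrt(var)), nrow = n.obs)

  # Assumption: masking variables have mean 0 in every cluster,
  # so they carry no information about the clusters.
  noise <- matrix(rnorm(n.obs * n.noisevar, sd = sqrt(var)),
                  nrow = n.obs, ncol = n.noisevar)

  # Same shape as the documented return value: dataset, then assignment.
  list(cbind(signal, noise), cluster)
}

set.seed(1)
sim <- sketch_gen_ckm(60, 3, 20, 100, mu = 1, var = 1)
dim(sim[[1]])     # observations by (signaling + masking) variables
table(sim[[2]])   # number of observations per cluster
```

Running stats::kmeans(sim[[1]], centers = 3) and cross-tabulating its cluster labels against sim[[2]] is one way to see how much the masking variables blur the recovered clusters.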