View source: R/generateSimulationDataset.R
generateSimulationDataset | R Documentation |
Generates a dataset based upon a mixture of $K$ Gaussian distributions with $p$ independent, relevant features and $p_n$ irrelevant features.
generateSimulationDataset(
  K,
  n,
  p,
  delta_mu = 1,
  cluster_sd = 1,
  pi = rep(1/K, K),
  p_n = 0
)
K | The number of components to sample from. |
n | The number of samples to draw. |
p | The number of relevant (i.e. signal-bearing) features. |
delta_mu | The difference between the component means within each feature (defaults to 1). |
cluster_sd | The standard deviation of the Gaussian distributions (defaults to 1). |
pi | The K-vector of population proportions across the components (defaults to equal proportions, rep(1/K, K)). |
p_n | The number of irrelevant features (defaults to 0). |
A list with two elements: 'data' (a data.frame of the generated data) and 'cluster_IDs' (a vector giving the cluster membership of each item).
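The behaviour described above can be illustrated with a minimal base R sketch. This is not the package source: the function name sketchSimulationDataset, the column names, the choice of k * delta_mu as the mean of component k, and the use of standard normal noise for the irrelevant features are all assumptions made for illustration. It assumes p is at least 1.

```r
# Hypothetical illustration only; the real generateSimulationDataset may differ.
sketchSimulationDataset <- function(K, n, p,
                                    delta_mu = 1,
                                    cluster_sd = 1,
                                    pi = rep(1 / K, K),
                                    p_n = 0) {
  # Draw a component label for each of the n samples with probabilities pi.
  cluster_IDs <- sample(seq_len(K), size = n, replace = TRUE, prob = pi)

  # Assumption: component k has mean k * delta_mu in every relevant feature.
  means <- cluster_IDs * delta_mu

  # The matrix is filled column by column, so repeating `means` p times
  # gives every relevant feature the same per-sample mean.
  relevant <- matrix(
    rnorm(n * p, mean = rep(means, times = p), sd = cluster_sd),
    nrow = n, ncol = p
  )

  # Assumption: irrelevant features are standard normal noise that does not
  # depend on the component label.
  irrelevant <- matrix(rnorm(n * p_n), nrow = n, ncol = p_n)

  data <- as.data.frame(cbind(relevant, irrelevant))
  # sprintf returns a zero-length vector when p_n is 0, so no stray name is added.
  names(data) <- c(
    sprintf("Feature_%d", seq_len(p)),
    sprintf("Noise_%d", seq_len(p_n))
  )

  list(data = data, cluster_IDs = cluster_IDs)
}

set.seed(1)
sim <- sketchSimulationDataset(K = 3, n = 100, p = 5, p_n = 2)
dim(sim$data)           # 100 rows, 7 columns (5 relevant + 2 irrelevant)
table(sim$cluster_IDs)  # counts per component
```

The returned list mirrors the documented shape: 'data' has n rows and p + p_n columns, and 'cluster_IDs' has one label per row.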