| mixture_sim | R Documentation |
Simulation of an n \times p data frame according to a mixture of q
Gaussian distributions with q < p, different location parameters
\mu_1, \dots, \mu_q, and the identity matrix as the covariance matrix.
mixture_sim(pct_clusters = c(0.5, 0.5), n = 500, p = 10, delta = 10)
pct_clusters |
a vector of marginal probabilities for each group, i.e., the mixture weights. Default is two balanced clusters. |
n |
integer. The number of observations. |
p |
integer. The number of variables. |
delta |
integer. The location shift \delta that separates the first cluster center from each of the others. |
Let X be a p-variate real random vector distributed according to
a mixture of q Gaussian distributions with q < p,
different location parameters \mu_1, \dots, \mu_q, and the same positive
definite covariance matrix I_p:
X \sim \sum_{h=1}^{q} \epsilon_h \, {\cal N}(\mu_h,I_p),
where \epsilon_{1}, \dots, \epsilon_{q} are mixture weights with
\epsilon_1 + \cdots + \epsilon_q = 1, \mu_1 = 0_p,
and \mu_{h+1} = \delta e_h for h = 1, \dots, q-1, where e_h denotes the h-th vector of the canonical basis of R^p.
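The model above can be sketched in a few lines of base R. This is an illustrative sketch only, not the package's implementation: `sketch_mixture` and the column name `cluster` are hypothetical, e_h is assumed to be the h-th canonical basis vector of R^p, and the labels are assumed to be drawn at random from the mixture weights (the function documented here may instead fix the cluster sizes).

```r
# Hypothetical helper, not the package's code: draws n observations from the
# Gaussian mixture described above using base R only.
sketch_mixture <- function(pct_clusters = c(0.5, 0.5), n = 500, p = 10, delta = 10) {
  q <- length(pct_clusters)
  stopifnot(q < p, abs(sum(pct_clusters) - 1) < 1e-8)

  # Assumption: component labels are sampled with the mixture weights.
  h <- sample.int(q, size = n, replace = TRUE, prob = pct_clusters)

  # Centers: row 1 is mu_1 = 0_p, row h + 1 is delta * e_h.
  mu <- matrix(0, nrow = q, ncol = p)
  for (k in seq_len(q - 1)) mu[k + 1, k] <- delta

  # Identity covariance: independent standard normal noise around each center.
  X <- mu[h, , drop = FALSE] + matrix(rnorm(n * p), nrow = n, ncol = p)

  # First column holds the cluster label as a character string.
  data.frame(cluster = paste0("Group", h), X, stringsAsFactors = FALSE)
}

set.seed(1)
sim <- sketch_mixture(pct_clusters = c(0.2, 0.3, 0.5), n = 300, p = 5, delta = 10)
dim(sim)            # 300 rows and p + 1 = 6 columns
table(sim$cluster)  # cluster sizes vary from draw to draw
```

Because the covariance matrix is the identity, the noise can be generated coordinate by coordinate with `rnorm`, and no matrix factorization is needed.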
A data frame of n observations and p+1 variables, where the first variable gives the cluster assignment as a character string.
Aurore Archimbaud
Alfons, A., Archimbaud, A., Nordhausen, K., & Ruiz-Gazen, A. (2024). Tandem clustering with invariant coordinate selection. Econometrics and Statistics. doi:10.1016/j.ecosta.2024.03.002.
X <- mixture_sim()
summary(X)