
The adaptive *L_1* penalty was proposed by Pan and Shen (2007). Within the framework of model-based clustering, apL1 aims to identify the globally informative variables for clustering high-dimensional data.

```
apL1(tuning, K = NULL, lambda = NULL, y, N = 100, kms.iter = 100, kms.nstart = 100,
     adapt.kms = FALSE, eps.diff = 1e-5, eps.em = 1e-5, model.crit = 'gic')

apL1(tuning = NULL, K, lambda, y, N = 100, kms.iter = 100, kms.nstart = 100,
     adapt.kms = FALSE, eps.diff = 1e-5, eps.em = 1e-5, model.crit = 'gic')
```

`tuning` |
A 2-dimensional vector or a matrix with 2 columns; the first column is the number of clusters *K* and the second column is the tuning parameter *λ* |

`K` |
The number of clusters |

`lambda` |
The tuning parameter |

`y` |
A data matrix with p columns. Each row is a p-dimensional observation. |

`N` |
The maximum number of iterations in the EM algorithm. The default value is 100. |

`kms.iter` |
The maximum number of iterations in the K-means algorithm used to generate the starting values for the EM algorithm. The default value is 100. |

`kms.nstart` |
The number of random starting values used in K-means. The default value is 100. |

`adapt.kms` |
An indicator of whether the cluster means estimated by K-means are used to calculate the adaptive parameters. The default value is FALSE. |

`eps.diff` |
The threshold for the pairwise difference between two mean values; any difference below it is treated as 0. The default value is 1e-5. |

`eps.em` |
The tolerance of the stopping criterion in the EM algorithm. The default value is 1e-5. |

`model.crit` |
The criterion used to select the optimal model: 'gic' (the default) or 'bic' |

A variable *j* is defined as globally informative if there exists at least one pair of clusters *k ≠ k'* such that *μ_{kj} ≠ μ_{k'j}*. Here we assume that each cluster has the same diagonal covariance matrix in the model-based clustering. The apL1 penalty takes the form

*λ ∑_{k=1}^{K} ∑_{j=1}^{d} τ_{kj} |μ_{kj}|,*

where *d* is the number of variables in the data, *K* is the number of clusters, and *τ_{kj}* is an adaptive parameter computed from a pilot estimate *\tilde{μ}_{kj}*. Two choices of the pilot estimate are provided: if `adapt.kms = TRUE`, *\tilde{μ}_{kj}* are the estimates from the K-means algorithm; otherwise, they are the estimates from model-based clustering without penalty.
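As an illustrative sketch (not code from the package), the penalty above can be computed for a *K × d* matrix of cluster means as follows. The function name, the inverse weighting `1 / |\tilde{μ}_{kj}|` (the usual adaptive-lasso convention), and the `eps` guard are assumptions for illustration; the exact weighting used by apL1 may differ.

```r
# Sketch: adaptive L1 penalty for a K x d matrix of cluster means `mu`,
# given pilot estimates `mu.tilde` (same dimensions) and tuning parameter
# `lambda`. Pilot estimates below `eps` are clipped to avoid division by
# zero (hypothetical convention, not from the package documentation).
adaptive_l1_penalty <- function(mu, mu.tilde, lambda, eps = 1e-5) {
  tau <- 1 / pmax(abs(mu.tilde), eps)   # adaptive weights from pilot estimates
  lambda * sum(tau * abs(mu))           # lambda * sum_k sum_j tau_kj |mu_kj|
}
```

With all pilot estimates equal to 1, the weights reduce to 1 and the penalty is simply *λ* times the sum of absolute mean values.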

The EM algorithm is used for estimating the parameters. Since the EM algorithm is sensitive to its starting values, we use the estimates from K-means with multiple starting points as the starting values.

This function returns the estimated parameters and some statistics of the optimal model within the given *K* and *λ*, selected by BIC when `model.crit = 'bic'` or by GIC when `model.crit = 'gic'`.

`mu.hat.best` |
The estimated cluster means in the optimal model |

`sigma.hat.best` |
The estimated covariance in the optimal model |

`p.hat.best` |
The estimated cluster proportions in the optimal model |

`s.hat.best` |
The clustering assignments using the optimal model |

`lambda.best` |
The value of *λ* in the optimal model |

`K.best` |
The value of *K* in the optimal model |

`llh.best` |
The log-likelihood of the optimal model |

`gic.best` |
The GIC of the optimal model |

`bic.best` |
The BIC of the optimal model |

`ct.mu.best` |
The degrees of freedom in the cluster means of the optimal model |
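A hedged usage sketch tying the arguments and return values together (assuming the package providing `apL1` is loaded; the simulated data and grid values below are illustrative, not from the source):

```r
set.seed(1)
# 60 observations, 10 variables; only the first two variables are informative:
y <- matrix(rnorm(60 * 10), 60, 10)
y[31:60, 1:2] <- y[31:60, 1:2] + 2   # shift the second cluster in variables 1-2

# Grid of (K, lambda) pairs; the optimal model is selected by GIC.
tune <- cbind(K = rep(2:3, each = 3), lambda = rep(c(0.5, 1, 2), times = 2))
fit <- apL1(tuning = tune, y = y, model.crit = 'gic')

fit$K.best        # selected number of clusters
fit$lambda.best   # selected tuning parameter
fit$mu.hat.best   # estimated cluster means; variables whose means are zero
                  # across all clusters are non-informative
fit$s.hat.best    # cluster assignments
```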

Pan, W. and Shen, X. (2007). Penalized model-based clustering with application to variable selection. *The Journal of Machine Learning Research* **8**, 1145–1164.
