
This function performs clustering on online datasets. The number of cells is data-driven and need not be chosen in advance by the user.


`mydata` |
a matrix where each row corresponds to an observation of length d. |

`R` |
a positive real value that should be larger than the maximum Euclidean distance of all the observations in |

`coeff` |
a positive real value that controls how strongly a large number of cells is encouraged. A larger value yields more cells in the clustering. The default, 2, should suit most users. |

`K_max` |
a positive integer indicating the maximum number of cells allowed for the clustering. |

`scaling` |
logical indicating whether the matrix |

`var_ind` |
logical indicating whether predicted centers of cells will be calculated sequentially. If |

`N_iterations` |
a positive integer indicating the number of iterations of the algorithm. |

`plot_ind` |
logical indicating whether clusters should be plotted. |

`axis_ind` |
numeric indicating which axes are to be plotted if d >= 2. The default is the first two coordinates of observations. |
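
As a minimal sketch of how a value for `R` might be chosen, the snippet below computes the largest Euclidean norm among the rows of a data matrix and inflates it slightly so that `R` is strictly larger. It reads "maximum Euclidean distance" as the distance from the origin, which is how the example on this page computes it. The toy data and the 1% margin are assumptions made for illustration and are not part of the package.

```r
set.seed(1)
# Toy data (assumption): 200 observations in dimension 3.
mydata <- matrix(rnorm(200 * 3), nrow = 200, ncol = 3)

# Euclidean norm of each observation (one value per row).
row_norms <- sqrt(rowSums(mydata^2))

# Inflate the maximum by 1% (arbitrary margin) so R is strictly larger.
R <- 1.01 * max(row_norms)

# Sanity check: every observation lies strictly inside the ball of radius R.
stopifnot(all(row_norms < R))
```

With data that arrive sequentially, the maximum norm is not known in advance, so in practice `R` would have to come from prior knowledge of the data range.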

The PACBO algorithm is introduced and fully described in Le Li, Benjamin Guedj and Sebastien Loustau (2016), "PAC-Bayesian Online Clustering" (https://arxiv.org/abs/1602.00522). It relies on a PAC-Bayesian approach, allowing for a dynamic (*i.e.*, time-dependent) estimation of the number of clusters, up to `K_max` clusters. It is implemented via an RJMCMC-flavored algorithm.

Returns a list including

`predicted_centers` |
a matrix of predicted centers of cells, where each row corresponds to a center. |

`nb_of_clusters` |
a positive integer giving the estimated number of cells for the dataset. |

`labels` |
labels for observations in |
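
The sketch below shows one way the three components of the returned list might be inspected. To stay runnable without the package, it builds a stand-in `result` list with the same component names using base R's `kmeans()`. That stand-in is an assumption for illustration only: it is not output of `PACBO`, and it assumes `labels` holds one integer label per observation.

```r
set.seed(1)
# Toy data (assumption): two well-separated groups of 20 points in the plane.
x <- rbind(matrix(rnorm(40, mean = -5), ncol = 2),
           matrix(rnorm(40, mean = 5), ncol = 2))

# Stand-in for a PACBO result (assumption): same component names, filled by kmeans.
km <- kmeans(x, centers = 2)
result <- list(predicted_centers = km$centers,
               nb_of_clusters = nrow(km$centers),
               labels = km$cluster)

# One row per center, one column per coordinate.
dim(result$predicted_centers)

# Number of observations assigned to each cell.
table(result$labels)
```

With a real call, `result` would simply be the value returned by `PACBO(mydata, R)`, and the same accessors would apply.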

Le Li <le@iadvize.com>

Le Li, Benjamin Guedj and Sebastien Loustau (2016), PAC-Bayesian Online Clustering, arXiv preprint: https://arxiv.org/abs/1602.00522.

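```
## Generate 100 points drawn from 4 Gaussian clusters in R^5.
library(PACBO)   # loads mnormt as a dependency, which provides rmnorm()
set.seed(100)
Nb <- 4          # number of clusters
d <- 5           # dimension of each observation
n_obs <- 100     # number of observations (renamed from T, which masks TRUE)
proportion <- rep(1/Nb, Nb)
## One mean vector per cluster, stored as the rows of a Nb x d matrix.
Mean_vectors <- matrix(runif(d*Nb, min = -10, max = 10), nrow = Nb, ncol = d, byrow = TRUE)
## Each draw picks a cluster at random, then samples from N(mean, I_d).
mydata <- matrix(replicate(n_obs, rmnorm(1, mean = Mean_vectors[sample(1:Nb, 1, prob = proportion), ],
                                         varcov = diag(1, d))), nrow = n_obs, byrow = TRUE)
## R is set to the maximum Euclidean norm of the observations.
R <- max(sqrt(rowSums(mydata^2)))
## Run the algorithm.
result <- PACBO(mydata, R, plot_ind = TRUE)
```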

```
Loading required package: mnormt
```
