chips | R Documentation |
This function provides a partition of a subset of items that has high marginal probability. The estimate is based on samples from a partition distribution and is found using the conditional high inclusion probability subset (CHIPS) partition greedy search method (Barrientos, Page, Dahl, Dunson, 2024).
chips(
partitions,
threshold = 0,
nRuns = 64,
intermediateResults = identical(threshold, 0),
allCandidates = FALSE,
andSALSO = !intermediateResults && !allCandidates,
loss = VI(a = 1),
maxNClusters = 0,
initialPartition = integer(0),
nCores = 0
)
partitions |
A |
threshold |
The minimum marginal probability for the subpartition. Values closer to 1.0 will yield a partition of fewer items and values closer to 0.0 will yield a partition of more items. |
nRuns |
The number of runs to perform; the best result across runs is returned. |
intermediateResults |
Should intermediate subset partitions be returned? |
allCandidates |
Should all the final subset partitions from multiple runs be returned? |
andSALSO |
Should the resulting incomplete partition be completed using SALSO? |
loss |
When |
maxNClusters |
The maximum number of clusters that can be considered by
SALSO, which has important implications for the interpretability of the
resulting clustering and can greatly influence the RAM needed for the
optimization algorithm. If the supplied value is zero, the optimization is
constrained by the maximum number of clusters among the clusterings in
|
initialPartition |
A vector of length |
nCores |
The number of CPU cores to use, i.e., the number of simultaneous runs at any given time. A value of zero uses all cores on the system. |
A list containing:
chips_partition: If intermediateResults is FALSE, an integer vector giving the estimated subset partition, encoded using cluster labels with -1 indicating not allocated. If TRUE, an integer matrix with intermediate subset partitions in the rows.
n_items: Number of items in the estimated subset partition.
probability: Monte Carlo estimate of the probability of the subset partition.
auc: If intermediateResults is TRUE, this element is provided and gives the area under the probability curve as a function of the number of clusters after scaling to be between 0 and 1.
chips_and_salso_partition: If andSALSO is TRUE, this element is provided and is an integer vector giving the estimated partition of all items, based on CHiPS until the threshold is met and using SALSO to allocate the rest.
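To make the probability element more concrete, the sketch below shows one natural way a Monte Carlo estimate of a subset partition's probability could be computed in base R: the fraction of sampled partitions that cluster the allocated items in the same way. The helper subset_partition_probability and the toy draws matrix are hypothetical and written only for illustration; they are not part of the package, and the package's own computation may differ in its details.

```r
# Hypothetical helper (not part of the package): the share of sampled
# partitions that agree with a subset partition on its allocated items.
subset_partition_probability <- function(draws, subset_partition) {
  # Items labeled -1 are not allocated and are ignored.
  allocated <- which(subset_partition != -1)
  if (length(allocated) == 0) return(1)
  target <- subset_partition[allocated]

  # Two label vectors encode the same partition when every pair of items
  # is co-clustered in one exactly when it is co-clustered in the other.
  same_partition <- function(a, b) {
    all(outer(a, a, "==") == outer(b, b, "=="))
  }

  matches <- apply(draws[, allocated, drop = FALSE], 1, same_partition, b = target)
  mean(matches)
}

# Toy samples: one partition of four items per row (illustrative data only).
draws <- rbind(
  c(1, 1, 2, 2),
  c(1, 1, 2, 3),
  c(2, 2, 1, 1),
  c(1, 2, 2, 2)
)

# Items 1 and 2 together, item 3 in its own cluster, item 4 not allocated.
subset_partition <- c(1, 1, 2, -1)

estimate <- subset_partition_probability(draws, subset_partition)
print(estimate)  # 0.75: three of the four draws agree on items 1 to 3
```

The comparison uses co-clustering patterns rather than raw labels, so a draw labeled 2, 2, 1 counts as agreeing with a subset partition labeled 1, 1, 2.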
# For examples, use 'nCores = 1' per CRAN rules, but in practice omit this.
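data(iris.clusterings)
draws <- iris.clusterings

# With the default threshold = 0, intermediateResults defaults to TRUE,
# so the intermediate subset partitions are returned.
all <- chips(draws, nRuns = 1, nCores = 1)
plot(all$n_items, all$probability)

# Require a marginal probability of at least 0.5 for the subset partition.
x <- chips(draws, threshold = 0.5, nCores = 1)
table(x$chips_partition)
which(x$chips_partition != -1)  # indices of the allocated items
x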
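As a complement to the calls above, the following base R sketch shows how the -1 encoding described for chips_partition might be unpacked. It uses small hand-made vectors in place of real output, so the values and the variable names (chips_partition, intermediate, and so on) are purely illustrative and assume only the encoding stated in this documentation.

```r
# Hand-made stand-in for a chips_partition vector (illustrative values only):
# cluster labels for allocated items, -1 for items that are not allocated.
chips_partition <- c(1, 1, -1, 2, -1, 2, 1)

# Indices of the allocated items and how many there are.
allocated <- which(chips_partition != -1)
n_allocated <- length(allocated)
print(n_allocated)  # 5

# Group the allocated item indices by their cluster label.
clusters <- split(allocated, chips_partition[allocated])
print(clusters)  # cluster "1" holds items 1, 2, 7; cluster "2" holds items 4, 6

# Hand-made stand-in for the matrix form (one subset partition per row).
intermediate <- rbind(
  c(1, 1, -1, -1, -1, -1, -1),
  c(1, 1, -1,  2, -1,  2, -1),
  c(1, 1, -1,  2, -1,  2,  1)
)

# Number of allocated items in each row.
items_per_row <- rowSums(intermediate != -1)
print(items_per_row)  # 2 4 5
```

Counting the entries that differ from -1 in each row gives the size of each subset partition, which is the kind of quantity reported in n_items.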