Description Usage Arguments Details Value Note Author(s) References Examples
Given a possibly multi-modal and multi-dimensional clustering target density function and a temperature ladder this function produces samples from the target using the evolutionary Monte Carlo clustering (EMCC) algorithm.
Below sampDim
refers to the dimension of the sample space,
temperLadderLen
refers to the length of the temperature ladder,
and levelsSaveSampForLen
refers to the length of the
levelsSaveSampFor
.
1 2 3 4 5 6 7 8 9 10 11 |
nIters |
|
temperLadder |
|
startingVals |
|
logTarDensFunc |
|
MHMergeProb |
|
moveProbsList |
named |
moveNTimesList |
named |
levelsSaveSampFor |
|
saveFitness |
|
verboseLevel |
|
... |
optional arguments to be passed to
|
The evolutionary Monte Carlo clustering (EMCC; Goswami and Liu, 2007) algorithm is composed of the following moves:
MH | Metropolis-Hastings or mutation |
SCSC_ONE_NEW | sub-cluster swap crossover: one new |
SCSC_TWO_NEW | sub-cluster swap crossover: two new |
SCRC | sub-cluster reallocation crossover |
RE | (random) exchange |
The current function could be used to run the EMCC algorithm by specifying what moves to employ using the following variables.
moveProbsList
and moveNTimesList
The allowed
names for components of moveProbsList
and
moveNTimesList
come from the abbreviated names of the
moves above. For example, the following specifications are
valid:
1 2 3 4 | moveProbsList = list(MH = 0.5,
SCSC_TWO_NEW = 0.25,
SCRC = 0.25)
|
1 2 3 4 5 |
MHMergeProb
In the MH or the mutation step,
each of the sampDim
-many objects are proposed to either
merge with an existing cluster or split to form its own cluster
with probability MHMergeProb
and (1 - MHMergeProb
),
respectively (see Goswami and Liu, 2007).
levelsSaveSampFor
By default, samples are saved and
returned for temperature level temperLadderLen
. The
levelsSaveSampFor
could be used to save samples from other
temperature levels as well (e.g., levelsSaveSampFor =
1:temperLadderLen
saves samples from all levels).
saveFitness
The term fitness refers to the function H(x), where the target density of interest is given by:
g(x) \propto \exp[ -H(x) / τ_{min} ]
H(x) is also known as the energy function. By default,
the fitness values are not saved, but one can do so by setting
saveFitness = TRUE
.
This function returns a list with the following components:
draws |
|
acceptRatios |
|
detailedAcceptRatios |
|
nIters |
the |
temperLadder |
the |
startingVals |
the |
moveProbsList |
the |
moveNTimesList |
the |
levelsSaveSampFor |
the |
time |
the time taken by the run. |
The effect of leaving the default value NULL
for some of the
arguments above are as follows:
moveProbsList
| list(MH = 0.5, RC = 0.25, 'SCSC_TWO_NEW' = 0.25) .
|
moveNTimesList
| list(MH = 1, RC = mm, 'SCSC_TWO_NEW' = mm, RE = nn) ,
|
where mm <- floor(nn / 2) and nn <- temperLadderLen .
|
|
levelsSaveSampFor
| temperLadderLen .
|
Gopi Goswami goswami@stat.harvard.edu
Gopi Goswami, Jun S. Liu and Wing H. Wong (2007). Evolutionary Monte Carlo Methods for Clustering. Journal of Computational and Graphical Statistics, 16:4:855-876.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 | ## The following example is a simple stochastic optimization problem,
## the set up is same as that of findMaxTemper and placeTempers. Here
## no "heating up" is necessary, and hence the maximum temprature is
## the coldest one, namely, 0.5.
##
## However, we run evolMonteCarloClustering on this example with a
## temperature ladder that is the output of placeTempers, which
## assumes that the maximum temperature is 5.
KMeansObj <- KMeansFuncGenerator1(-97531)
samplerObj <-
with(KMeansObj,
{
temperLadder <- c(5.0000000, 1.5593974, 1.1028349, 0.9220684,
0.7900778, 0.6496648, 0.5135825, 0.5000000)
nLevels <- length(temperLadder)
sampDim <- nrow(yy)
startingVals <- sample(c(0, 1),
size = nLevels * sampDim,
replace = TRUE)
startingVals <- matrix(startingVals, nrow = nLevels, ncol = sampDim)
moveProbsList <- list(MH = 0.4,
RC = 0.3,
'SCSC_TWO_NEW' = 0.3)
mm <- floor(nLevels / 2)
moveNTimesList <- list(MH = 1,
RC = mm,
'SCSC_TWO_NEW' = mm,
RE = nLevels)
evolMonteCarloClustering(nIters = 100,
temperLadder = temperLadder,
startingVals = startingVals,
logTarDensFunc = logTarDensFunc,
moveProbsList = moveProbsList,
moveNTimesList = moveNTimesList,
levelsSaveSampFor = seq_len(nLevels),
saveFitness = TRUE,
verboseLevel = 1)
})
print(samplerObj)
print(names(samplerObj))
with(c(samplerObj, KMeansObj),
{
print(acceptRatios)
print(detailedAcceptRatios)
print(dim(draws))
fitnessCol <- ncol(draws[ , , 1])
sub <- paste('uniform prior on # of clusters: DU[',
priorMinClusters, ', ',
priorMaxClusters, ']', sep = '')
for (ii in rev(seq_along(levelsSaveSampFor))) {
main <- paste('EMCC (MAP) clustering (temper = ',
round(temperLadder[levelsSaveSampFor[ii]], 3), ')',
sep = '')
MAPRow <- which.min(draws[ , fitnessCol, ii])
clusterPlot(clusterInd = draws[MAPRow, -fitnessCol, ii],
data = yy,
main = main,
sub = sub,
knownClusterMeans = knownClusterMeans)
}
})
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.