findMaxTemper: Find the maximum temperature for parallel MCMC chains

In EMCC: Evolutionary Monte Carlo (EMC) Methods for Clustering

Description Usage Arguments Details Value Note Author(s) References See Also Examples

View source: R/ladder.R

The evolutionary Monte Carlo clustering (EMCC) algorithm needs a temperature ladder. This function finds the maximum temperature for constructing the ladder.

Below, sampDim refers to the dimension of the sample space, temperLadderLen refers to the length of the temperature ladder, and levelsSaveSampForLen refers to the length of levelsSaveSampFor. Note, this function calls evolMonteCarloClustering, so some of the arguments below have the same name and meaning as the corresponding ones for evolMonteCarloClustering. See Details below for an explanation of the arguments.

findMaxTemper(nIters,
              statsFuncList,
              startingVals,
              logTarDensFunc,
              temperLadder      = NULL,
              temperLimits      = NULL,
              ladderLen         = 10,
              scheme            = 'exponential',
              schemeParam       = 0.5,
              cutoffDStats      = 1.96,
              cutoffESS         = 50,
              guideMe           = TRUE,
              levelsSaveSampFor = NULL,
              saveFitness       = FALSE,
              doFullAnal        = TRUE,
              verboseLevel      = 0,
              ...)

`nIters`	`integer` > 0.
`statsFuncList`	`list` of functions of one argument each, which return the value of the statistic evaluated at one MCMC sample or draw.
`startingVals`	`double` matrix of dimension `temperLadderLen` x `sampDim` or vector of length `sampDim`, in which case the same starting values are used for every temperature level.
`logTarDensFunc`	`function` of two arguments `(draw, ...)` that returns the target density evaluated in the log scale.
`temperLadder`	`double` vector with all positive entries, in decreasing order.
`temperLimits`	`double` vector with two positive entries.
`ladderLen`	`integer` > 0.
`scheme`	`character`.
`schemeParam`	`double` > 0.
`cutoffDStats`	`double` > 0.
`cutoffESS`	`double` > 0.
`guideMe`	`logical`.
`levelsSaveSampFor`	`integer` vector with positive entries.
`saveFitness`	`logical`.
`doFullAnal`	`logical`.
`verboseLevel`	`integer`, a value >= 2 produces a lot of output.
`...`	optional arguments to be passed to `logTarDensFunc`, `MHPropNewFunc` and `logMHPropDensFunc`.

This function is based on the method to find the temperature range introduced in section 4.1 of Goswami and Liu (2007).

statsFuncList: The user specifies this list of functions, each of which is known to be sensitive to the presence of modes. For example, if both dimensions 1 and 3 (i.e., objects 1 and 3) are sensitive to the presence of modes, then one could use:

coord1 <- function (xx) { xx[1] }
coord3 <- function (xx) { xx[3] }
statsFuncList <- list(coord1, coord3)
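
As a rough illustration of what "a function of one argument evaluated at one draw" means in practice, the sketch below applies such a list to a small matrix of draws using base R only. The `draws` matrix and the name `statVals` are made up for this illustration; this is not how findMaxTemper itself necessarily evaluates the statistics internally.

```r
# Hypothetical draws: 4 MCMC samples (rows) of a 3-dimensional state (columns).
draws <- matrix(c(0, 1, 1,
                  1, 1, 0,
                  0, 0, 1,
                  1, 0, 1),
                nrow = 4, byrow = TRUE)

# Statistics of one argument each, as in the text above.
coord1 <- function (xx) { xx[1] }
coord3 <- function (xx) { xx[3] }
statsFuncList <- list(coord1, coord3)

# Evaluate every statistic at every draw; the result has one row per draw
# and one column per statistic.
statVals <- sapply(statsFuncList, function (statFunc) {
    apply(draws, 1, statFunc)
})
print(statVals)
```

Each function receives a single draw (one row) and returns a scalar, which is the contract described for statsFuncList in the argument list.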

temperLadder: This is the temperature ladder needed for the first stage preliminary run. One can either specify a temperature ladder via temperLadder or specify temperLimits, ladderLen, scheme and schemeParam. For details on the latter set of parameters, see below. Note, temperLadder overrides temperLimits, ladderLen, scheme and schemeParam.
temperLimits: temperLimits = c(lowerLimit, upperLimit) is a two-tuple of positive numbers, where the lowerLimit is usually 1 and upperLimit is a number in [100, 1000]. If stochastic optimization (via sampling) is the goal, then lowerLimit is taken to be in [0, 1].
ladderLen, scheme and schemeParam: These three parameters are required (along with temperLimits) if temperLadder is not provided. We recommend taking ladderLen in [15, 30]. The allowed choices for scheme and schemeParam are:

`scheme`	`schemeParam`
========	=============
linear	NA
log	NA
geometric	NA
mult-power	NA
add-power	>= 0
reciprocal	NA
exponential	>= 0
tangent	>= 0

We recommend using scheme = 'exponential' and schemeParam in [0.3, 0.5].
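
To make the idea of building a ladder from temperLimits and ladderLen concrete, here is a minimal sketch that spaces temperatures evenly on the log scale between the two limits. The helper `makeGeometricLadder` is hypothetical and is not part of EMCC; the exact formulas the package uses for its schemes (including 'geometric' and 'exponential') are not given on this page and may differ from this sketch.

```r
# Illustrative helper (NOT part of EMCC): geometric spacing between the limits.
makeGeometricLadder <- function (temperLimits, ladderLen) {
    stopifnot(length(temperLimits) == 2,
              all(temperLimits > 0),
              ladderLen >= 2)
    lowerLimit <- min(temperLimits)
    upperLimit <- max(temperLimits)
    # Equal spacing on the log scale gives a constant ratio between
    # neighbouring temperatures. The ladder is returned in decreasing
    # order, as required for the temperLadder argument.
    exp(seq(log(upperLimit), log(lowerLimit), length.out = ladderLen))
}

temperLadder <- makeGeometricLadder(temperLimits = c(1, 100), ladderLen = 5)
print(round(temperLadder, 3))
```

A ladder built this way could be passed via the temperLadder argument, in which case temperLimits, ladderLen, scheme and schemeParam are overridden.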

cutoffDStats: This cutoff comes from Normal_1(0, 1), the standard normal distribution (Goswami and Liu, 2007); the default value 1.96 is a conservative cutoff. Note if you have more than one statistic in statsFuncList, which is usually the case, using this cutoff may result in different suggested maximum temperatures (as can be seen by calling the print function on the result of findMaxTemper). A conservative recommendation is that you choose the maximum of the suggested temperatures as the final maximum temperature for use in placeTempers and later in parallelTempering or evolMonteCarlo.
cutoffESS: a cutoff for the effective sample size (ESS) of the underlying Markov chain ergodic estimator and the importance sampling estimators.
guideMe: If guideMe = TRUE, then the function suggests modifications to the settings for a re-run, in case there are problems with the underlying MCMC run.
doFullAnal: If doFullAnal = TRUE, then the search for the maximum temperature is conducted among all the levels of the temperLadder. If this switch is turned off, the search is done in a greedy (and faster) manner: it stops as soon as every statistic in statsFuncList has found a maximum temperature. Note, the greedy search may result in a much higher (and hence sub-optimal) maximum temperature than needed, so it is not recommended.
levelsSaveSampFor: This is passed to evolMonteCarlo for the underlying MCMC run.
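
The conservative recommendation for cutoffDStats above can be sketched in a few lines of base R. The vector `suggestedMaxTempers` holds made-up numbers standing in for the per-statistic suggestions that print would show; it is not output of findMaxTemper, and the variable names are assumptions for this illustration.

```r
# The default cutoffDStats, 1.96, is the 0.975 quantile of the standard
# normal distribution rounded to two decimals.
cutoffDStats <- round(qnorm(0.975), 2)

# Hypothetical suggested maximum temperatures, one per statistic in
# statsFuncList (made-up numbers, not output of findMaxTemper).
suggestedMaxTempers <- c(adjMatSum = 5, entropy = 10, maxProp = 5)

# Conservative choice: take the largest of the suggested temperatures.
finalMaxTemper <- max(suggestedMaxTempers)

print(cutoffDStats)
print(finalMaxTemper)
```

The value of `finalMaxTemper` would then serve as the maximum temperature when placing the remaining temperatures, for example with placeTempers.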

This function returns a list with the following components:

`temperLadder`	the temperature ladder used for the underlying MCMC run.
`DStats`	the D-statistic (Goswami and Liu, 2007) values used to find the maximum temperature.
`cutoffDStats`	the `cutoffDStats` argument.
`nIters`	the post burn-in `nIters`.
`levelsSaveSampFor`	the `levelsSaveSampFor` argument.
`draws`	`array` of dimension `nIters` x `sampDim` x `levelsSaveSampForLen`, if `saveFitness = FALSE`. If `saveFitness = TRUE`, then the returned array is of dimension `nIters` x `(sampDim + 1)` x `levelsSaveSampForLen`; i.e., each of the `levelsSaveSampForLen` matrices contains the fitness values in its last column.
`startingVals`	the `startingVals` argument.
`intermediate statistics`	several intermediate statistics used in the computation of `DStats`, namely, `MCEsts`, `MCVarEsts`, `MCESS`, `ISEsts`, `ISVarEsts`, `ISESS`, each computed for all the statistics provided by the `statsFuncList` argument.
`time`	the time taken by the run.
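
For readers unfamiliar with the effective sample size reported in `ISESS` and compared against cutoffESS, the sketch below computes one common definition (the Kish formula) for importance sampling weights. This page does not state which ESS formula the package uses, so treat this as an assumption; the function `importanceESS` is hypothetical and not part of EMCC.

```r
# Kish effective sample size for importance weights:
# ESS = (sum of weights)^2 / (sum of squared weights).
importanceESS <- function (logWeights) {
    # Subtract the maximum before exponentiating for numerical stability;
    # the ESS is invariant to rescaling the weights.
    ww <- exp(logWeights - max(logWeights))
    sum(ww)^2 / sum(ww^2)
}

equalESS  <- importanceESS(rep(0, 100))         # all weights equal
skewedESS <- importanceESS(c(0, rep(-50, 99)))  # one weight dominates

cutoffESS <- 50
print(c(equalESS, skewedESS))
print(c(equalESS, skewedESS) >= cutoffESS)
```

With equal weights the ESS equals the number of draws, while a single dominant weight drives it towards 1, which is the kind of situation a cutoff such as cutoffESS is meant to flag.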

The effects of leaving the default value NULL for some of the arguments above are as follows:

`temperLadder`	valid `temperLimits`, `ladderLen`, `scheme` and `schemeParam` are provided, which are used to construct the `temperLadder`.
`temperLimits`	a valid `temperLadder` is provided.
`levelsSaveSampFor`	`temperLadderLen`.

Gopi Goswami goswami@stat.harvard.edu

Gopi Goswami and Jun S. Liu (2007). On learning strategies for evolutionary Monte Carlo. Statistics and Computing 17:1:23-38.

Gopi Goswami, Jun S. Liu and Wing H. Wong (2007). Evolutionary Monte Carlo Methods for Clustering. Journal of Computational and Graphical Statistics, 16:4:855-876.

placeTempers, evolMonteCarloClustering

library(EMCC)  # provides findMaxTemper and the helper functions used below

## The following example is a simple stochastic optimization problem,
## and thus it does not require any "heating up", and hence the
## maximum temperature turns out to be the coldest one, i.e., 0.5.
adjMatSum <-
    function (xx)
{
    xx     <- as.integer(xx)
    adjMat <- outer(xx, xx, function (id1, id2) { id1 == id2 })
    sum(adjMat)
}
modeSensitive1 <-
    function (xx)
{
    with(partitionRep(xx),
     {
         rr   <- 1 + seq_along(clusterLabels)
         freq <- sapply(clusters, length)
         oo   <- order(freq, decreasing = TRUE)
         sum(sapply(clusters[oo], sum) * log(rr))
     })
}
entropy <-
    function (xx)
{
    yy <- table(as.vector(xx, mode = "numeric"))
    zz <- yy / length(xx)
    -sum(zz * log(zz))
}
maxProp <- 
    function (xx)
{
    yy <- table(as.vector(xx, mode = "numeric"))
    oo <- order(yy, decreasing = TRUE)
    yy[oo][1] / length(xx)
}
statsFuncList  <- list(adjMatSum, modeSensitive1, entropy, maxProp)
KMeansObj      <- KMeansFuncGenerator1(-97531)
maxTemperObj   <-
    with(KMeansObj,
     {
         temperLadder <- c(20, 10, 5, 1, 0.5)
         nLevels      <- length(temperLadder)
         sampDim      <- nrow(yy)
         startingVals <- sample(c(0, 1),
                                size    = nLevels * sampDim,
                                replace = TRUE)
         startingVals <- matrix(startingVals, nrow = nLevels, ncol = sampDim)
         findMaxTemper(nIters            = 50,
                       statsFuncList     = statsFuncList,
                       temperLadder      = temperLadder,
                       startingVals      = startingVals,
                       logTarDensFunc    = logTarDensFunc,
                       levelsSaveSampFor = seq_len(nLevels),
                       doFullAnal        = TRUE,
                       saveFitness       = TRUE,
                       verboseLevel      = 1)
     })
print(maxTemperObj)
print(names(maxTemperObj))
with(c(maxTemperObj, KMeansObj),
 {
     fitnessCol <- ncol(draws[ , , 1])     
     sub        <- paste('uniform prior on # of clusters: DU[',
                         priorMinClusters, ', ',
                         priorMaxClusters, ']', sep = '')
     for (ii in rev(seq_along(levelsSaveSampFor))) {
         main <- paste('EMCC (MAP) clustering (temper = ',
                       round(temperLadder[levelsSaveSampFor[ii]], 3), ')',
                       sep = '')
         MAPRow <- which.min(draws[ , fitnessCol, ii])
         clusterPlot(clusterInd        = draws[MAPRow, -fitnessCol, ii],
                     data              = yy,
                     main              = main,
                     sub               = sub,
                     knownClusterMeans = knownClusterMeans)
     }
 })

EMCC documentation built on May 29, 2017, 1:03 p.m.
