MoE_control: Set control values for use with MoEClust
In MoEClust: Gaussian Parsimonious Clustering Models with Covariates and a Noise Component

MoE_control

R Documentation

Description

Supplies a list of arguments (with defaults) for use with MoE_clust.

Usage

MoE_control(init.z = c("hc", "quantile", "kmeans", "mclust", 
                       "random.hard", "soft.random", "list"),
            noise.args = list(...),
            asMclust = FALSE,
            equalPro = FALSE,
            exp.init = list(...),
            algo = c("EM", "CEM", "cemEM"),
            criterion = c("bic", "icl", "aic"),
            stopping = c("aitken", "relative"),
            z.list = NULL, 
            nstarts = 1L,
            eps = .Machine$double.eps,
            tol = c(1e-05, sqrt(.Machine$double.eps), 1e-08),
            itmax = c(.Machine$integer.max, .Machine$integer.max, 1000L),
            hc.args = list(...),
            km.args = list(...),
            posidens = TRUE,
            init.crit = c("bic", "icl"),
            warn.it = 0L,
            MaxNWts = 1000L,
            verbose = interactive(),
            ...)

Arguments

`init.z`	The method used to initialise the cluster labels for the non-noise components. Defaults to `"hc"`, i.e. model-based agglomerative hierarchical clustering tree as per `hc`, for multivariate data (see `hc.args`), or `"quantile"`-based clustering as per `quant_clust` for univariate data (unless there are expert network covariates incorporated via `exp.init$joint` &/or `exp.init$clustMD`, in which case the default is again `"hc"`). The `"quantile"` option is thus only available for univariate data when expert network covariates are not incorporated via `exp.init$joint` &/or `exp.init$clustMD`, or when expert network covariates are not supplied. Other options include `"kmeans"` (see `km.args`), `"random.hard"` or `"soft.random"` initialisation (see `nstarts` below), where `init.z="soft.random"` is only available when `algo="EM"`, a user-supplied `"list"` (see `z.list` below), and a full run of `Mclust` (itself initialised via a model-based agglomerative hierarchical clustering tree, again see `hc.args`), although this last option `"mclust"` will be coerced to `"hc"` if there are no `gating` &/or `expert` covariates within `MoE_clust` (in order to better reproduce `Mclust` output). When `init.z="list"`, `exp.init$clustMD` is forced to `FALSE`; otherwise, when `exp.init$clustMD=TRUE` and the `clustMD` package is loaded, the `init.z` argument instead governs the method by which a call to `clustMD` is initialised. In this instance, `"quantile"` will instead default to `"hc"`, and the arguments to `hc.args` and `km.args` will be ignored (unless all `clustMD` model types fail for a given number of components). When `init.z="mclust"` or `clustMD` is successfully invoked (via `exp.init$clustMD`), the argument `init.crit` (see below) specifies the model-selection criterion (`"bic"` or `"icl"`) by which the optimal `Mclust` or `clustMD` model type to initialise with is determined, and `criterion` remains unaffected.
Finally, when the model includes expert network covariates and `exp.init$mahalanobis=TRUE`, the argument `exp.init$estart` (see below) can be used to modify the behaviour of `init.z="random.hard"` or `init.z="soft.random"` when `nstarts > 1`, toggling between a full run of the EM algorithm for each random initialisation (i.e. `exp.init$estart=FALSE`, the default), or a single run of the EM algorithm starting from the best initial partition obtained among the random starts according to the iterative reallocation initialisation routine (i.e. `exp.init$estart=TRUE`).
`noise.args`	A list supplying select named parameters to control inclusion of a noise component in the estimation of the mixture. If either or both of the arguments `tau0` &/or `noise.init` are supplied, a noise component is added to the model in the estimation.
  `tau0`	Prior mixing proportion for the noise component. If supplied, a noise component will be added to the model in the estimation, with `tau0` giving the prior probability of belonging to the noise component. Typically supplied as a scalar in the interval (0, 1), e.g. `0.1`, such that the same prior probability of belonging to the noise component applies to all observations. However, `tau0` can also be supplied as a vector (with length equal to the number of observations), which may be particularly useful when gating covariates are present and `noise.args$noise.gate` is `TRUE`. Finally, note that this argument can be supplied instead of or in conjunction with the argument `noise.init` below.
  `noise.init`	A logical or numeric vector indicating an initial guess as to which observations are noise in the data. If numeric, the entries should correspond to row indices of the data. If supplied, a noise component will be added to the model in the estimation. This argument can be used in conjunction with `tau0` above, or can be replaced by that argument also.
  `noise.gate`	A logical indicating whether gating network covariates influence the mixing proportion for the noise component, if any. Defaults to `TRUE`, but leads to greater parsimony if `FALSE`. Only relevant in the presence of a noise component; only affects estimation in the presence of gating covariates.
  `noise.meth`	The method used to estimate the volume when a noise component is invoked. Defaults to `hypvol`. For univariate data, this argument is ignored and the range of the data is used instead (unless `noise.vol` below is specified). The options `"convexhull"` and `"ellipsoidhull"` require loading the geometry and cluster packages, respectively. This argument is only relevant if `noise.vol` below is not supplied.
  `noise.vol`	This argument can be used to override the argument `noise.meth` by specifying the (hyper)volume directly, i.e. specifying an improper uniform density. This will override the use of the range of the response data for univariate data if supplied. Note that the (hyper)volume, rather than its inverse, is supplied here. This can affect prediction and the location of the MVN ellipses for `MoE_gpairs` plots (see `noise_vol`).
  `equalNoise`	Logical which is only invoked when `equalPro=TRUE` and gating covariates are not supplied. Under the default setting (`FALSE`), the mixing proportion for the noise component is estimated, and remaining mixing proportions are equal; when `TRUE` all components, including the noise component, have equal mixing proportions.
  `discard.noise`	A logical governing how the means are summarised in `parameters$mean` and by extension the location of the MVN ellipses in `MoE_gpairs` plots for models with both expert network covariates and a noise component (otherwise this argument is irrelevant). The means for models with expert network covariates are summarised by the posterior mean of the fitted values. By default (`FALSE`), the mean of the noise component is accounted for in the posterior mean. Otherwise, or when the mean of the noise component is unavailable (due to the volume having been manually supplied via `noise.args$noise.vol`), the `z` matrix is renormalised after discarding the column corresponding to the noise component prior to computation of the posterior mean. The renormalisation approach can be forced by specifying `noise.args$discard.noise=TRUE`, even when the mean of the noise component is available. For models with a noise component fitted with `algo="CEM"`, a small extra E-step is conducted for observations assigned to the non-noise components in this case.
In particular, the argument `noise.meth` will be ignored for high-dimensional `n <= d` data, in which case the argument `noise.vol` must be specified. Note that this forces `noise.args$discard.noise` to `TRUE`. See `noise_vol` for more details.
The arguments `tau0` and `noise.init` can be used separately, to provide alternative means to invoke a noise component. However, they can also be supplied together, in which case observations corresponding to `noise.init` have probability `tau0` (rather than 1) of belonging to the noise component. This strategy also works when `tau0` is supplied as a vector.
`asMclust`	The default values of `stopping` and `hc.args$hcUse` (see below) are such that results for models with no covariates in either network are liable to differ from results for equivalent models obtained via `Mclust`. MoEClust uses `stopping="aitken"` and `hcUse="VARS"` by default, while mclust always implicitly uses `stopping="relative"` and defaults to `hcUse="SVD"`. `asMclust` is a logical variable (`FALSE`, by default) which functions as a simple convenience tool for overriding only these two arguments (even if explicitly supplied!) such that they behave like the function `Mclust`. Other user-specified arguments which differ from mclust are not affected by `asMclust`, as their defaults already correspond to mclust. Results may still differ slightly as MoEClust calculates log-likelihood values with greater precision and may also differ slightly in other numerical aspects which affect parameter estimation or convergence in some cases. Finally, note that `asMclust=TRUE` can be invoked even for models with covariates which are not accommodated by mclust.
`equalPro`	Logical variable indicating whether or not the mixing proportions are to be constrained to be equal in the model. Default: `equalPro = FALSE`. Only relevant when `gating` covariates are not supplied within `MoE_clust`, otherwise ignored. In the presence of a noise component (see `noise.args`), only the mixing proportions for the non-noise components are constrained to be equal (by default, see `equalNoise`), after accounting for the noise component.
`exp.init`	A list supplying select named parameters to control the initialisation routine in the presence of expert network covariates (otherwise ignored):
  `joint`	A logical indicating whether the initial partition is obtained on the joint distribution of the responses & expert network covariates (defaults to `TRUE`) or just the responses (`FALSE`). By default, only continuous expert covariates are considered (see `exp.init$clustMD` below). Only relevant when `init.z` is neither `"random.hard"` nor `"soft.random"` (unless `exp.init$clustMD=TRUE`, in which case `init.z` specifies the initialisation routine for a call to `clustMD`, where `init.z="random.hard"` & `init.z="soft.random"` both map to the `clustMD` argument `startCL="random"`). This will render the `"quantile"` option of `init.z` for univariate data unusable if continuous expert covariates are supplied &/or categorical/ordinal expert covariates are supplied when `exp.init$clustMD=TRUE` and the `clustMD` package is loaded.
  `mahalanobis`	A logical indicating whether to iteratively reallocate each observation during the initialisation phase to the component whose expert network regression yields the fitted values closest to that observation in terms of Mahalanobis distance (defaults to `TRUE`). This will ensure that each component can be well modelled by a single expert prior to running the EM/CEM algorithm.
  `estart`	A logical governing the behaviour of `init.z="random.hard"` or `init.z="soft.random"` when `nstarts > 1` in the presence of expert network covariates. Only relevant when `exp.init$mahalanobis=TRUE`. Defaults to `FALSE`; i.e. all random starts are put through full runs of the EM algorithm. When `TRUE`, all random starts are put through the initial iterative reallocation routine prior to a full run of EM for only the single best random initial partition obtained. See the last set of Examples below.
  `clustMD`	A logical indicating whether categorical/ordinal covariates should be incorporated when using the joint distribution of the response and expert network covariates for initialisation (defaults to `FALSE`). Only relevant when `exp.init$joint=TRUE`. Requires the use of the `clustMD` package. Note that initialising in this manner involves fitting all `clustMD` model types in parallel for all numbers of components considered, and may fail (especially) in the presence of nominal expert network covariates. Unless `init.z="list"`, supplying this argument as `TRUE` when the `clustMD` package is loaded has the effect of superseding the `init.z` argument: `init.z` now governs instead how the call to `clustMD` is initialised (unless all `clustMD` model types fail for a given number of components, in which case `init.z` is invoked instead to initialise for `G` values for which all `clustMD` model types failed). Similarly, the arguments `hc.args` and `km.args` will be ignored (again, unless all `clustMD` model types fail for a given number of components).
  `max.init`	The maximum number of iterations for the Mahalanobis distance-based reallocation procedure when `exp.init$mahalanobis` is `TRUE`. Defaults to `.Machine$integer.max`.
  `identity`	A logical indicating whether the identity matrix (corresponding to the use of the Euclidean distance) is used in place of the covariance matrix of the residuals (corresponding to the use of the Mahalanobis distance). Defaults to `FALSE` for multivariate response data but defaults to `TRUE` for univariate response data. Setting `identity=TRUE` with multivariate data may be advisable when the dimensions of the data are such that the covariance matrix cannot be inverted (otherwise, the pseudo-inverse is used under the `FALSE` default).
  `drop.break`	When `exp.init$mahalanobis=TRUE`, observations will be completely in or out of a component during the initialisation phase. As such, it may occur that constant columns will be present when building a given component's expert regression (particularly for categorical covariates). It may also occur, due to this partitioning, that "unseen" data, when calculating the residuals, will have new factor levels. When `exp.init$drop.break=TRUE`, the Mahalanobis distance based initialisation phase will explicitly fail in either of these scenarios. Otherwise, `drop_constants` and `drop_levels` will be invoked when `exp.init$drop.break` is `FALSE` (the default) to try to remedy the situation. In any case, only a warning that the initialisation step failed will be printed, regardless of the value of `exp.init$drop.break`.
`algo`	Switch controlling whether models are fit using the `"EM"` (the default) or `"CEM"` algorithm. The option `"cemEM"` allows running the EM algorithm starting from convergence of the CEM algorithm.
`criterion`	When either `G` or `modelNames` is a vector, `criterion` determines whether the `"bic"` (Bayesian Information Criterion), `"icl"` (Integrated Complete Likelihood), or `"aic"` (Akaike Information Criterion) is used to determine the ‘best’ model when gathering output. Note that all criteria will be returned in any case.
`stopping`	The criterion used to assess convergence of the EM/CEM algorithm. The default (`"aitken"`) uses Aitken's acceleration method via `aitken`, otherwise the `"relative"` change in log-likelihood is monitored (which may be less strict). The `"relative"` option corresponds to the stopping criterion used by `Mclust`: see `asMclust` above. Both stopping rules are ultimately governed by `tol[1]`. When the `"aitken"` method is employed, the asymptotic estimate of the final converged maximised log-likelihood is also returned as `linf` for models with 2 or more components, though the largest element of the returned vector `loglik` still gives the log-likelihood value achieved by the parameters returned at convergence, under both `stopping` methods (see `MoE_clust`).
`z.list`	A user-supplied list of initial cluster allocation matrices, with number of rows given by the number of observations, and numbers of columns given by the range of component numbers being considered. In particular, `z.list` must only include columns corresponding to non-noise components when using this method to initialise in the presence of a noise component. Only relevant if `init.z="list"`. These matrices are allowed to correspond to either soft or hard clusterings, and will be internally normalised so that the rows sum to 1 (or coerced always to a 'hard' matrix if `algo != "EM"`). See `noise.init` and `tau0` above for details on initialisation in the presence of a noise component.
`nstarts`	The number of random initialisations to use when `init.z="random.hard"` or `init.z="soft.random"`. Defaults to `1`. When there are no expert covariates (or when `exp.init$mahalanobis=FALSE` or `exp.init$estart=FALSE`), the results will be based on the random start yielding the highest estimated log-likelihood after each initial partition is subjected to a full run of the EM algorithm. Note, in this case, that all `nstarts` random initialisations are affected by `exp.init$mahalanobis`, if invoked in the presence of expert network covariates, which may remove some of the randomness. Conversely, if `exp.init$mahalanobis=TRUE` and `exp.init$estart=TRUE`, all `nstarts` random starts are put through the initial iterative reallocation routine and only the single best initial partition uncovered is put through the full run of the EM algorithm. See `init.z` and `exp.init$estart` above for more details, though note that `exp.init$mahalanobis=TRUE` and `exp.init$estart=FALSE`, by default.
`eps`	A scalar tolerance associated with deciding when to terminate computations due to computational singularity in covariances. Smaller values of `eps` allow computations to proceed nearer to singularity. The default is the relative machine precision `.Machine$double.eps`, which is approximately 2e-16 on IEEE-compliant machines.
`tol`	A vector of length three giving relative convergence tolerances for 1) the log-likelihood of the EM/CEM algorithm, 2) parameter convergence in the inner loop for models with iterative M-step (`"VEI", "VEE", "EVE", "VVE", "VEV"`), and 3) optimisation in the multinomial logistic regression in the gating network, respectively. The default is `c(1e-05, sqrt(.Machine$double.eps), 1e-08)`. If only one number is supplied, it is used as the tolerance for all three cases given.
`itmax`	A vector of length three giving integer limits on the number of iterations for 1) the EM/CEM algorithm, 2) the inner loop for models with iterative M-step (`"VEI", "VEE", "EVE", "VVE", "VEV"`), and 3) the multinomial logistic regression in the gating network, respectively. The default is `c(.Machine$integer.max, .Machine$integer.max, 1000L)`, allowing termination to be completely governed by `tol[1]` & `tol[2]` for the outer and inner loops of the EM/CEM algorithm, respectively. If only one number is supplied, it is used as the iteration limit for the outer loop only and the other elements of `itmax` retain their usual defaults. If, for any model with gating covariates, the multinomial logistic regression in the gating network fails to converge in `itmax[3]` iterations at any stage of the EM/CEM algorithm, an appropriate warning will be printed, prompting the user to modify this argument.
`hc.args`	A list supplying select named parameters to control the initialisation of the cluster allocations when `init.z="hc"` (or when `init.z="mclust"`, which itself relies on `hc`), unless `exp.init$clustMD=TRUE`, the `clustMD` package is loaded, and none of the `clustMD` model types fail (otherwise irrelevant): `hcUse` A string specifying the type of input variables to be used. This defaults to `"VARS"` here, unlike mclust which defaults to `"SVD"`. Other allowable values are documented in `mclust.options`. See `asMclust` above. `hc.meth` A character string indicating the model to be used when hierarchical clustering (see `hc`) is employed for initialisation (either when `init.z="hc"` or `init.z="mclust"`). Defaults to `"EII"` for high-dimensional data, or `"VVV"` otherwise.
`km.args`	A list supplying select named parameters to control the initialisation of the cluster allocations when `init.z="kmeans"`, unless `exp.init$clustMD=TRUE`, the `clustMD` package is loaded, and none of the `clustMD` model types fail (otherwise irrelevant): `kstarts` The number of random initialisations to use. Defaults to `10`. `kiters` The maximum number of K-Means iterations allowed. Defaults to `10`.
`posidens`	A logical governing whether to continue running the algorithm even in the presence of positive log-densities. Defaults to `TRUE`, but setting `posidens=FALSE` can help to safeguard against spurious solutions, which will be instantly terminated if positive log-densities are encountered. Note that versions of MoEClust prior to and including version `1.3.1` always implicitly assumed `posidens=FALSE`.
`init.crit`	The criterion to be used to determine the optimal model type to initialise with, when `init.z="mclust"` or when `exp.init$clustMD=TRUE` and the `clustMD` package is loaded (one of `"bic"` or `"icl"`). Defaults to `"icl"` when `criterion="icl"`, otherwise defaults to `"bic"`. The `criterion` argument remains unaffected.
`warn.it`	A single number giving the iteration count at which a warning will be printed if the EM/CEM algorithm has failed to converge. Defaults to `0`, i.e. no warning (which is true for any `warn.it` value less than `3`), otherwise the message is printed regardless of the value of `verbose`. If non-zero, `warn.it` should be moderately large, but obviously less than `itmax[1]`. A warning will always be printed if one or more models fail to converge in `itmax[1]` iterations.
`MaxNWts`	The maximum allowable number of weights in the call to `multinom` for the multinomial logistic regression in the gating network. There is no intrinsic limit in the code, but increasing `MaxNWts` will probably allow fits that are very slow and time-consuming. It may be necessary to increase `MaxNWts` when categorical concomitant variables with many levels are included or the number of components is high.
`verbose`	Logical indicating whether to print messages pertaining to progress to the screen during fitting. Defaults to `TRUE` if the session is interactive, and `FALSE` otherwise. If `FALSE`, warnings and error messages will still be printed to the screen, but everything else will be suppressed.
`...`	Catches unused arguments.
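
To make the difference between the two `stopping` rules described above more concrete, the following base-R sketch contrasts one common form of Aitken's acceleration criterion with one common form of the relative-change criterion. The helpers `aitken_sketch` and `rel_change` are hypothetical names introduced purely for illustration; they are not the package's `aitken` function, and the exact formulas used internally by MoEClust and mclust may differ.

```r
# Hypothetical helper (NOT the package's aitken() function).
# ll: three successive log-likelihood values l^(k-1), l^(k), l^(k+1)
aitken_sketch <- function(ll) {
  stopifnot(length(ll) == 3L)
  d1   <- ll[2L] - ll[1L]                 # previous increment
  d2   <- ll[3L] - ll[2L]                 # latest increment
  a    <- if (d1 > 0) d2 / d1 else 0      # estimated linear rate of convergence
  linf <- if (a < 1) ll[2L] + d2 / (1 - a) else Inf  # asymptotic estimate
  list(a = a, linf = linf, diff = abs(linf - ll[3L]))
}

# Hypothetical helper: one common form of the relative change in log-likelihood
rel_change <- function(ll) abs(ll[3L] - ll[2L]) / (1 + abs(ll[3L]))

tol     <- 1e-05                          # same value as the default tol[1]
ll_slow <- -100 - 0.999^(0:2)             # slowly converging towards -100
fit     <- aitken_sketch(ll_slow)

# The relative rule would stop here, the Aitken-based rule would not
c(relative_converged = rel_change(ll_slow) < tol,
  aitken_converged   = fit$diff < tol)
fit$linf                                  # estimate of the limiting log-likelihood
```

In this constructed sequence the increments are tiny while the log-likelihood is still far from its limit, which is the kind of situation in which a relative-change rule may be less strict than an Aitken-based one.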

Details

MoE_control is provided for assigning values and defaults within MoE_clust and MoE_stepwise.

While the criterion argument controls the choice of the optimal number of components and GPCM/mclust model type, MoE_compare is provided for choosing between fits with different combinations of covariates or different initialisation settings.

Value

A named list in which the names are the names of the arguments and the values are the values supplied to the arguments.
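
As a quick way to inspect the returned object, the sketch below calls `MoE_control` with a few of the arguments documented above and prints the names of the resulting list. It assumes the MoEClust package is installed and is guarded so that it does nothing otherwise; no particular output is shown because the exact contents of the list are not asserted here.

```r
# Guarded so the snippet is a no-op when MoEClust is not installed
if (requireNamespace("MoEClust", quietly = TRUE)) {
  ctrl <- MoEClust::MoE_control(criterion = "icl", init.z = "random.hard",
                                nstarts = 5)
  print(is.list(ctrl))   # the documented return value is a named list
  print(names(ctrl))     # names of the entries in the returned list
}
```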

Note

Note that successfully invoking exp.init$clustMD (though it defaults to FALSE) affects the role of the arguments init.z, hc.args, and km.args. Please read the documentation above carefully in this instance.

The initial allocation matrices before and after the invocation of the exp.init related arguments are both stored as attributes in the object returned by MoE_clust (named "Z.init" and "Exp.init", respectively). If init.z="random.hard" or init.z="soft.random" and nstarts > 1, the allocations corresponding to the best random start are stored (regardless of whether exp.init$estart is invoked or not). This can be useful for supplying z.list for future fits.
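
For readers preparing their own `z.list`, the following base-R sketch illustrates what row normalisation and coercion to a 'hard' matrix amount to for a list of allocation matrices, one per number of components. The object names (`make_z`, `z_list`, `z_hard`) are hypothetical, the weights are simulated, and the snippet only mimics the behaviour described for `z.list` above; it is not the package's internal code.

```r
set.seed(1)
n <- 6L                                   # number of observations
G <- 2:3                                  # range of component numbers considered

# Hypothetical helper: a soft allocation matrix whose rows sum to 1
make_z <- function(n, g) {
  w <- matrix(runif(n * g), nrow = n, ncol = g)  # arbitrary non-negative weights
  w / rowSums(w)                                 # divide each row by its sum
}
z_list <- lapply(G, function(g) make_z(n, g))    # one n x g matrix per g in G

# Coerce each matrix to a 'hard' (one-hot) allocation via the row-wise maximum
z_hard <- lapply(z_list, function(z) {
  labs <- max.col(z, ties.method = "first")
  diag(ncol(z))[labs, , drop = FALSE]
})

sapply(z_list, dim)                       # rows = observations, cols = components
sapply(z_hard, function(z) all(rowSums(z) == 1))
```

A list such as `z_list` is the kind of object that could then be passed as `z.list` together with `init.z="list"` and the matching `G` values.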

Author(s)

Keefe Murphy - <keefe.murphy@mu.ie>

Examples

ctrl1 <- MoE_control(criterion="icl", itmax=100, warn.it=15, init.z="random.hard", nstarts=5)

data(CO2data)
GNP   <- CO2data$GNP
res   <- MoE_clust(CO2data$CO2, G=2, expert = ~ GNP, control=ctrl1)

# Alternatively, specify control arguments directly
res2  <- MoE_clust(CO2data$CO2, G=2, expert = ~ GNP, stopping="relative")

# Supplying ctrl1 without naming it as 'control' can throw an error
## Not run: 
res3  <- MoE_clust(CO2data$CO2, G=2, expert = ~ GNP, ctrl1)
## End(Not run)

# Similarly, supplying control arguments via a mix of the ... construct
# and the named argument 'control' also throws an error
## Not run: 
res4  <- MoE_clust(CO2data$CO2, G=2, expert = ~ GNP, control=ctrl1, init.z="kmeans")
## End(Not run)

# Initialise via the mixed-type joint distribution of response & covariates
# Let the ICL criterion determine the optimal clustMD model type
# Constrain the mixing proportions to be equal
ctrl2 <- MoE_control(exp.init=list(clustMD=TRUE), init.crit="icl", equalPro=TRUE)
data(ais)
library(clustMD)
res4  <- MoE_clust(ais[,3:7], G=2, modelNames="EVE", expert= ~ sex,
                   network.data=ais, control=ctrl2)

# Include a noise component by specifying its prior mixing proportion
res5  <- MoE_clust(ais[,3:7], G=2, modelNames="EVE", expert= ~ sex,
                   network.data=ais, tau0=0.1)
                   
# Include a noise component via an initial guess of which observations are noise
mdist <- mahalanobis(ais[,3:7], colMeans(ais[,3:7]), cov(ais[,3:7]))
cutp  <- qchisq(p=0.95, df=ncol(ais[,3:7]))
res6  <- MoE_clust(ais[,3:7], G=2, modelNames="EVE", expert= ~ sex,
                   network.data=ais, noise.init=mdist > cutp)
                   
# Include a noise component by specifying tau0 as a vector
res7  <- MoE_clust(ais[,3:7], G=2, modelNames="EVE", expert= ~ sex,
                   network.data=ais, tau0=pchisq(mdist, df=ncol(ais[,3:7])))                                    
                   
# Investigate the use of random starts
sex  <- ais$sex
# resA uses deterministic starting values (by default) for each G value
 system.time(resA <- MoE_clust(ais[,3:7], G=2, expert=~sex, equalPro=TRUE))
# resB passes each random start through the entire EM algorithm for each G value
 system.time(resB <- MoE_clust(ais[,3:7], G=2, expert=~sex, equalPro=TRUE,
                               init.z="random.hard", nstarts=10))
# resC passes only the "best" random start through the EM algorithm for each G value
# this time, we also use soft rather than hard random starts
 system.time(resC <- MoE_clust(ais[,3:7], G=2, expert=~sex, equalPro=TRUE,
                               init.z="soft.random", nstarts=10, estart=TRUE))
# Here, all three settings (given here in order of speed) identify & converge to the same model
 MoE_compare(resA, resC, resB)
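
The `noise.init` and vector-valued `tau0` inputs used for `res6` and `res7` above can be explored without fitting any model. The sketch below rebuilds them in base R on simulated data (an assumption made so that the snippet does not depend on the `ais` data set) and shows the equivalent numeric form of `noise.init` based on row indices. The object names `X`, `noise_init`, `noise_idx`, and `tau0_vec` are illustrative only.

```r
set.seed(42)
# Simulated data: 100 Gaussian points plus 10 uniformly scattered points
X <- rbind(matrix(rnorm(200), ncol = 2),
           matrix(runif(20, min = -8, max = 8), ncol = 2))

# Squared Mahalanobis distances from the overall mean
mdist <- mahalanobis(X, colMeans(X), cov(X))
cutp  <- qchisq(p = 0.95, df = ncol(X))

noise_init <- mdist > cutp            # logical form: one entry per observation
noise_idx  <- which(noise_init)       # numeric form: row indices of the data
tau0_vec   <- pchisq(mdist, df = ncol(X))  # one prior probability per observation

length(noise_init) == nrow(X)
range(tau0_vec)
```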

MoEClust documentation built on April 3, 2025, 11:07 p.m.