stops: High Level STOPS Function

View source: R/stops.R

stopsR Documentation

High Level STOPS Function

Description

This allows to fit STOPS models as described in Rusch, Mair, Hornik (2023).

Usage

stops(
  dis,
  loss = "stress",
  theta = 1,
  type = "ratio",
  structures,
  ndim = 2,
  weightmat = NULL,
  init = NULL,
  stressweight = 1,
  strucweight,
  strucpars,
  optimmethod = c("SANN", "ALJ", "pso", "Kriging", "tgp", "DIRECT", "stogo", "cobyla",
    "crs2lm", "isres", "mlsl", "neldermead", "sbplx", "hjk", "cmaes"),
  lower,
  upper,
  verbose = 0,
  stoptype = c("additive", "multiplicative"),
  initpoints = 10,
  itmax = 50,
  itmaxps = 10000,
  model,
  control,
  ...
)

Arguments

dis

numeric matrix or dist object of a matrix of proximities

loss

which loss function to be used for fitting, defaults to stress.

theta

hyperparameter vector starting values for the transformation functions. If the length is smaller than the number of hyperparameters for the MDS version the vector gets recycled (see the corresponding stop_XXX function or the vignette for how theta must look like exactly for each loss). If larger than the number of hyperparameters for the MDS method, an error is thrown. If completely missing theta is set to 1 and recycled.

type

type of MDS optimal scaling (implicit transformation). One of "ratio", "interval" or "ordinal". Default is "ratio". Not every type can be used with every loss, only ratio works with all.

structures

character vector of which c-structuredness indices should be considered; if missing no structure is considered.

ndim

number of dimensions of the target space

weightmat

(optional) a matrix of nonnegative weights; defaults to 1 for all off diagonals

init

(optional) initial configuration

stressweight

weight to be used for the fit measure; defaults to 1

strucweight

vector of weights to be used for the c-structuredness indices (in the same order as in structures); defaults to -1/length(structures) for each index

strucpars

(possibly named with the structure). Metaparameters for the structuredness indices (gamma in the article). It's safest for it be a list of lists with the named arguments for the structuredness indices and the order of the lists must be like the order of structures. So something like this list(list(par1Struc1=par1Struc1,par2Struc1=par2Struc1),list(par1Struc2=par1Struc2,par2Struc2=par2Struc2),...) where parYStrucX are the named arguments for the metaparameter Y of the structure X the list elements corresponds to. For a structure without parameters, set NULL. Parameters in different list elements parYStrucX can have the same name. For example, say we want to use cclusteredness with metaparameters epsilon=10 and k=4 (and the default for the other parameters), cdependence with no metaparameters and cfaithfulness with metaparameter k=7 one would list(list(epsilon=10,k=4),list(NULL),list(dis=obdiss,k=6)) for structures vector ("cclusteredness","cdependence","cfaithfulness"). The parameter lists must be in the same ordering as the indices in structures. If missing it is set to NULL and defaults are used. It is also possible to supply a structure's metaparameters as a list of vectors with named elements if the metaparameters are scalars, so like list(c(par1Struc1=parStruc1,par2Struc1=par1Struc1,...),c(par1Struc2=par1Struc2,par2Struc2=par2Struc2,...)). That can have unintended consequences if the metaparameter is a vector or matrix.

optimmethod

What solver to use. Currently supported are Bayesian optimization with Gaussian Process priors and Kriging ("Kriging"), Bayesian optimization with treed Gaussian processes with jump to linear models ("tgp"), Adaptive LJ Search ("ALJ"), Particle Swarm optimization ("pso"), simulated annealing ("SANN"), "DIRECT", Stochastic Global Optimization ("stogo"), COBYLA ("cobyla"), Controlled Random Search 2 with local mutation ("crs2lm"), Improved Stochastic Ranking Evolution Strategy ("isres"), Multi-Level Single-Linkage ("mlsl"), Nelder-Mead ("neldermead"), Subplex ("sbplx"), Hooke-Jeeves Pattern Search ("hjk"), CMA-ES ("cmaes"). Defaults to "ALJ" version. tgp, ALJ, Kriging and pso usually work well for relatively low values of itmax.

lower

The lower contraints of the search region. Needs to be a numeric vector of the same length as the parameter vector theta.

upper

The upper contraints of the search region. Needs to be a numeric vector of the same length as the parameter vector theta.

verbose

numeric value hat prints information on the fitting process; >2 is very verbose.

stoptype

which aggregation for the multi objective target function? Either 'additive' (default) or 'multiplicative'

initpoints

number of initial points to fit the surrogate model for Bayesian optimization; default is 10.

itmax

maximum number of iterations of the outer optimization (for theta) or number of steps of Bayesian optimization; default is 50. We recommend a higher number for ALJ (around 150). Note that due to the inner workings of some solvers, this may or may not correspond to the actual number of function evaluations performed (or PS models fitted). E.g., with tgp the actual number of function evaluation of the PS method is between itmax and 6*itmax as tgp samples 1-6 candidates from the posterior and uses the best candidate. For pso it is the number of particles s times itmax. For cmaes it is usually a bit higher than itmax. This currently may get overruled by a control argument if it is used (and then set to either ewhat is supplie dby control or to the default of the method).

itmaxps

maximum number of iterations of the inner optimization (to obtain the PS configuration)

model

a character specifying the surrogate model to use. For Kriging it specifies the covariance kernel for the GP prior; see covTensorProduct-class defaults to "powerexp". For tgp it specifies the non stationary process used see bgp, defaults to "btgpllm"

control

a control argument passed to the outer optimization procedure. Will override any other control arguents passed, especially verbose and itmax. For the efect of control, see the functions pomp::sannbox for SANN and pso::psoptim for pso, cmaes::cma_es for cmaes, dfoptim::hjkb for hjk and the nloptr docs for the algorithms DIRECT, stogo, cobyla, crs2lm, isres, mlsl, neldermead, sbplx.

...

additional arguments passed to the outer optimization procedures (not fully tested).

Details

The combination of c-structurednes indices and stress uses the stress.m values, which are the explictly normalized stresses. Reported however is the stress-1 value which is sqrt(stress.m).

Value

A list with the components

  • stoploss: the stoploss value

  • optim: the object returned from the optimization procedure

  • stressweight: the stressweight

  • strucweight: the vector of structure weights

  • call: the call

  • optimmethod: The solver selected

  • loss: The PS badness-of-fit function

  • nobj: the number of objects in the configuration

  • type: The type of stoploss scalacrisation (additive or multiplicative)

  • fit: The fitted PS object (most importantly $fit$conf the fitted configuration)

  • stoptype: Type of stoploss combinatio

Examples

data(kinshipdelta,package="smacof")
strucpar<-list(NULL,NULL) #parameters for indices
res1<-stops(kinshipdelta,loss="stress",
structures=c("cclumpiness","cassociation"),strucpars=strucpar,
lower=0,upper=10,itmax=10)
res1


#use higher itmax in general, we use 5 just to shorten the tests
data(BankingCrisesDistances)
strucpar<-list(c(epsilon=10,minpts=2),NULL) #parameters for indices
res1<-stops(BankingCrisesDistances[,1:69],loss="stress",verbose=0,
structures=c("cclusteredness","clinearity"),strucpars=strucpar,
lower=0,upper=10,itmax=5)
res1

strucpar<-list(list(alpha=0.6,C=15,var.thr=1e-5,zeta=NULL),
list(alpha=0.6,C=15,var.thr=1e-5,zeta=NULL))
res1<-stops(BankingCrisesDistances[,1:69],loss="stress",verbose=0,
structures=c("cfunctionality","ccomplexity"),strucpars=strucpar,
lower=0,upper=10,itmax=5)
res1



stops documentation built on Dec. 12, 2023, 3:02 a.m.