copstressMin: Fitting a COPS-C Model (COPS Variant 1).

copstressMin R Documentation

Description

Minimizing Copstress to obtain a clustered MDS configuration with given hyperparameters theta.

Usage

copstressMin(
  delta,
  kappa = 1,
  lambda = 1,
  nu = 1,
  theta = c(kappa, lambda, nu),
  type = c("ratio", "interval", "ordinal"),
  ties = "primary",
  weightmat = 1 - diag(nrow(delta)),
  ndim = 2,
  init = NULL,
  stressweight = 0.975,
  cordweight = 0.025,
  q = 1,
  minpts = ndim + 1,
  epsilon = 10,
  dmax = NULL,
  rang,
  optimmethod = c("NelderMead", "Newuoa", "BFGS", "SANN", "hjk", "solnl", "solnp",
    "subplex", "snomadr", "hjk-Newuoa", "hjk-BFGS", "BFGS-hjk", "Newuoa-hjk", "cmaes",
    "direct", "direct-Newuoa", "direct-BFGS", "genoud", "gensa"),
  verbose = 0,
  scale = c("sd", "rmsq", "std", "proc", "none"),
  normed = TRUE,
  accuracy = 1e-07,
  itmax = 5000,
  stresstype = c("stress-1", "stress"),
  ...
)

Arguments

delta

numeric matrix or dist object of proximities

kappa

power transformation for fitted distances

lambda

power transformation for proximities

nu

power transformation for weights

theta

the theta vector of powers; the first is kappa (for the fitted distances, if it exists), the second is lambda (for the observed proximities, if it exists), the third is nu (for the weights, if it exists). If fewer than three elements are given as argument, the vector is recycled. Defaults to 1 1 1. Overrides any kappa, lambda, nu parameters if they are given and do not match.

type

what type of MDS to fit. Currently one of "ratio", "interval" or "ordinal". Default is "ratio".

ties

the handling of ties for ordinal (nonmetric) MDS. Possible values are "primary" (default), "secondary" or "tertiary".

weightmat

(optional) a matrix of nonnegative weights; defaults to 1 for all off diagonals

ndim

number of dimensions of the target space

init

(optional) initial configuration

stressweight

weight to be used for the fit measure; defaults to 0.975

cordweight

weight to be used for the cordillera; defaults to 0.025

q

the norm of the cordillera; defaults to 1

minpts

the minimum number of points to make up a cluster in OPTICS; defaults to ndim+1

epsilon

the epsilon parameter of OPTICS, the neighbourhood that is checked; defaults to 10

dmax

The winsorization limit of reachability distances in the OPTICS Cordillera. If supplied, it should be either a numeric value that matches max(rang) or NULL; if NULL, it is found as 1.5 times (for kappa > 1) or 1 times (for kappa <= 1) the maximum reachability value of the power Torgerson model with the same lambda. If dmax and rang are supplied and dmax is not max(rang), a warning is given and rang takes precedence.

rang

range of the reachabilities to be considered. If missing, it is found from the initial configuration by taking 0 as the lower boundary and dmax (see above) as the upper boundary. See also cordillera.

optimmethod

Which optimizer to use. Choose one of 'Newuoa' (from package minqa), 'NelderMead', 'hjk' (Hooke-Jeeves algorithm from dfoptim), 'solnl' (from nlcOptim), 'solnp' (from Rsolnp), 'subplex' (from subplex), 'SANN' (simulated annealing), 'BFGS', 'snomadr' (from crs), 'genoud' (from rgenoud), 'gensa' (from GenSA), 'cmaes' (from cmaes) and 'direct' (from nloptr). See the respective R packages for details on these solvers. There are also combinations that proved to work well: 'hjk-Newuoa', 'hjk-BFGS', 'BFGS-hjk', 'Newuoa-hjk', 'direct-Newuoa' and 'direct-BFGS'. Usually 'hjk', 'BFGS', 'Newuoa', 'subplex' and 'solnl' work rather well in an acceptable time frame (depending on the smoothness of copstress). Default is 'hjk-Newuoa'.

verbose

numeric value that controls how much information on the fitting process is printed; >2 is very verbose

scale

Scaling applied to the configuration for the OC (the scaled configuration is also returned as $conf). One of "none" (no scaling), "sd" (configuration divided by the highest standard deviation of the columns), "std" (standardize all columns; note that this does not preserve the relative distances of the optimal configuration), "proc" (Procrustes adjustment to the initial fit) and "rmsq" (configuration divided by the maximum root mean square of the columns). Default is "sd".

normed

should the cordillera be normed; defaults to TRUE

accuracy

numerical accuracy, defaults to 1e-7

itmax

maximum number of iterations. Defaults to 5000. If itmax is (too) small, some optimizers will print warnings. For example, for optimizers using NEWUOA, an iteration number of 10*length(par)^2 is recommended. The number of parameters to optimize over for the COPS problem is number of objects * target space dimensions and can grow large very quickly, so being able to live with these warnings is probably a good idea.

stresstype

which stress to use in the copstress. Defaults to stress-1. If anything else is set, explicitly normed stress is used, which is (stress-1)^2. Using stress-1 puts more weight on MDS fit.

...

additional arguments to be passed to the optimization procedure
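
Some of the defaults described above can be reproduced in base R. The following is a minimal sketch, not code from the package: the toy matrix delta is made up for illustration, and the recycling of theta is assumed to follow standard R recycling via rep().

```r
# Toy dissimilarity matrix for 5 objects (illustrative data, not from the package)
set.seed(1)
delta <- as.matrix(dist(matrix(rnorm(10), ncol = 2)))

# A theta with fewer than three elements, recycled to length 3
# (assumption: standard R recycling, as the documentation suggests)
theta <- rep(c(1, 2), length.out = 3)
names(theta) <- c("kappa", "lambda", "nu")
theta

# Default weight matrix: 1 for all off-diagonal cells, 0 on the diagonal
weightmat <- 1 - diag(nrow(delta))

# Number of parameters = number of objects * target space dimensions
ndim <- 2
npar <- nrow(delta) * ndim

# Iteration count recommended for NEWUOA-based optimizers: 10 * length(par)^2
itmax_newuoa <- 10 * npar^2
itmax_newuoa
```

Even for this small problem the recommended iteration count is already in the thousands, which is why it can exceed the default itmax for realistic numbers of objects.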

Value

A list with the components

  • delta: the original transformed dissimilarities

  • obsdiss: the explicitly normed transformed dissimilarities (which are approximated by the fit)

  • confdist: the fitted distances

  • conf: the configuration to which the scaling of argument scale was applied

  • confo: the unscaled but explicitly normed configuration returned from the fitting procedure. Scaling applied to confo gives conf.

  • par, pars: the theta vector of power transformations (kappa, lambda, nu)

  • niter: number of iterations of the optimizer.

  • stress: the square root of explicitly normalized stress (calculated for confo).

  • spp: stress per point

  • ndim: number of dimensions

  • model: Fitted model name with optimizer

  • call: the call

  • nobj: the number of objects

  • type, loss, losstype: stresstype

  • stress.m: The stress used for copstress. If stresstype="stress-1" this is the same as $stress; otherwise it is stress^2

  • stress.en: another way to calculate the stress

  • deltaorig: the original untransformed dissimilarities

  • copstress: the copstress loss value

  • resmat: the matrix of residuals

  • weightmat: the matrix of untransformed weights

  • OC: the (normed) OPTICS Cordillera object (calculated for scaled conf)

  • OCv: the (normed) OPTICS Cordillera value alone (calculated for scaled conf)

  • optim: the object returned from the optimization procedure

  • stressweight, cordweight: the weights of the stress and OC respectively (v_1 and v_2)

  • optimmethod: The solver used

  • type: the type of MDS fitted
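
The components stress.m, OCv, stressweight and cordweight relate to the copstress value. As a rough sketch, assuming copstress is the weighted difference v_1 * stress.m - v_2 * OCv (the form used in the COPS literature; this page does not state the formula), the relationship could be written as below. The helper copstress_value is hypothetical and not part of the package.

```r
# Hypothetical helper (not part of cops): weighted combination of
# badness of fit (stress) and clusteredness (OPTICS Cordillera).
# Assumption: copstress = stressweight * stress.m - cordweight * OCv
copstress_value <- function(stress_m, ocv,
                            stressweight = 0.975, cordweight = 0.025) {
  stressweight * stress_m - cordweight * ocv
}

# Made-up values for illustration only
copstress_value(0.30, 0.50)            # default weights
copstress_value(0.30, 0.50, 0.5, 0.5)  # equal weights

# With a fitted object res, the analogous (assumed) check would be:
# res$stressweight * res$stress.m - res$cordweight * res$OCv
```

Under this assumption, a larger cordweight rewards clustered configurations more strongly, at the expense of MDS fit.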

Examples

library(cops)
dis <- as.matrix(smacof::kinshipdelta)

# Copstress with equal weight on stress and cordillera
res1 <- copstressMin(dis, stressweight = 0.5, cordweight = 0.5,
                     itmax = 1000) # in practice use a higher itmax, about 10000
res1
summary(res1)
plot(res1)  # strongly clustered configuration
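
A further sketch, using only arguments documented above, varies the optimizer, the scaling and the stress type while keeping the default weights. It assumes the cops and smacof packages are installed and is skipped otherwise; the object name res2 is chosen for illustration.

```r
# Runs only if both packages are available (assumption: installed from CRAN)
res2 <- NULL
if (requireNamespace("cops", quietly = TRUE) &&
    requireNamespace("smacof", quietly = TRUE)) {
  dis <- as.matrix(smacof::kinshipdelta)

  # Hooke-Jeeves optimizer, root mean square scaling,
  # explicitly normed stress instead of stress-1
  res2 <- cops::copstressMin(dis,
                             optimmethod = "hjk",
                             scale = "rmsq",
                             stresstype = "stress",
                             itmax = 1000)

  # Components listed under Value
  print(res2$copstress)  # copstress loss value
  print(res2$OCv)        # normed OPTICS Cordillera value
  print(res2$niter)      # iterations used by the optimizer
}
```

As with the first example, itmax = 1000 keeps the run short; a higher value is likely needed for a converged solution.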


cops documentation built on Jan. 22, 2023, 1:47 a.m.