Uses the limited-memory BFGS algorithm with bounds (method "L-BFGS-B" of optim) to optimise the weights of a hierarchical clustering, with the goal of maximising the cophenetic correlation coefficient.
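To make the objective concrete, the following is a minimal sketch of how the cophenetic correlation coefficient could be computed for one fixed weight vector. It is not the package's implementation: the helper name cophenetic_cor is hypothetical, the "average" linkage is an assumed default, and Gower dissimilarities are taken from daisy in the cluster package.

```r
library(cluster)  # daisy() computes (weighted) Gower dissimilarities

# Hypothetical helper, not part of the package: cophenetic correlation
# for one fixed weight vector (one weight per column of `data`).
# The "average" linkage is an assumption made for this sketch.
cophenetic_cor <- function(data, weights, clust_method = "average") {
  d  <- daisy(data, metric = "gower", weights = weights)
  hc <- hclust(d, method = clust_method)
  # Correlate the original dissimilarities with the cophenetic distances
  cor(as.vector(d), as.vector(cophenetic(hc)))
}

data(faithful)
r_equal  <- cophenetic_cor(faithful, weights = c(0.5, 0.5))
r_skewed <- cophenetic_cor(faithful, weights = c(0.2, 0.8))
c(equal = r_equal, skewed = r_skewed)
```

An optimiser would then search over the weights for the value that makes this correlation as large as possible.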
data |
the data that needs to be clustered, provided as a dataframe or (numeric) matrix. It is assumed that rows correspond to instances and columns correspond to features. |
start_values |
a vector containing the initial values of the weights. Defaults to |
n_iterate |
the maximum number of iterations used by the quasi-Newton method |
clust_method |
a string containing the type of linkage function used by |
bounds |
a vector of size 2 containing the lower and upper bound in position 1 and 2 respectively. The lower bound must not be lower than 0 and not higher than |
minimal_memory_mode |
logical that determines whether the algorithm calculates the differences for each instance and each variable beforehand or calculates them on the fly each time. The former is chosen when this variable is FALSE, the latter when it is TRUE. Note that this requires k vectors of size |
use_cluster |
value that is used only when minimal_memory_mode equals FALSE. If use_cluster is TRUE, it will try to create a FORK type cluster. It will calculate the required k vectors using a FORK cluster of size n-1, where n is the number of logical cores. This does not work on Windows! You can also supply a cluster as made by |
Contrary to intuition, the length of the start_values vector should not equal the number of columns in data. It should contain one weight fewer, omitting the weight for the last variable. The reason is that the sum of all weights is fixed to a constant (1 in this case), so the last weight is determined by the others and does not need to be set. This allows us to skip the calculations for that variable, saving some time. The implied weight for the last variable must of course still abide by the given bounds!
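As an illustration of the sum-to-one constraint, the sketch below recovers the full weight vector from the free weights and checks it against the bounds. The helper full_weights and the example numbers are hypothetical and only serve to show the arithmetic.

```r
# Hypothetical helper, not part of the package: append the implied
# weight of the last variable so that all weights sum to 1.
full_weights <- function(free_weights) {
  c(free_weights, 1 - sum(free_weights))
}

free_weights <- c(0.2, 0.3)        # start values for 3 variables: only 2 weights
w <- full_weights(free_weights)    # the last weight is 1 - (0.2 + 0.3)

bounds <- c(0, 1)
within_bounds <- all(w >= bounds[1] & w <= bounds[2])
```

With these numbers the implied last weight is 0.5, which lies inside the bounds; start values such as c(0.7, 0.6) would imply a negative last weight and therefore violate a lower bound of 0.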
The result is the output of optim.
par |
The best set of parameters found. |
value |
The value of fn corresponding to par. |
counts |
A two-element integer vector giving the number of calls to fn and gr respectively. This excludes those calls needed to compute the Hessian, if requested, and any calls to fn to compute a finite-difference approximation to the gradient. |
convergence |
An integer code. 0 indicates successful completion (which is always the case for "SANN" and "Brent"). Possible error codes are noted in the |
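The components above can be inspected as in the following sketch. It calls optim directly on a toy objective (an assumption standing in for the real objective, which is not shown here), so only the structure of the returned list carries over.

```r
# Toy objective standing in for the real one; minimised at c(0.3, 0.6).
fn <- function(x) sum((x - c(0.3, 0.6))^2)

res <- optim(par = c(0.5, 0.5), fn = fn, method = "L-BFGS-B",
             lower = 0, upper = 1, control = list(maxit = 100))

res$par      # best set of parameters found
res$value    # value of fn at res$par
res$counts   # number of calls to fn and gr

# A non-zero code means the optimiser did not report successful completion
if (res$convergence != 0) {
  warning("optim did not converge: ", res$message)
}
```

Checking convergence before using par is a cheap safeguard, since the iteration limit can be reached before the optimiser settles.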
This package requires the cluster package. For the parallel part, you will need the standard package parallel (R 2.14.0 and later).
Jeroen van den Hoven
Clustering with optimised weights for Gower's metric: Using hierarchical clustering and Quasi-Newton methods to maximise the cophenetic correlation coefficient, Jeroen van den Hoven.
## Basic example
data(faithful)
find_weights(faithful)
## Using custom bounds and other linkage function
find_weights(faithful, bounds = c(0, 1), clust_method = "complete")