convex_clustering: Compute Convex Clustering Solution Path on a User-Specified...


View source: R/solvers.R

Description

convex_clustering calculates the convex clustering solution path at a user-specified grid of lambda values (or just a single value). It is, in general, difficult to know a useful set of lambda values a priori, so this function is more useful for timing comparisons and methodological research than applied work.

Usage

convex_clustering(
  X,
  ...,
  lambda_grid,
  weights = sparse_rbf_kernel_weights(k = "auto", phi = "auto", dist.method =
    "euclidean", p = 2),
  X.center = TRUE,
  X.scale = FALSE,
  norm = 2,
  impute_func = function(X) {
    if (anyNA(X)) missForest(X)$ximp else X
  },
  status = (interactive() && (clustRviz_logger_level() %in% c("MESSAGE", "WARNING",
    "ERROR")))
)

Arguments

X

The data matrix (X): rows correspond to the observations (to be clustered) and columns to the variables (which will not be clustered). If X has missing values (NA or NaN), they will be imputed automatically using impute_func.

...

Unused arguments. An error will be thrown if any unrecognized arguments are given. All arguments other than X must be given by name.

lambda_grid

A user-supplied set of lambda values at which to solve the convex clustering problem. These must be strictly positive values and will be automatically sorted internally.
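Because a useful grid is hard to guess in advance, one common heuristic is to space the values evenly on a log scale. The following is a minimal sketch in base R; the endpoints and the grid length are arbitrary illustrative choices, not recommendations from the package.

```r
# Hypothetical endpoints; sensible values depend on the data and the weights.
lambda_min <- 1e-2
lambda_max <- 1e2
n_lambda   <- 25

# Log-spaced grid: consecutive values differ by a constant ratio.
lambda_grid <- exp(seq(log(lambda_min), log(lambda_max), length.out = n_lambda))

# convex_clustering requires strictly positive values.
stopifnot(all(lambda_grid > 0))
```

The resulting vector can then be passed by name, as in `lambda_grid = lambda_grid`.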

weights

One of the following:

  • A function which, when called with argument X, returns an n-by-n matrix of fusion weights.

  • A matrix of size n-by-n containing fusion weights.
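To illustrate the function form, here is a sketch of a user-written weight function built from base R only. The helper name `dense_gaussian_weights` and its `phi` parameter are hypothetical and not part of clustRviz. Whether the package expects a zero diagonal or any particular normalization is not stated here, so treat those details as assumptions.

```r
# Hypothetical helper (not part of clustRviz): dense Gaussian kernel weights.
dense_gaussian_weights <- function(X, phi = 0.5) {
  D <- as.matrix(dist(X, method = "euclidean"))  # n-by-n pairwise row distances
  W <- exp(-phi * D^2)                           # Gaussian (RBF) kernel
  diag(W) <- 0                                   # assumption: no self-fusion weight
  W
}

set.seed(1)
X_demo <- matrix(rnorm(20), nrow = 5)  # 5 observations, 4 variables
W_demo <- dense_gaussian_weights(X_demo)
dim(W_demo)                            # one row and one column per observation
```

Since the function can be called with X alone, it could in principle be supplied as `weights = dense_gaussian_weights`. Alternatively, the precomputed matrix `W_demo` corresponds to the second (matrix) form.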

X.center

A logical: Should X be centered columnwise?

X.scale

A logical: Should X be scaled columnwise?

norm

Which norm to use in the fusion penalty? Currently only 1 and 2 (default) are supported.
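For orientation, the convex clustering problem is usually written in the literature in the following form, where q plays the role of the norm argument. The exact scaling conventions used internally by the package are not stated on this page, so read this as the standard formulation rather than a description of the implementation.

```latex
\hat{U}_{\lambda}
  = \operatorname*{arg\,min}_{U \in \mathbb{R}^{n \times p}}
    \frac{1}{2} \lVert X - U \rVert_F^2
    + \lambda \sum_{i < j} w_{ij} \, \lVert U_{i\cdot} - U_{j\cdot} \rVert_q ,
  \qquad q \in \{1, 2\}
```

Here X is the n-by-p data matrix, U_{i·} is the i-th row of U (the centroid assigned to observation i), w_{ij} are the fusion weights, and λ ranges over lambda_grid.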

impute_func

A function used to impute missing data in X. By default, the missForest function from the package of the same name is used. This provides a flexible, potentially non-linear imputation function. This function has to return a data matrix with no NA values. Note that, consistent with base R, both NaN and NA are treated as "missing values" for imputation.
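As a sketch of a lighter-weight alternative, the following base R function fills each missing entry with the mean of its column. The name `column_mean_impute` is hypothetical and not part of clustRviz; it is only meant to show the required contract (take a matrix, return a matrix with no missing values).

```r
# Hypothetical imputer (not part of clustRviz): column-mean imputation.
column_mean_impute <- function(X) {
  if (!anyNA(X)) return(X)               # nothing to do
  for (j in seq_len(ncol(X))) {
    missing_j <- is.na(X[, j])           # is.na() is TRUE for both NA and NaN
    X[missing_j, j] <- mean(X[, j], na.rm = TRUE)
  }
  X
}

X_demo   <- matrix(c(1, NA, 3, 4, NaN, 6), nrow = 3)
X_filled <- column_mean_impute(X_demo)
stopifnot(!anyNA(X_filled))
```

A column that is entirely missing would still be missing after this step, so this sketch assumes every column has at least one observed value.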

status

Should a status message be printed to the console?

Details

Compared to the CARP function, the returned object is much more "bare-bones," containing only the estimated U matrices, and no information used for dendrogram or path visualizations.

Value

An object of class convex_clustering containing the following elements (among others):

Examples

library(clustRviz)

# Fit on a small subset of the bundled presidential_speech data
clustering_fit <- convex_clustering(presidential_speech[1:10, 1:4], lambda_grid = 1:100)
print(clustering_fit)

jjn13/clustRviz documentation built on Sept. 1, 2020, 7:53 a.m.