mv_splitting: Splitting of hard or soft clusters based on multi-view data


View source: R/Splitting.R

Description

Splitting of hard or soft clusters based on multi-view data

Usage

mv_splitting(
  X,
  mv,
  clustering_init,
  Kmax,
  gamma = 2,
  use_mv_weights = TRUE,
  delta = 2,
  perCluster_mv_weights = TRUE,
  verbose = TRUE,
  parallel = TRUE,
  BPPARAM = bpparam()
)

Arguments

X

Matrix of multi-view data, where the first view corresponds to the principal data used to obtain the partition or soft clustering in clustering_init

mv

(Optional unless X is a matrix.) If X is a matrix, vector giving the number of columns in each data view, in the order in which the views appear in X. The sum of mv should equal the number of columns in X.
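The relationship between X and mv can be sketched in base R as follows. The three toy views and their sizes are made up for illustration; only the rule that sum(mv) matches ncol(X), with the principal view first, comes from the argument descriptions above.

```r
# Hypothetical toy data: three views measured on the same 20 observations.
set.seed(1)
view1 <- matrix(rnorm(20 * 4), nrow = 20)  # principal view (4 columns)
view2 <- matrix(rnorm(20 * 3), nrow = 20)  # second view (3 columns)
view3 <- matrix(rnorm(20 * 2), nrow = 20)  # third view (2 columns)

# Column-bind the views, principal view first.
X <- cbind(view1, view2, view3)

# mv records how many columns of X belong to each view, in order.
mv <- c(ncol(view1), ncol(view2), ncol(view3))

# Sanity check before calling mv_splitting().
stopifnot(sum(mv) == ncol(X))

# Column indices of each view, recovered from mv alone.
view_id <- rep(seq_along(mv), times = mv)
view_cols <- split(seq_len(ncol(X)), view_id)
```

Here view_cols[[2]] holds the column indices of the second view, which is a convenient way to check that mv was specified in the intended order.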

clustering_init

Either a vector of initial cluster labels, one per observation (for hard clustering), or a matrix of soft classification labels (for soft clustering)
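As a rough illustration of the two accepted forms, the base R sketch below builds a hard label vector with kmeans() and a soft classification matrix from the same fit. The toy data, the n x K orientation of the soft matrix, and the distance-based softmax used to fill it are all assumptions made for this example; they are not taken from the package.

```r
# Hypothetical principal view: two well-separated groups of 30 observations.
set.seed(42)
principal <- rbind(
  matrix(rnorm(30 * 2, mean = 0), ncol = 2),
  matrix(rnorm(30 * 2, mean = 5), ncol = 2)
)

# Hard clustering: one integer label per observation.
km <- kmeans(principal, centers = 2, nstart = 10)
hard_init <- km$cluster

# Soft clustering (illustrative only): squared distance of each observation
# to each k-means centre, turned into row-wise probabilities.
d2 <- sapply(seq_len(nrow(km$centers)), function(k) {
  rowSums(sweep(principal, 2, km$centers[k, ])^2)
})
w <- exp(-(d2 - apply(d2, 1, min)))  # subtract the row minimum for stability
soft_init <- w / rowSums(w)          # assumed n x K layout, rows sum to 1
```

Either hard_init or soft_init would then play the role of clustering_init, depending on whether hard or soft splitting is wanted.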

Kmax

Maximum number of clusters for splitting

gamma

Parameter that controls the distribution of view weights. Default value is 2.

use_mv_weights

If TRUE, run algorithm in weighted multi-view mode; if FALSE, the weight for each view is set to be equal. This option is only used for hard clustering.

delta

Parameter that controls the weights on the soft classifications pi(i,k). Default value is 2.

perCluster_mv_weights

If TRUE, use cluster-specific multi-view weights; if FALSE, use classic multi-view weights (one weight per view, shared across clusters).

verbose

If TRUE, provide verbose output

parallel

If FALSE, no parallelization. If TRUE, the soft splitting algorithm is run in parallel using BiocParallel (see the next argument, BPPARAM). A note on running in parallel using BiocParallel: it may be advantageous to remove large, unneeded objects from the current R environment before calling the function, as R's internal garbage collection may copy these objects while running on worker nodes.

BPPARAM

Optional parameter object passed internally to bplapply when parallel=TRUE. If not specified, the parameters last registered with register will be used.
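A minimal sketch of preparing a parallel back-end and freeing memory beforehand is given below. It assumes the BiocParallel package is installed; the choice of SnowParam with two workers and the object name big_object are illustrative, not recommendations from the package authors.

```r
# Hypothetical large object that is no longer needed.
big_object <- numeric(1e6)

# Remove it and trigger garbage collection before starting parallel work.
rm(big_object)
invisible(gc())

if (requireNamespace("BiocParallel", quietly = TRUE)) {
  # Option 1: build an explicit back-end and pass it as BPPARAM.
  param <- BiocParallel::SnowParam(workers = 2)

  # Option 2: register a default back-end; bpparam() then returns it,
  # which is what the default BPPARAM = bpparam() picks up.
  BiocParallel::register(BiocParallel::SerialParam())
  default_param <- BiocParallel::bpparam()

  # Either object could then be supplied to mv_splitting(), e.g.
  # mv_splitting(..., parallel = TRUE, BPPARAM = param)
}
```

Passing the object explicitly keeps the choice of back-end local to one call, whereas register() changes the default for the rest of the session.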

Value

split_clusters

Matrix providing the history of each cluster splitting at each iteration of the algorithm

weights

Matrix of dimension v x niterations, where v is the number of views and niterations is the number of successive splits, providing the multi-view weights

criterion

Value taken on by the splitting criterion at each iteration

withnss

The within sum-of-squares for each cluster at the last iteration

ksplit

Vector identifying which cluster was split at each iteration of the algorithm

all_probapost

List of conditional probabilities for each split for the soft algorithm
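The following sketch puts the arguments and return values together for the hard-clustering case. The simulated data, the choice of Kmax = 4, and the use of kmeans() for the initial partition are assumptions made for illustration; the call is guarded so that it only runs when the maskmeans package is installed, and no particular output is implied.

```r
# Hypothetical two-view data set on 40 observations.
set.seed(123)
n <- 40
X <- cbind(
  rbind(matrix(rnorm(n, mean = 0), ncol = 2),   # view 1: two groups,
        matrix(rnorm(n, mean = 4), ncol = 2)),  #         2 columns
  matrix(rnorm(n * 3), ncol = 3)                # view 2: 3 columns
)
mv <- c(2, 3)

# Initial hard partition obtained from the principal (first) view.
clustering_init <- kmeans(X[, 1:2], centers = 2, nstart = 10)$cluster

if (requireNamespace("maskmeans", quietly = TRUE)) {
  res <- maskmeans::mv_splitting(
    X = X,
    mv = mv,
    clustering_init = clustering_init,
    Kmax = 4,
    verbose = FALSE,
    parallel = FALSE
  )

  res$ksplit        # which cluster was split at each iteration
  dim(res$weights)  # views x iterations
  res$criterion     # splitting criterion at each iteration
}
```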


andreamrau/maskmeans documentation built on Nov. 13, 2021, 7:44 a.m.