bipartition: Bipartition a sample set
In zdebruine/RcppML: Rcpp Machine Learning Library

bipartition

R Documentation

Bipartition a sample set

Description

Spectral biparitioning by rank-2 matrix factorization

Usage

bipartition(data, tol = 1e-05, nonneg = TRUE, ...)

Arguments

`data`	dense or sparse matrix of features in rows and samples in columns. Prefer `matrix` or `Matrix::dgCMatrix`, respectively
`tol`	tolerance of the fit
`nonneg`	enforce non-negativity of the rank-2 factorization used for bipartitioning
`...`	development parameters

Details

Spectral bipartitioning is a popular subroutine in divisive clustering. The sign of the difference between sample loadings in factors of a rank-2 matrix factorization gives a bipartition that is nearly identical to an SVD.

Rank-2 matrix factorization by alternating least squares is faster than rank-2-truncated SVD (i.e. irlba).

This function is a specialization of rank-2 nmf with support for factorization of only a subset of samples, and with additional calculations on the factorization model relevant to bipartitioning. See nmf for details regarding rank-2 factorization.

Value

A list giving the bipartition and useful statistics:

v : vector giving difference between sample loadings between factors in a rank-2 factorization
dist : relative cosine distance of samples within a cluster to centroids of assigned vs. not-assigned cluster
size1 : number of samples in first cluster (positive loadings in 'v')
size2 : number of samples in second cluster (negative loadings in 'v')
samples1: indices of samples in first cluster
samples2: indices of samples in second cluster
center1 : mean feature loadings across samples in first cluster
center2 : mean feature loadings across samples in second cluster

Advanced Parameters

Several parameters may be specified in the ... argument:

diag = TRUE: scale factors in w and h to sum to 1 by introducing a diagonal, d. This should generally never be set to FALSE. Diagonalization enables symmetry of models in factorization of symmetric matrices, convex L1 regularization, and consistent factor scalings.
samples = 1:ncol(A): samples to include in bipartition, numbered from 1 to ncol(A). Default is all samples.
calc_dist = TRUE: calculate the relative cosine distance of samples within a cluster to either cluster centroid. If TRUE, centers for clusters will also be calculated.
seed = NULL: random seed for model initialization, generally not needed for rank-2 factorizations because robust solutions are recovered when diag = TRUE
maxit = 100: maximum number of alternating updates of w and h. Generally, rank-2 factorizations converge quickly and this should not need to be adjusted.

Author(s)

Zach DeBruine

References

Kuang, D, Park, H. (2013). "Fast rank-2 nonnegative matrix factorization for hierarchical document clustering." Proc. 19th ACM SIGKDD intl. conf. on Knowledge discovery and data mining.

Examples

## Not run: 
library(Matrix)
data(iris)
A <- as(as.matrix(iris[,1:4]), "dgCMatrix")
bipartition(A, calc_dist = TRUE)

## End(Not run)

zdebruine/RcppML documentation built on Sept. 13, 2023, 11:44 p.m.

zdebruine/RcppML index

README.md

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

zdebruine/RcppML
Rcpp Machine Learning Library

bipartition: Bipartition a sample set
In zdebruine/RcppML: Rcpp Machine Learning Library

Bipartition a sample set

Description

Usage

Arguments

Details

Value

Advanced Parameters

Author(s)

References

See Also

Examples

Related to bipartition in zdebruine/RcppML...

R Package Documentation

Browse R Packages

We want your feedback!

zdebruine/RcppML Rcpp Machine Learning Library

bipartition: Bipartition a sample set In zdebruine/RcppML: Rcpp Machine Learning Library

Bipartition a sample set

Description

Usage

Arguments

Details

Value

Advanced Parameters

Author(s)

References

See Also

Examples

Related to bipartition in zdebruine/RcppML...

R Package Documentation

Browse R Packages

We want your feedback!

zdebruine/RcppML
Rcpp Machine Learning Library

bipartition: Bipartition a sample set
In zdebruine/RcppML: Rcpp Machine Learning Library