cluster_samples: K-means clustering on samples based on latent factors

cluster_samples    R Documentation

Description

MOFA factors are continuous in nature but they can be used to predict discrete clusters of samples.
The clustering can be performed on a single factor, which is equivalent to setting a manual threshold on that factor. More interestingly, it can be performed on multiple factors, so that several sources of variation are aggregated.
Importantly, this type of clustering is not weighted: it does not take into account the differing importance of the latent factors.
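
The difference between single-factor and multi-factor clustering can be sketched with base R alone. The matrix Z below is a simulated, hypothetical stand-in for a samples x factors matrix of factor values; it is not produced by MOFA2, and the calls to kmeans are an illustration of the idea rather than the package's internal code.

```r
set.seed(1)

# Hypothetical stand-in for a samples x factors matrix of factor values
Z <- cbind(
  Factor1 = c(rnorm(20, mean = -2), rnorm(20, mean = 2)),
  Factor2 = rnorm(40),
  Factor3 = rnorm(40, sd = 0.2)
)
rownames(Z) <- paste0("sample_", seq_len(nrow(Z)))

# Single factor, k = 2: the two clusters are separated by one cut point
fit_one <- kmeans(Z[, "Factor1", drop = FALSE], centers = 2)
threshold <- mean(fit_one$centers[, 1])  # midpoint between the two centroids

# Multiple factors: plain (unweighted) Euclidean distance over all columns
fit_all <- kmeans(Z, centers = 2)

# Cross-tabulate the two partitions
table(single = fit_one$cluster, multi = fit_all$cluster)
```

With one factor and two clusters, every sample above the midpoint of the two centroids falls in one cluster and every sample below it in the other, which is why the single-factor case amounts to a threshold.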

Usage

cluster_samples(object, k, factors = "all", ...)

Arguments

object

a trained MOFA object.

k

number of clusters (integer).

factors

character vector with the factor name(s), or numeric vector with the index of the factor(s) to use. Default is "all".

...

extra arguments passed to kmeans.
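
As a rough illustration of how extra arguments reach kmeans, the sketch below uses a hypothetical wrapper, cluster_matrix, that forwards ... unchanged. The wrapper and the simulated matrix Z are assumptions made for this example and are not part of MOFA2; nstart and iter.max are standard arguments of stats::kmeans.

```r
# Hypothetical wrapper showing how `...` can be forwarded to stats::kmeans;
# this is not the MOFA2 implementation
cluster_matrix <- function(Z, k, ...) {
  kmeans(Z, centers = k, ...)
}

set.seed(42)
Z <- matrix(rnorm(200), ncol = 4)  # simulated samples x factors matrix

# nstart: number of random restarts; iter.max: iteration cap per restart
fit <- cluster_matrix(Z, k = 3, nstart = 25, iter.max = 50)
fit$tot.withinss
```

Assuming cluster_samples forwards ... in the same way, a call such as cluster_samples(model, k = 2, factors = 1:3, nstart = 25) would request 25 random starts and keep the best solution.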

Details

In some cases, due to model technicalities, samples can have missing values in the latent factor space. Such samples are currently ignored in the clustering procedure.
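
One plausible way to picture this behaviour, using base R only: drop every sample that has a missing factor value, cluster the rest, and leave the dropped samples unassigned. The matrix Z is simulated and the complete-case filter is an assumption about what "ignored" means here, not a copy of the package code.

```r
set.seed(7)

# Simulated samples x factors matrix with named rows and columns
Z <- matrix(
  rnorm(60), ncol = 3,
  dimnames = list(paste0("sample_", 1:20), paste0("Factor", 1:3))
)
Z[c(2, 11), 1] <- NA  # two samples with a missing factor value

# Keep only samples with no missing values in any factor
keep <- complete.cases(Z)
fit <- kmeans(Z[keep, , drop = FALSE], centers = 2)

# Map assignments back to all samples; ignored samples stay NA
clusters <- setNames(rep(NA_integer_, nrow(Z)), rownames(Z))
clusters[names(fit$cluster)] <- fit$cluster
clusters
```

A practical consequence is that the returned cluster vector can be shorter than the number of samples in the model, so results are best matched to samples by name rather than by position.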

Value

Output of the kmeans function: an object of class "kmeans", a list with components such as cluster, centers and size.
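
The components of a kmeans result can be inspected as follows. This sketch calls stats::kmeans directly on a simulated matrix Z, which is a hypothetical stand-in for the factor values; the same accessors should apply to the object returned by cluster_samples, given that it is the kmeans output.

```r
set.seed(3)
Z <- matrix(rnorm(90), ncol = 3)  # simulated samples x factors matrix
fit <- kmeans(Z, centers = 2)

class(fit)                 # "kmeans"
head(fit$cluster)          # integer cluster label for each sample
fit$centers                # one row of centroid coordinates per cluster
fit$size                   # number of samples in each cluster
fit$betweenss / fit$totss  # share of total sum of squares between clusters
```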

Examples

library(MOFA2)

# Using an existing trained model on simulated data
file <- system.file("extdata", "model.hdf5", package = "MOFA2")
model <- load_model(file)

# Cluster samples in the factor space using factors 1 to 3 and k = 2 clusters
clusters <- cluster_samples(model, k = 2, factors = 1:3)

# Number of samples assigned to each cluster
table(clusters$cluster)

bioFAM/MOFA2 documentation built on June 12, 2024, 3:57 p.m.