cluster_samples: K-means clustering on samples based on latent factors


cluster_samples    R Documentation


MOFA factors are continuous, but they can be used to assign samples to discrete clusters.
The clustering can be performed on a single factor, which is equivalent to setting a manual threshold on that factor. More interestingly, it can be performed on several factors at once, so that multiple sources of variation are aggregated.
Importantly, this type of clustering is not weighted: it does not take into account the differing importance of the latent factors.
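
To make the single-factor and multi-factor cases concrete, here is a minimal sketch that calls stats::kmeans directly on a simulated factor matrix. It does not use MOFA2 or a trained model: the matrix Z, its dimensions, and the sample and factor names are all made up for illustration, and nstart = 10 is an arbitrary choice.

```r
set.seed(1)

# Hypothetical factor matrix: 40 samples x 3 factors (simulated, not from MOFA)
Z <- matrix(rnorm(40 * 3), nrow = 40,
            dimnames = list(paste0("sample", 1:40), paste0("Factor", 1:3)))
# Shift the second half of the samples along Factor1 to create two groups
Z[21:40, "Factor1"] <- Z[21:40, "Factor1"] + 6

# Single factor: the two clusters are separated by one cut point on Factor1
single <- kmeans(Z[, "Factor1", drop = FALSE], centers = 2, nstart = 10)
threshold <- mean(single$centers[, 1])  # midpoint between the two centroids

# Several factors: Euclidean distance over all selected columns, with every
# factor contributing on its own scale (no weighting by importance)
multi <- kmeans(Z[, 1:3], centers = 2, nstart = 10)

# Cross-tabulate the two assignments
table(single = single$cluster, multi = multi$cluster)
```

With k = 2 on one factor, every sample above the midpoint of the two centroids lands in one cluster and every sample below it in the other, which is the sense in which single-factor clustering acts as a threshold.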


cluster_samples(object, k, factors = "all", ...)



object: a trained MOFA object.


k: number of clusters (integer).


factors: character vector with the factor name(s), or numeric vector with the index of the factor(s) to use. Default is "all".


...: extra arguments passed to kmeans.


In some cases, due to model technicalities, samples can have missing values in the latent factor space. Such samples are currently ignored in the clustering procedure.
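
The effect of ignoring such samples can be sketched with base R alone. The example below is not the function's internal code; it is one assumed way to reproduce the behaviour, using a simulated matrix Z in which two hypothetical samples lack a value for one factor.

```r
set.seed(2)

# Hypothetical factor matrix with two samples lacking a Factor2 value
Z <- matrix(rnorm(30 * 2), nrow = 30,
            dimnames = list(paste0("sample", 1:30), c("Factor1", "Factor2")))
Z[c(3, 17), "Factor2"] <- NA

# Keep only the samples that have a value for every selected factor
keep <- complete.cases(Z)
fit <- kmeans(Z[keep, , drop = FALSE], centers = 2, nstart = 10)

# Map the assignments back onto all samples; ignored samples stay NA
cluster_all <- setNames(rep(NA_integer_, nrow(Z)), rownames(Z))
cluster_all[names(fit$cluster)] <- fit$cluster
```

Here the result covers 28 of the 30 samples, so any downstream code that joins cluster labels to sample metadata should match on sample names rather than on row position.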


Output from the kmeans function.
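
For orientation, the sketch below shows what a kmeans result looks like and which components are typically of interest. It calls stats::kmeans on simulated data rather than on a MOFA model; the matrix Z is hypothetical, and nstart and iter.max are shown only as examples of kmeans arguments that could be forwarded through `...`.

```r
set.seed(3)

# Simulated stand-in for a factor matrix (hypothetical data)
Z <- matrix(rnorm(20 * 2), nrow = 20,
            dimnames = list(paste0("sample", 1:20), c("Factor1", "Factor2")))

# nstart and iter.max are ordinary kmeans arguments
fit <- kmeans(Z, centers = 3, nstart = 25, iter.max = 50)

class(fit)          # "kmeans"
head(fit$cluster)   # named integer vector: cluster label per sample
fit$centers         # one row per cluster, one column per factor
fit$size            # number of samples in each cluster
```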


library(MOFA2)

# Using an existing trained model on simulated data
file <- system.file("extdata", "model.hdf5", package = "MOFA2")
model <- load_model(file)

# Cluster samples in the factor space using factors 1 to 3 and k = 2 clusters
clusters <- cluster_samples(model, k=2, factors=1:3)

bioFAM/MOFA2 documentation built on March 21, 2023, 5:27 p.m.