options(rmarkdown.html_vignette.check_title = FALSE)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  warning = FALSE,
  message = FALSE
)
This vignette demonstrates how to generate and inspect model summaries. Summarising the models fitted in both the high-dimensional space and the corresponding 2-D embedding is an essential step in evaluating how well the low-dimensional representation captures the structure of the original data.
library(quollr)
library(dplyr)
library(ggplot2)
Begin by fitting a high-dimensional model and its corresponding 2-D model using the fit_highd_model() function. This generates the 2-D bin centroids (the 2-D model) and their corresponding coordinates in the high-dimensional space (the lifted model).
model <- fit_highd_model(
  highd_data = scurve,
  nldr_data = scurve_umap,
  b1 = 4,
  q = 0.1,
  benchmark_highdens = 5
)

df_bin_centroids <- model$model_2d
df_bin <- model$model_highd
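To make the idea of a 2-D model and its lifted counterpart concrete, here is a minimal base R sketch on toy data. It is not how fit_highd_model() works internally: it swaps in a square grid for quollr's hexagonal bins, it assumes the lifted coordinates are the per-bin means of the high-dimensional points, and every name in it (toy_highd, toy_emb, toy_model_2d, toy_model_highd) is hypothetical.

```r
# Toy stand-in data: 3-D observations and a matching 2-D embedding.
set.seed(1)
n <- 200
toy_highd <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n))
toy_emb <- data.frame(emb1 = toy_highd$x1, emb2 = toy_highd$x2)

# Assign each embedded point to a cell of a 4 x 4 square grid.
# ASSUMPTION: square cells stand in for hexagonal bins to keep this short.
n_bins <- 4
cell_x <- pmin(floor(toy_emb$emb1 * n_bins), n_bins - 1)
cell_y <- pmin(floor(toy_emb$emb2 * n_bins), n_bins - 1)
bin_id <- cell_y * n_bins + cell_x + 1

# 2-D model: the geometric centre of every occupied cell.
occupied <- sort(unique(bin_id))
toy_model_2d <- data.frame(
  bin = occupied,
  c_x = ((occupied - 1) %% n_bins + 0.5) / n_bins,
  c_y = ((occupied - 1) %/% n_bins + 0.5) / n_bins
)

# Lifted model: the mean of the high-dimensional points falling in each bin.
toy_model_highd <- aggregate(toy_highd, by = list(bin = bin_id), FUN = mean)
```

Each row of toy_model_2d has a matching row in toy_model_highd with the same bin value, which is the pairing that lets a point in the high-dimensional space be mapped to a 2-D location.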
To evaluate model fit, you can predict the 2-D embedding for each observation in the original high-dimensional dataset.
pred_df_training <- predict_emb(
  highd_data = scurve,
  model_highd = scurve_model_obj$model_highd,
  model_2d = scurve_model_obj$model_2d
)

glimpse(pred_df_training)
The plot below shows the original UMAP embedding of the training data in grey, overlaid with the predicted 2-D coordinates in red.
umap_scaled <- scurve_model_obj$nldr_obj$scaled_nldr

umap_scaled |>
  ggplot(aes(x = emb1, y = emb2, label = ID)) +
  geom_point(alpha = 0.5) +
  geom_point(
    data = pred_df_training,
    aes(x = pred_emb_1, y = pred_emb_2),
    color = "red",
    alpha = 0.5
  ) +
  coord_equal() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 18, face = "bold"),
    axis.text = element_text(size = 5),
    axis.title = element_text(size = 7)
  )
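The visual overlay can be complemented by a per-observation distance between the original and predicted 2-D positions. The sketch below uses small toy tibbles in place of umap_scaled and pred_df_training. It reuses the column names from the plotting code (emb1, emb2, pred_emb_1, pred_emb_2) and assumes both data frames share an ID column to join on, which should be checked against the glimpse() output before applying it to the real objects.

```r
library(dplyr)

# Toy stand-ins for umap_scaled and pred_df_training.
# ASSUMPTION: both carry an ID column identifying the observation.
toy_emb <- tibble(
  ID = 1:4,
  emb1 = c(0, 1, 2, 3),
  emb2 = c(0, 0, 1, 1)
)
toy_pred <- tibble(
  ID = 1:4,
  pred_emb_1 = c(0, 1, 2, 0),
  pred_emb_2 = c(0, 0, 1, 5)
)

# Match each observation to its prediction and compute the Euclidean
# distance between the two positions in the embedding plane.
toy_resid <- toy_emb |>
  inner_join(toy_pred, by = "ID") |>
  mutate(dist_2d = sqrt((emb1 - pred_emb_1)^2 + (emb2 - pred_emb_2)^2))

# Observations with the largest displacement come first.
toy_resid |> arrange(desc(dist_2d))
```

Sorting by dist_2d is one simple way to find the observations that the 2-D model places furthest from their original embedding.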
Use the glance() function to compute summary statistics that describe how well the 2-D model captures structure in the high-dimensional space.
glance(
  highd_data = scurve,
  model_highd = scurve_model_obj$model_highd,
  model_2d = scurve_model_obj$model_2d
)
To obtain a detailed data frame that includes the high-dimensional observations, their assigned bins, predicted embeddings, and summary metrics, use the augment() function:
augment(
  highd_data = scurve,
  model_highd = scurve_model_obj$model_highd,
  model_2d = scurve_model_obj$model_2d
) |>
  head(5)