library('BiocStyle') knitr::opts_chunk$set( autodep=TRUE, cache=FALSE, dev='png' )
library('SampleQC') library('patchwork')
# define what to use and annotate group_list = metadata(qc_obj)$group_list n_groups = metadata(qc_obj)$n_groups
Maximum mean discrepancy (MMD) is a measure of dissimilarity of empirical (possibly multivariate) distributions. If $X$ and $Y$ are sampled from distributions $D_x$ and $D_y$, then $E(MMD(X,Y)) = 0$ if and only if $D_x = D_y$. SampleQC
uses MMD to estimate similarities between the QC matrices of samples in a experiment. Viewed as equivalent to a distance, SampleQC
uses the MMD values as input to multiple non-linear embedding approaches, and for clustering. This then allows users to identify possible batch effects in the samples, and groupings of samples which have similar distributions of QC metrics.
Heatmap of all pairwise dissimilarities between samples (values close to 0 indicate similar samples; values of 1 and higher indicate extremely dissimilar samples).
(plot_mmd_heatmap(qc_obj))
UMAP embedding of dissimilarity matrix, annotated with selected discrete and continuous values for each sample.
plot_embeddings(qc_obj, "discrete", "UMAP") plot_embeddings(qc_obj, "continuous", "UMAP")
Multidimensional scaling (MDS) embedding of dissimilarity matrix, annotated with selected discrete and continuous values for each sample.
plot_embeddings(qc_obj, "discrete", "MDS") plot_embeddings(qc_obj, "continuous", "MDS")
SampleQC
model fits and outliers over QC biaxialsThese plots show biaxial distributions of each sample, annotated with both the fitted mean and covariance matrices, and the cells which are then identified as outliers. You can use this to check that you have the correct number of components for each sample grouping, and to check that the fitting procedure has worked properly. The means and covariances of the components should match up to the densest parts of the biaxial plots.
for (g in group_list) { cat('## ', g, '{.tabset}\n') # which samples? samples_g = sort(colData(qc_obj)$sample_id[ colData(qc_obj)$group_id == g ]) for (s in samples_g) { cat('### ', s, ' \n') g_fit = plot_fit_over_biaxials(qc_obj, s) g_out = plot_outliers(qc_obj, s) g = g_fit / g_out print(g) cat('\n\n') } }
These plots show the fitted parameters for each sample and each mixture component. There are two sets of parameters: $\alpha_j$, the mean shift for each sample; and $(\mu_k, \Sigma_k)$, the relative means and covariances for each mixture component.
Values of $\alpha_j$ which are extreme relative to those for most other samples indicate samples which are either outliers in terms of their QC statistics, or have been allocated to the wrong sample grouping.
# plot likelihoods for (g in group_list) { cat('### ', g, '\n') print(plot_alpha_js_likelihoods(qc_obj, g)) cat('\n\n') }
These plots show the same $\alpha_j$ values, but as biaxials, and equivalently for PCA projections.
for (g in group_list) { cat('### ', g, ' feats\n') print(plot_alpha_js(qc_obj, g, qc_idx=1:2, pc_idx=1:2)) cat('\n\n') cat('### ', g, ' mito\n') print(plot_alpha_js(qc_obj, g, qc_idx=c(1,3), pc_idx=c(1,3))) cat('\n\n') }
These plots show the composition of each sample in terms of the $K$ different mixture components, plus outliers.
# plot likelihoods for (g in group_list) { cat('### ', g, '\n') print(plot_beta_ks(qc_obj, g)) cat('\n\n') }
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.