GenotypeMixtures is a handy package that builds on the souporcell package (Heaton et al. 2020) to stitch together genotypes across multiple single-cell genomics experiments that use an overlapping mixture experimental design...
knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
Install GenotypeMixtures from GitHub. This requires devtools.
#devtools::install_github("bjstewart1/GenotypeMixtures")
Load GenotypeMixtures
library(GenotypeMixtures)
Experimental designs can be read in with read_experimental_design() if you point it at a .csv file. Alternatively, you can read in the .csv however you like, or construct the design from another file. The experiments (10X channels, i.e. mixtures) should be rows, and the donors/genotypes should be columns. Membership is denoted by 1 vs 0.
exp_design_path = system.file("extdata", "experimental_design.csv", package = "GenotypeMixtures")
experimental_design <- read_experimental_design(experimental_design_path = exp_design_path)
plot_experimental_design(experimental_design)
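If you would rather construct the design yourself, the layout described above (mixtures as rows, donors as columns, 1 for membership) can be sketched in base R. This is only an illustration: the mixture and donor names are made up, and whether the package expects a matrix or a data frame is an assumption, so compare against the object returned by read_experimental_design() before relying on it.

```r
# Hypothetical mixture and donor names, for illustration only.
mixtures <- c("mixture_1", "mixture_2", "mixture_3")
donors <- c("donor_A", "donor_B", "donor_C", "donor_D")

# Rows are mixtures (10X channels), columns are donors/genotypes.
toy_design <- matrix(0L, nrow = length(mixtures), ncol = length(donors),
                     dimnames = list(mixtures, donors))

# Mark membership with 1: each mixture shares one donor with the next.
toy_design["mixture_1", c("donor_A", "donor_B")] <- 1L
toy_design["mixture_2", c("donor_B", "donor_C")] <- 1L
toy_design["mixture_3", c("donor_C", "donor_D")] <- 1L

# Round-trip through a .csv with base R, keeping mixture names as row names.
toy_path <- tempfile(fileext = ".csv")
write.csv(toy_design, toy_path)
toy_design_from_csv <- as.matrix(
  read.csv(toy_path, row.names = 1, check.names = FALSE)
)

# Sanity checks: entries are only 0 or 1, and no mixture is empty.
stopifnot(all(toy_design_from_csv %in% c(0, 1)))
stopifnot(all(rowSums(toy_design_from_csv) > 0))
```

Writing the row names into the first column of the .csv keeps the mixture names attached to their rows when the file is read back.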
We can also read in the locations of the souporcell directories. The first column should be the mixture name, and the second column should be the path to the souporcell directory. There are some built-in dummy VCF files in the package for this vignette.
file_locations <- data.frame("channel" = rownames(experimental_design),
                             "SOC_directory" = file.path(system.file("extdata", package = "GenotypeMixtures"),
                                                         rownames(experimental_design)))
head(file_locations)
Now we plug this into the main function which constructs a genotype cluster graph
genotype_clustering_output <- construct_genotype_cluster_graph(experimental_design = experimental_design, file_locations = file_locations )
Now we can plot the graph which stitches together the genotypes
genotype_clustering_output$graph_plot
Now we can plot the membership matrix which tells us which of our genotypes belongs to which mixtures
genotype_clustering_output$membership_plot
Now we can plot genotype VAFs. This is a useful diagnostic plot: matching genotypes should have their variants along the diagonal. This is synthetic data, but real data should look reasonably similar to this.
plot_cross_vaf(experiment_1_path = file_locations[2, 2], experiment_2_path = file_locations[3,2], experiment_1_name = file_locations[2,1], experiment_2_name = file_locations[3,1])
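To see why matching genotypes fall along the diagonal, here is a small base R simulation. It does not use the package and is not a description of what plot_cross_vaf() computes; the variant count, noise level, and genotype values are all assumptions chosen for illustration.

```r
set.seed(1)
n_variants <- 500

# Hypothetical per-variant allele fractions for one donor: about 0, 0.5 or 1.
true_vaf <- sample(c(0, 0.5, 1), n_variants, replace = TRUE)

# Keep simulated fractions inside [0, 1].
clamp <- function(x) pmin(pmax(x, 0), 1)

# The same donor observed in two experiments, each with independent noise.
vaf_exp1 <- clamp(true_vaf + rnorm(n_variants, sd = 0.05))
vaf_exp2 <- clamp(true_vaf + rnorm(n_variants, sd = 0.05))

# A different donor carries an unrelated genotype at the same variants.
other_vaf <- clamp(sample(c(0, 0.5, 1), n_variants, replace = TRUE) +
                     rnorm(n_variants, sd = 0.05))

# Matching genotypes correlate strongly; mismatched ones do not.
cor_matching <- cor(vaf_exp1, vaf_exp2)
cor_mismatch <- cor(vaf_exp1, other_vaf)

# Matching genotypes cluster along the dashed y = x line.
plot(vaf_exp1, vaf_exp2, xlab = "VAF, experiment 1", ylab = "VAF, experiment 2")
abline(0, 1, lty = 2)
```

Plotting vaf_exp1 against other_vaf instead gives points scattered over the whole grid of genotype combinations, which is what a mismatched pair of clusters would be expected to look like.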
We can map these computed genotypes back to the original genotypes in our experimental design
cluster_mapping <- membership_map(experimental_design = experimental_design,
                                  graph_output = genotype_clustering_output)
tail(cluster_mapping)
Finally, we can assign single cells across our experiments to genotypes: feed the output of membership_map() to cells_to_genotypes(). The output can easily be added to the metadata of your single cell experiment, Seurat, or AnnData object.
cell_assignments <- cells_to_genotypes(SOC_locations = file_locations,
                                       membership_mat = cluster_mapping)
tail(cell_assignments)
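Adding the assignments to existing per-cell metadata is essentially a keyed lookup. The sketch below uses toy data frames in base R; the column names `channel`, `barcode`, and `genotype` are hypothetical stand-ins, so inspect the real cells_to_genotypes() output and adjust them to match.

```r
# Hypothetical stand-in for the cells_to_genotypes() output; the real
# column names may differ.
toy_assignments <- data.frame(
  channel  = c("mixture_1", "mixture_1", "mixture_2"),
  barcode  = c("AAAC-1", "AAAG-1", "AAAC-1"),
  genotype = c("donor_A", "donor_B", "donor_B"),
  stringsAsFactors = FALSE
)

# Hypothetical per-cell metadata, as exported from a single-cell object.
toy_metadata <- data.frame(
  channel = c("mixture_2", "mixture_1", "mixture_1", "mixture_1"),
  barcode = c("AAAC-1", "AAAG-1", "AAAC-1", "AAAT-1"),
  n_genes = c(1200, 950, 1800, 700),
  stringsAsFactors = FALSE
)

# Barcodes can repeat across channels, so key on channel plus barcode.
assignment_key <- paste(toy_assignments$channel, toy_assignments$barcode, sep = "_")
metadata_key <- paste(toy_metadata$channel, toy_metadata$barcode, sep = "_")

# match() keeps the metadata row order; cells without an assignment get NA.
toy_metadata$genotype <- toy_assignments$genotype[match(metadata_key, assignment_key)]
toy_metadata
```

Using match() rather than merge() keeps the metadata rows in their original order, which matters when the column is written back into an object that indexes cells by position.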
The package can also output an experimental design with varying levels of density
dense_design <- make_overlapping_mixture(n_mixtures = 12, n_genotypes = 7, density = 1)
medium_density_design <- make_overlapping_mixture(n_mixtures = 12, n_genotypes = 7, density = 0.5)
sparse_design <- make_overlapping_mixture(n_mixtures = 12, n_genotypes = 7, density = 0)
This is a dense design
plot_experimental_design(dense_design)
This is a medium design
plot_experimental_design(medium_density_design)
This is a sparse design
plot_experimental_design(sparse_design)
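Alongside the plots, it can help to summarise how much the mixtures in a design overlap. Here is a rough base R sketch on a made-up 0/1 design. It assumes the design can be treated as a numeric matrix, and the fill fraction computed here is not claimed to be the same quantity as the `density` argument of make_overlapping_mixture().

```r
# Hypothetical 0/1 design: rows are mixtures, columns are donors.
toy_design <- rbind(
  mixture_1 = c(donor_A = 1, donor_B = 1, donor_C = 0, donor_D = 0),
  mixture_2 = c(donor_A = 0, donor_B = 1, donor_C = 1, donor_D = 0),
  mixture_3 = c(donor_A = 0, donor_B = 0, donor_C = 1, donor_D = 1)
)

# Fraction of mixture/donor cells that are filled.
fill_fraction <- mean(toy_design == 1)

# Donors shared by each pair of mixtures; the diagonal counts the donors
# in each mixture.
shared_donors <- tcrossprod(toy_design)

# Number of mixtures each donor appears in.
mixtures_per_donor <- colSums(toy_design)

shared_donors
```

Off-diagonal zeros in `shared_donors` mark pairs of mixtures with no donor in common, so any link between their genotypes has to go through a third mixture.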