knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
library(tiledbsc)
library(SeuratObject)
While tiledbsc facilitates the storage and retrieval of complete Seurat/Bioconductor datasets, it's also possible to selectively update specific components of an existing SOMA.
As in the Introduction, we'll use the pbmc_small dataset from the SeuratObject package.
data("pbmc_small", package = "SeuratObject")
pbmc_small
We'll start by creating a Seurat Assay object that contains only raw RNA counts.
pbmc_small_rna <- CreateAssayObject(
  counts = GetAssayData(pbmc_small[["RNA"]], "counts")
)
Key(pbmc_small_rna) <- Key(pbmc_small[["RNA"]])
Then we ingest this into a new SOMA on disk:
soma <- SOMA$new(uri = file.path(tempdir(), "pbmc_small_rna"))
soma$from_seurat_assay(pbmc_small_rna)
soma
This performed the following operations:
- Created the groups X, obsm, varm, obsp, varp, and misc.
- Created the array obs, with a single dimension containing all cell names and 0 attributes (because Seurat Assay objects do not contain cell-level metadata).
- Created the array var, with a single dimension containing all feature names and 0 attributes (because the feature-level metadata was empty).
- Created the array counts within the X group and ingested all data from pbmc_small_rna's counts slot.

Printing out the X group, we can see it contains only a single member, counts.
soma$X
Note: A Seurat Assay instantiated with data passed to the counts argument will store a reference to the same matrix in both the counts and data slots (i.e., GetAssayData(pbmc_small_rna, "counts") and GetAssayData(pbmc_small_rna, "data") return the same matrix). To avoid redundantly storing data on disk, tiledbsc only ingests data from the data slot when it is not identical to counts.
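The behavior described in the note can be checked directly with SeuratObject alone. The following is a minimal sketch, not tiledbsc's internal implementation: the names counts_only and same_matrix are illustrative, and the comparison with identical() is an assumption about how such a check could be written.

```r
library(SeuratObject)

data("pbmc_small", package = "SeuratObject")

# Build an Assay from raw counts only, as in the walkthrough above
counts_only <- CreateAssayObject(
  counts = GetAssayData(pbmc_small[["RNA"]], "counts")
)

# Compare the two slots; TRUE means the data slot holds nothing new
same_matrix <- identical(
  GetAssayData(counts_only, "counts"),
  GetAssayData(counts_only, "data")
)
same_matrix
```

If the result is TRUE, writing the data slot to disk would only duplicate what is already stored for counts.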
Now let's populate pbmc_small_rna's data slot with normalized counts.
pbmc_small_rna <- SetAssayData(
  pbmc_small_rna,
  slot = "data",
  new.data = GetAssayData(pbmc_small[["RNA"]], "data")
)
pbmc_small_rna
At this point, counts and data contain different transformations of the same matrix, but only the counts data has been written to disk.
We could simply rerun soma$from_seurat_assay(pbmc_small_rna) to update the SOMA and ingest the normalized counts matrix into X/data. However, this would perform another full conversion of the entire Assay object. As a result, data from the counts slot would be re-ingested into X/counts and feature-level metadata would be re-ingested into var.
To avoid these redundant ingestions, use the layers argument to specify which of the assay slots (counts, data, or scale.data) should be ingested. We also set var = FALSE to prevent re-ingestion of the feature-level metadata, since it hasn't changed.
soma$from_seurat_assay(pbmc_small_rna, layers = "data", var = FALSE)
Printing out the X group again reveals that it now contains 2 members: counts and data.
soma$X
Let's repeat this process for the scaled/centered version of the data that's stored in the scale.data slot.
pbmc_small_rna <- SetAssayData(
  pbmc_small_rna,
  slot = "scale.data",
  new.data = GetAssayData(pbmc_small[["RNA"]], "scale.data")
)
soma$from_seurat_assay(pbmc_small_rna, layers = "scale.data", var = FALSE)
soma$X
The X group now contains 3 arrays, each of which has been written to only once. We can verify this by counting the number of fragments contained within any of the individual arrays:
soma$X$members$counts$fragment_count()
TileDB creates a new fragment each time a write is performed, so 1 fragment confirms the array has been written to only once.
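The one-fragment-per-write rule can also be seen outside of tiledbsc. The sketch below uses the tiledb R package directly on a throwaway array; the URI, the data frame, and the variable names are all illustrative, and it assumes a fresh R session in which the temporary directory does not already contain an array named fragment_demo.

```r
library(tiledb)

uri <- file.path(tempdir(), "fragment_demo")
df <- data.frame(id = 1:3, value = c(0.1, 0.2, 0.3))

# First write: create a sparse array indexed by id and ingest the data
fromDataFrame(df, uri, col_index = "id")

# Second write: write shifted values to the same cells
arr <- tiledb_array(uri)
arr[] <- transform(df, value = value + 1)

# Count the fragments now present in the array
n_fragments <- tiledb_fragment_info_get_num(tiledb_fragment_info(uri))
n_fragments

# Remove the throwaway array
unlink(uri, recursive = TRUE)
```

Following the rule above, two separate writes should leave two fragments in the array.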
If the data in your Seurat Assay has changed since it was ingested into TileDB, you can follow the same basic process to update the array.
Here, we'll artificially shift all of the values in the scale.data slot by 1 and then re-ingest the updated data into the scale.data layer on disk:
pbmc_small_rna <- SetAssayData(
  pbmc_small_rna,
  slot = "scale.data",
  new.data = GetAssayData(pbmc_small[["RNA"]], "scale.data") + 1
)
soma$from_seurat_assay(pbmc_small_rna, layers = "scale.data", var = FALSE)
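Now let's check the number of fragments in each of X's layers. Since scale.data has now been written twice, it should report 2 fragments, while counts and data should still report 1 each: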
sapply( names(soma$X$members), function(x) soma$X$members[[x]]$fragment_count() )
Session Info
sessionInfo()
unlink(soma$uri, recursive = TRUE)