Obtain the damond-pancreas-2019 dataset, which consists of three data objects: single cell data, multichannel images and cell segmentation masks. The data was obtained by imaging mass cytometry of human pancreas sections from donors with type 1 diabetes.
DamondPancreas2019Data(
    data_type = c("sce", "images", "masks"),
    metadata = FALSE,
    on_disk = FALSE,
    h5FilesPath = NULL,
    force = FALSE
)
data_type: type of object to load; should be 'sce' for single cell data, 'images' for multichannel images or 'masks' for cell segmentation masks.
metadata: if FALSE (default), the data object selected in
data_type is returned; if TRUE, only the metadata associated with that object are returned.
on_disk: logical indicating whether images, in the form of
HDF5Array objects (as .h5 files), should be stored on disk
rather than in memory. This setting is valid when downloading
images and masks.
h5FilesPath: path to where the .h5 files for on disk representation
are stored. This path needs to be defined when
on_disk = TRUE.
force: logical indicating whether images should be overwritten when files with the same name already exist on disk.
This is an Imaging Mass Cytometry (IMC) dataset from Damond et al. (2019), consisting of three data objects:
images contains a hundred 38-channel
images in the form of a CytoImageList class object.
masks contains the cell segmentation
masks associated with the images, in the form of a
CytoImageList class object.
sce contains the single cell data extracted from the
multichannel images using the cell segmentation masks, as well as the
associated metadata, in the form of a SingleCellExperiment.
This represents a total of 252,059 cells x 38 channels.
All data are downloaded from ExperimentHub and cached for local re-use.
Mapping between the three data objects is performed via variables located in
their metadata columns:
mcols() for the CytoImageList objects and
colData() for the SingleCellExperiment
object. Mapping at the image level can be performed with the
ImageNumber variables. Mapping between cell
segmentation masks and single cell data is performed with the
CellNumber variable, the values of which correspond to the intensity
values of the
DamondPancreas2019_masks object. For practical
examples, please refer to the "Accessing IMC datasets" vignette.
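As a rough illustration of this two-level mapping, the base R sketch below uses a made-up 4 x 4 mask and a hypothetical cell table instead of the real objects. It assumes that mask pixel values are cell IDs (with 0 as background) that match CellNumber within the image identified by ImageNumber; the cell type labels are placeholders. With the real data, the same lookup would read from mcols() and colData() rather than from plain data frames.

```r
# Toy segmentation mask for one image: each pixel holds a cell ID, 0 = background.
# (Made-up values; real masks come from DamondPancreas2019Data(data_type = "masks").)
mask <- matrix(c(0, 1, 1, 0,
                 0, 1, 2, 2,
                 3, 3, 2, 2,
                 3, 3, 0, 0),
               nrow = 4, byrow = TRUE)
mask_image_number <- 1

# Toy single cell table standing in for colData(sce); labels are placeholders.
cells <- data.frame(
  ImageNumber = c(1, 1, 1, 2),
  CellNumber  = c(1, 2, 3, 1),
  CellType    = c("beta", "alpha", "Tc", "beta"),
  stringsAsFactors = FALSE
)

# Step 1: image-level mapping via ImageNumber.
cells_in_image <- cells[cells$ImageNumber == mask_image_number, ]

# Step 2: cell-level mapping via CellNumber (the mask intensity values).
# Background pixels (0) have no match and become NA.
idx <- match(mask, cells_in_image$CellNumber)
pixel_cell_type <- matrix(cells_in_image$CellType[idx], nrow = nrow(mask))
```

Restricting to one image before matching matters because CellNumber values repeat across images: in the toy table, cell 1 exists in both image 1 and image 2.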
This dataset is a subset of the complete Damond et al. (2019) dataset comprising the data from three pancreas donors at different stages of type 1 diabetes (T1D). The three donors present clearly diverging characteristics in terms of cell type composition and cell-cell interactions, which makes this dataset ideal for benchmarking spatial and neighborhood analysis algorithms.
The assay slot of the SingleCellExperiment object
contains two assays:
counts contains mean ion counts per cell.
exprs contains arsinh-transformed counts, with cofactor 1.
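The relationship between the two assays can be sketched in base R. This is a minimal example on made-up values, assuming the usual convention exprs = asinh(counts / cofactor); the exact code used to build the dataset is not shown here, and the channel and cell names are placeholders.

```r
# Toy matrix of mean ion counts (channels x cells); values are made up.
counts <- matrix(c(0, 1, 10, 100, 0.5, 2.5),
                 nrow = 2,
                 dimnames = list(c("channel_a", "channel_b"),
                                 paste0("cell", 1:3)))

cofactor <- 1  # cofactor stated in the text

# Inverse hyperbolic sine transform: asinh(x) = log(x + sqrt(x^2 + 1)).
exprs <- asinh(counts / cofactor)

# The transform is invertible with sinh(), so counts can be recovered.
recovered <- sinh(exprs) * cofactor
```

Unlike a plain log transform, asinh is defined at zero, so cells with no signal in a channel keep a value of 0.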
The marker-associated metadata, including antibody information and metal tags,
are stored in the
rowData of the SingleCellExperiment object.
The cell-associated metadata are stored in the
colData of the
SingleCellExperiment object. These metadata include cell types (in
colData(sce)$CellType) and broader cell categories, such as
"immune" or "islet" cells (in
colData(sce)$CellCat). In addition,
for cells located inside pancreatic islets, the islet they belong to is
indicated in colData(sce)$ParentIslet. For cells not located in
islets, the "ParentIslet" value is set to 0 but the spatially closest islet
can be identified with
The donor-associated metadata are also stored in the
colData of the
SingleCellExperiment object. For instance, the donors' IDs can
be retrieved with
colData(sce)$case and the donors' disease stage can
be obtained with
The three donors present the following characteristics:
6126 is a non-diabetic donor, with large islets containing
many beta cells, severe infiltration of the exocrine pancreas with
myeloid cells but limited infiltration of islets.
6414 is a donor with recent T1D onset (shortly after
diagnosis) showing partial beta cell destruction and mild infiltration of
islets with T cells.
6180 is a donor with long-duration T1D (11 years after
diagnosis), showing near-total beta cell destruction and limited immune
cell infiltration in both the islets and the pancreas.
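Donor-level comparisons like the ones above can be approximated by cross-tabulating the cell metadata. The sketch below is a minimal base R example on a made-up data frame standing in for colData(sce): the column names case, CellCat and ParentIslet come from the description above, but every value is invented for illustration and does not reflect the real cell counts.

```r
# Toy stand-in for colData(sce); all values are made up for illustration.
cells <- data.frame(
  case        = c(6126, 6126, 6126, 6414, 6414, 6180),
  CellCat     = c("islet", "islet", "immune", "islet", "immune", "immune"),
  ParentIslet = c(12, 12, 0, 7, 7, 0),
  stringsAsFactors = FALSE
)

# Number of cells per donor and broad cell category.
cat_by_donor <- table(cells$case, cells$CellCat)

# Cells located inside an islet have a non-zero ParentIslet value.
in_islet <- cells$ParentIslet != 0
islet_cells_per_donor <- tapply(in_islet, cells$case, sum)
```

With the real object, the same calls would take their inputs from colData(sce)$case, colData(sce)$CellCat and colData(sce)$ParentIslet.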
`images`: size in memory = 7.40 Gb, size on disk = 1.78 Gb.
`masks`: size in memory = 200 Mb, size on disk = 8.6 Mb.
`sce`: size in memory = 248 Mb, size on disk = 145 Mb.
When images are stored on disk, they must first be read fully into memory before being written to disk. Downloading the data is therefore slower than keeping the images directly in memory. In return, downstream analysis avoids the memory overhead of holding all images in memory.
Original source: Damond et al. (2019): https://doi.org/10.1016/j.cmet.2018.11.014
Original link to raw data, also containing the entire dataset: https://data.mendeley.com/datasets/cydmwsfztj/2
A SingleCellExperiment object with single cell data, a CytoImageList object containing multichannel images, or a CytoImageList object containing cell masks.
Damond N et al. (2019). A Map of Human Type 1 Diabetes Progression by Imaging Mass Cytometry. Cell Metab 29(3), 755-768.
# Load single cell data
sce <- DamondPancreas2019Data(data_type = "sce")
print(sce)

# Display metadata
DamondPancreas2019Data(data_type = "sce", metadata = TRUE)

# Load masks on disk
library(HDF5Array)
masks <- DamondPancreas2019Data(data_type = "masks", on_disk = TRUE,
    h5FilesPath = getHDF5DumpDir())
print(head(masks))