View source: R/fetch_processed_quant.R
fetch_processed_quant | R Documentation |
Fetch alevin-fry processed quantification result of publicly available datasets.
fetch_processed_quant(
  dataset_ids = c(),
  fetch_dir = "processed_quant",
  force = FALSE,
  delete_tar = FALSE,
  quiet = FALSE
)
dataset_ids |
integer scalar or vector providing the id of the available dataset(s) to be fetched. |
fetch_dir |
path to the directory where the fetched quantification results will be stored. It is created if it does not exist. |
force |
logical whether to force re-fetching the existing datasets. |
delete_tar |
logical whether to delete the compressed datasets after decompressing. If FALSE, the tar files will be stored in a folder called quant_tar under the |
quiet |
logical whether to suppress all messages. |
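How fetch_dir and force interact can be illustrated with a minimal sketch. The helper ids_to_fetch below is hypothetical and not part of the package. It assumes that each fetched dataset is stored in a subdirectory of fetch_dir named by its id, which may not match the actual on-disk layout.

```r
# Hypothetical helper (not part of the package): decide which dataset ids
# still need fetching, ASSUMING each fetched dataset lives in <fetch_dir>/<id>.
ids_to_fetch <- function(dataset_ids, fetch_dir = "processed_quant", force = FALSE) {
  # Create the target directory if it does not exist yet.
  if (!dir.exists(fetch_dir)) {
    dir.create(fetch_dir, recursive = TRUE)
  }
  # A dataset counts as present when its id-named directory exists.
  present <- dir.exists(file.path(fetch_dir, as.character(dataset_ids)))
  # With force = TRUE every id is fetched again; otherwise only missing ids.
  if (force) dataset_ids else dataset_ids[!present]
}

# Usage in a throwaway directory in which dataset 1 is already present.
demo_dir <- file.path(tempdir(), "processed_quant_demo")
dir.create(file.path(demo_dir, "1"), recursive = TRUE, showWarnings = FALSE)
need_default <- ids_to_fetch(c(1, 3), fetch_dir = demo_dir)
need_forced <- ids_to_fetch(c(1, 3), fetch_dir = demo_dir, force = TRUE)
print(need_default)
print(need_forced)
```

In this sketch, the default call skips dataset 1 because its directory already exists, while force = TRUE selects both ids again.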
The raw data for many single-cell and single-nucleus RNA-seq experiments are publicly available. However, certain datasets are used again and again, both to demonstrate data processing in tutorials and as benchmark datasets for novel methods (e.g. for clustering, dimensionality reduction, or cell type identification). In particular, 10x Genomics hosts on its website various publicly available datasets that were generated using its technology and processed with its Cell Ranger software.
We have created a Nextflow-based
alevin-fry
workflow that one can use to easily quantify
single-cell RNA-sequencing data. The
pipeline can be found here.
To test out this initial pipeline, we have begun to reprocess the
publicly-available datasets collected from the 10x website. We have
focused the initial effort on standard single-cell and single-nucleus
gene-expression data generated using the Chromium v2 and v3 chemistries,
but hope to expand the pipeline to more complex protocols soon
(e.g. feature barcoding experiments) and process those data as well.
We note that these more complex protocols can already be processed with
alevin-fry
(see the
alevin-fry tutorials),
but these have just not yet been incorporated into the
automated Nextflow-based workflow linked above.
Below we list the name, link, and dataset id of each currently available dataset whose quantification result is ready to be fetched. To obtain the details of these datasets as a data frame, simply run 'fetch_processed_quant()' in R.
10k Peripheral blood mononuclear cells (PBMCs) from a healthy donor, Single Indexed
10k Peripheral blood mononuclear cells (PBMCs) from a healthy donor, Dual Indexed
PBMCs from Citrate-Treated Cell Preparation Tubes (3' v3.1 Chemistry)
Whole Blood RBC Lysis for PBMCs and Neutrophils, Granulocytes, 3'
Peripheral blood mononuclear cells (PBMCs) from a healthy donor - Manual (channel 5)
Peripheral blood mononuclear cells (PBMCs) from a healthy donor - Manual (channel 1)
Peripheral blood mononuclear cells (PBMCs) from a healthy donor - Chromium Connect (channel 5)
Peripheral blood mononuclear cells (PBMCs) from a healthy donor - Chromium Connect (channel 1)
Hodgkin's Lymphoma, Dissociated Tumor: Whole Transcriptome Analysis
200 Sorted Cells from Human Glioblastoma Multiforme, 3’ LT v3.1
750 Sorted Cells from Human Invasive Ductal Carcinoma, 3’ LT v3.1
7.5k Sorted Cells from Human Invasive Ductal Carcinoma, 3’ v3.1
Human Glioblastoma Multiforme: 3’v3 Whole Transcriptome Analysis
10k Mouse E18 Combined Cortex, Hippocampus and Subventricular Zone Cells, Single Indexed
10k Mouse E18 Combined Cortex, Hippocampus and Subventricular Zone Cells, Dual Indexed
Note that because the dataset names are long, the stored datasets are named by their id.
If an empty dataset_ids is provided, a data frame containing information on the available datasets is returned; otherwise, a list of ProcessedQuant class objects is returned, in which each ProcessedQuant object stores the information of one fetched dataset. The 'quant_path' field holds the path to the quantification result of the fetched dataset.
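As a rough illustration of working with the list form of the return value, the sketch below collects the quant_path of every element. DemoProcessedQuant is a stand-in class defined here for illustration only. The real ProcessedQuant class comes from the package and may have additional slots, and naming the list elements by dataset id is an assumption.

```r
library(methods)

# Stand-in S4 class for illustration only; it mimics just the quant_path
# field described above, not the real ProcessedQuant definition.
setClass("DemoProcessedQuant", representation(quant_path = "character"))

# Build a list shaped like the described return value.
# ASSUMPTION: elements are named by dataset id.
fetched <- list(
  "1" = new("DemoProcessedQuant", quant_path = file.path("processed_quant", "1")),
  "3" = new("DemoProcessedQuant", quant_path = file.path("processed_quant", "3"))
)

# Collect the quantification path of every fetched dataset into a named
# character vector, one entry per dataset id.
quant_paths <- vapply(fetched, function(pq) pq@quant_path, character(1))
print(quant_paths)
```

Using vapply with character(1) makes the extraction fail early if any element lacks a single quant_path string.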
Dongze He
## Not run:
library(roe)

# run the function
available_datasets = load_processed_quant()
fetched_quant_list = fetch_processed_quant(dataset_ids = c(1, 3),
                                           fetch_dir = "processed_quant",
                                           force = FALSE,
                                           delete_tar = FALSE,
                                           quiet = FALSE)

# access each fetched dataset by its id (here 1 and 3)
print(fetched_quant_list$"1"@quant_path)
print(fetched_quant_list$"3"@quant_path)
## End(Not run)