fetch_processed_quant: Fetch preprocessed quantification result.

View source: R/fetch_processed_quant.R

fetch_processed_quant    R Documentation

Fetch preprocessed quantification result.

Description

Fetch alevin-fry processed quantification result of publicly available datasets.

Usage

fetch_processed_quant(
  dataset_ids = c(),
  fetch_dir = "processed_quant",
  force = FALSE,
  delete_tar = FALSE,
  quiet = FALSE
)

Arguments

dataset_ids

integer scalar or vector providing the id of the available dataset(s) to be fetched.

fetch_dir

path to the directory where the fetched quantification results will be stored. It will be created if it does not exist.

force

logical indicating whether to re-fetch datasets that have already been fetched.

delete_tar

logical indicating whether to delete the compressed datasets after decompressing them. If FALSE, the tar files will be stored in a folder called quant_tar under fetch_dir.

quiet

logical indicating whether to suppress messages.
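As a rough illustration of how force and delete_tar interact, the sketch below fetches one dataset twice. It assumes only what the argument descriptions above state (in particular, that retained tar files live in a quant_tar folder under fetch_dir). The variable names fetch_dir and tar_dir are illustrative, and the calls are guarded so that nothing is downloaded unless roe is installed and the session is interactive.

```r
# Where compressed archives are expected when delete_tar = FALSE,
# following the description of delete_tar above.
fetch_dir <- "processed_quant"
tar_dir <- file.path(fetch_dir, "quant_tar")

# Guarded: runs only in an interactive session with roe installed.
if (interactive() && requireNamespace("roe", quietly = TRUE)) {
  # First call: fetch dataset 1 and keep its tar file.
  roe::fetch_processed_quant(dataset_ids = 1, fetch_dir = fetch_dir,
                             delete_tar = FALSE, quiet = TRUE)

  # Retained archives, if any, should show up in this folder.
  print(list.files(tar_dir))

  # Second call: force = TRUE re-fetches dataset 1 even though it already
  # exists, and delete_tar = TRUE asks for the archive to be removed
  # after decompression.
  roe::fetch_processed_quant(dataset_ids = 1, fetch_dir = fetch_dir,
                             force = TRUE, delete_tar = TRUE, quiet = TRUE)
}
```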

Details

The raw data for many single-cell and single-nucleus RNA-seq experiments is publicly available. However, certain datasets are used again and again, both to demonstrate data processing in tutorials and as benchmark datasets for novel methods (e.g. for clustering, dimensionality reduction, or cell type identification). In particular, 10x Genomics hosts on their website various publicly available datasets that were generated using their technology and processed via their Cell Ranger software.

We have created a Nextflow-based alevin-fry workflow that one can use to easily quantify single-cell RNA-sequencing data in a single workflow. The pipeline can be found here. To test out this initial pipeline, we have begun to reprocess the publicly available datasets collected from the 10x website. We have focused the initial effort on standard single-cell and single-nucleus gene-expression data generated using the Chromium v2 and v3 chemistries, but hope to soon expand the pipeline to more complex protocols (e.g. feature barcoding experiments) and process those data as well. We note that these more complex protocols can already be processed with alevin-fry (see the alevin-fry tutorials); they have simply not yet been incorporated into the automated Nextflow-based workflow linked above.

Below we list the name, link and dataset id of the currently available datasets whose quantification results are ready to be fetched. To obtain the details of these available datasets as a data frame, simply run 'fetch_processed_quant()' in R.

  1. 500 Human PBMCs, 3' LT v3.1, Chromium Controller

  2. 500 Human PBMCs, 3' LT v3.1, Chromium X

  3. 1k PBMCs from a Healthy Donor (v3 chemistry)

  4. 10k PBMCs from a Healthy Donor (v3 chemistry)

  5. 10k Human PBMCs, 3' v3.1, Chromium X

  6. 10k Human PBMCs, 3' v3.1, Chromium Controller

  7. 10k Peripheral blood mononuclear cells (PBMCs) from a healthy donor, Single Indexed

  8. 10k Peripheral blood mononuclear cells (PBMCs) from a healthy donor, Dual Indexed

  9. 20k Human PBMCs, 3' HT v3.1, Chromium X

  10. PBMCs from EDTA-Treated Blood Collection Tubes Isolated via SepMate-Ficoll Gradient (3' v3.1 Chemistry)

  11. PBMCs from Heparin-Treated Blood Collection Tubes Isolated via SepMate-Ficoll Gradient (3' v3.1 Chemistry)

  12. PBMCs from ACD-A Treated Blood Collection Tubes Isolated via SepMate-Ficoll Gradient (3' v3.1 Chemistry)

  13. PBMCs from Citrate-Treated Blood Collection Tubes Isolated via SepMate-Ficoll Gradient (3' v3.1 Chemistry)

  14. PBMCs from Citrate-Treated Cell Preparation Tubes (3' v3.1 Chemistry)

  15. PBMCs from a Healthy Donor: Whole Transcriptome Analysis

  16. Whole Blood RBC Lysis for PBMCs and Neutrophils, Granulocytes, 3'

  17. Peripheral blood mononuclear cells (PBMCs) from a healthy donor - Manual (channel 5)

  18. Peripheral blood mononuclear cells (PBMCs) from a healthy donor - Manual (channel 1)

  19. Peripheral blood mononuclear cells (PBMCs) from a healthy donor - Chromium Connect (channel 5)

  20. Peripheral blood mononuclear cells (PBMCs) from a healthy donor - Chromium Connect (channel 1)

  21. Hodgkin's Lymphoma, Dissociated Tumor: Whole Transcriptome Analysis

  22. 200 Sorted Cells from Human Glioblastoma Multiforme, 3' LT v3.1

  23. 750 Sorted Cells from Human Invasive Ductal Carcinoma, 3' LT v3.1

  24. 2k Sorted Cells from Human Glioblastoma Multiforme, 3' v3.1

  25. 7.5k Sorted Cells from Human Invasive Ductal Carcinoma, 3' v3.1

  26. Human Glioblastoma Multiforme: 3'v3 Whole Transcriptome Analysis

  27. 1k Brain Cells from an E18 Mouse (v3 chemistry)

  28. 10k Brain Cells from an E18 Mouse (v3 chemistry)

  29. 1k Heart Cells from an E18 mouse (v3 chemistry)

  30. 10k Heart Cells from an E18 mouse (v3 chemistry)

  31. 10k Mouse E18 Combined Cortex, Hippocampus and Subventricular Zone Cells, Single Indexed

  32. 10k Mouse E18 Combined Cortex, Hippocampus and Subventricular Zone Cells, Dual Indexed

  33. 1k PBMCs from a Healthy Donor (v2 chemistry)

  34. 1k Brain Cells from an E18 Mouse (v2 chemistry)

  35. 1k Heart Cells from an E18 mouse (v2 chemistry)

Note that because the dataset names are long, the stored datasets are named by their id.

Value

If dataset_ids is empty, a data frame containing the information of the available datasets is returned. Otherwise, a list of ProcessedQuant class objects is returned, in which each ProcessedQuant object stores the information of one fetched dataset. The 'quant_path' field of each object holds the path to the quantification result of that dataset.
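The returned list can be summarised in one step. The following is a minimal sketch, not part of roe: collect_quant_paths is a hypothetical helper that assumes each list element is an S4 object whose 'quant_path' slot holds a single character string, as described above. The call to fetch_processed_quant is guarded so that it only runs in an interactive session with roe installed.

```r
# Hypothetical helper (not part of roe): gather the quant_path of every
# fetched dataset into a character vector, keeping the list names.
collect_quant_paths <- function(fetched) {
  # Assumes each element has a 'quant_path' slot of length one.
  vapply(fetched,
         function(pq) methods::slot(pq, "quant_path"),
         character(1))
}

# Guarded: runs only in an interactive session with roe installed.
if (interactive() && requireNamespace("roe", quietly = TRUE)) {
  fetched <- roe::fetch_processed_quant(dataset_ids = c(1, 3), quiet = TRUE)
  print(collect_quant_paths(fetched))
}
```

Because vapply checks that every result is a length-one character value, the helper fails loudly if an element does not match the assumed structure rather than silently returning a malformed result.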

Author(s)

Dongze He

Examples


## Not run: 
library(roe)
# list the available datasets as a data frame
available_datasets = fetch_processed_quant()

# fetch the datasets with ids 1 and 3
fetched_quant_list = fetch_processed_quant(dataset_ids = c(1, 3),
                                           fetch_dir = "processed_quant",
                                           force = FALSE,
                                           delete_tar = FALSE,
                                           quiet = FALSE)

# the returned list is indexed by dataset id
print(fetched_quant_list$"1"@quant_path)
print(fetched_quant_list$"3"@quant_path)

## End(Not run)


COMBINE-lab/roe documentation built on Nov. 8, 2022, 5:23 p.m.