queryATAC | R Documentation |
This function allows you to search and subset the included scATAC-seq datasets. A named list of scATAC-seq_data objects matching the provided options will be returned. Some included datasets are represented using multiple matrices; each matrix will be a separate named object within the list. The returned list is named by matrix to allow easy identification of the data. If queryATAC is called without any options, it will retrieve all available datasets in sparse matrix format. This should only be done on machines with a large amount of RAM (>64 GB), because some datasets are quite large. In most cases it is recommended to filter the datasets by some criteria instead.
queryATAC(
accession = NULL,
author = NULL,
journal = NULL,
year = NULL,
pmid = NULL,
sequence_tech = NULL,
score_type = NULL,
has_cluster_annotation = NULL,
has_cell_type_annotation = NULL,
organism = NULL,
genome_build = NULL,
broad_cell_category = NULL,
tissue_cell_type = NULL,
disease = NULL,
metadata_only = FALSE,
sparse = TRUE
)
accession |
Search by GEO accession number. Good for returning individual datasets. |
author |
Search by the author who published the dataset |
journal |
Search by the journal the dataset was published in. |
year |
Search by exact year or year ranges with '<', '>', or '-'. For example, you can return datasets newer than 2013 with '>2013' |
pmid |
Search by PubMed ID associated with the study. Good for returning individual datasets. |
sequence_tech |
Search by sequencing technology used to sample the cells. |
score_type |
Search by type of score (TPM, FPKM, raw count) |
has_cluster_annotation |
Return only those datasets that have clustering results available, or only those without (TRUE/FALSE) |
has_cell_type_annotation |
Return only those datasets that have cell-type annotations available, or only those without annotations (TRUE/FALSE) |
organism |
Search by source organism used in the study, for example human or mouse. |
genome_build |
Return datasets built only using specified genome build (ex. hg19) |
broad_cell_category |
Return datasets based on broad cell categories (ex. Hematopoetic cells). To view all cell categories available, explore the metadata table |
tissue_cell_type |
Return datasets based on tissue or cell types sampled (ex. PBMCs, Bone marrow, Oligodendrocytes) |
disease |
Return datasets based on sampled disease (ex. carcinoma, leukemia, diabetes) |
metadata_only |
Return rows of metadata instead of actual datasets. Useful for exploring what data is available without actually downloading data. Defaults to FALSE |
sparse |
Return expression as a sparse matrix. The sparse format is recommended, as dense formats tend to be excessively large. Defaults to TRUE |
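To give a feel for why the sparse format is recommended, the following is a minimal sketch that compares the memory footprint of the same toy matrix stored sparsely and densely. It does not use any included dataset: the matrix dimensions, the 2% density, and the variable names are illustrative assumptions, and the Matrix package (distributed with R) stands in for whatever storage the returned objects use.

```r
library(Matrix)  # recommended package distributed with R

set.seed(1)
# Toy peak-by-cell matrix: 2,000 peaks x 500 cells, about 2% non-zero entries.
# Dimensions and density are illustrative assumptions only.
counts_sparse <- rsparsematrix(nrow = 2000, ncol = 500, density = 0.02,
                               rand.x = function(n) rpois(n, 2) + 1)

# Same values, stored as an ordinary dense matrix
counts_dense <- as.matrix(counts_sparse)

sparse_bytes <- as.numeric(object.size(counts_sparse))
dense_bytes  <- as.numeric(object.size(counts_dense))

cat("sparse:", format(sparse_bytes, big.mark = ","), "bytes\n")
cat("dense: ", format(dense_bytes, big.mark = ","), "bytes\n")
```

The dense copy stores every zero explicitly, so its size grows with the full matrix dimensions rather than with the number of non-zero entries.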
A list containing a table of metadata or one or more SingleCellExperiment objects
## Retrieve the metadata table to see what data is available
res <- queryATAC(metadata_only = TRUE)
## Retrieve a single dataset based on its accession number
res <- queryATAC(accession = "GSE129785")
## Retrieve the metadata of datasets between 2016 and 2020
res <- queryATAC(year = "2016-2020", metadata_only = TRUE)
## From the table of datasets between 2016 and 2020,
## retrieve the dataset on the third row.
res <- queryATAC(year = "2016-2020")[[3]]
## Retrieve a filtered metadata table that only shows mouse
## datasets derived from blood cells with cell type annotations
res_mus <- queryATAC(has_cell_type_annotation = TRUE,
                     organism = "Mus musculus",
                     tissue_cell_type = "blood",
                     metadata_only = TRUE)
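The filter strings accepted by the year argument ('>', '<', '-', or an exact year) can be pictured with a small base-R sketch. The helper match_year below is hypothetical and is not part of the package; it only illustrates one plausible reading of the syntax, assuming that '>' and '<' are strict comparisons and that a range such as "2016-2020" includes both endpoints.

```r
# Hypothetical helper (not the package's implementation): returns TRUE for
# each element of `years` that satisfies the filter string.
match_year <- function(years, filter) {
  filter <- gsub("\\s+", "", filter)          # ignore stray whitespace
  if (grepl("^>", filter)) {
    years > as.numeric(sub("^>", "", filter)) # strictly newer than
  } else if (grepl("^<", filter)) {
    years < as.numeric(sub("^<", "", filter)) # strictly older than
  } else if (grepl("-", filter, fixed = TRUE)) {
    bounds <- as.numeric(strsplit(filter, "-", fixed = TRUE)[[1]])
    years >= bounds[1] & years <= bounds[2]   # assumed inclusive range
  } else {
    years == as.numeric(filter)               # exact year
  }
}

# Toy publication years, invented for illustration
toy_years <- c(2012, 2015, 2016, 2018, 2020, 2021)

newer_than_2013 <- toy_years[match_year(toy_years, ">2013")]
in_range        <- toy_years[match_year(toy_years, "2016-2020")]
print(newer_than_2013)
print(in_range)
```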