available_projects: List available projects in recount3
In LieberInstitute/recount3: Explore and download data from the recount3 project

Description

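List the projects available in recount3 for a given organism. The result has one row per project, along with its source, project home, type, and number of samples.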

Usage

available_projects(
  organism = c("human", "mouse"),
  recount3_url = getOption("recount3_url", "http://duffel.rail.bio/recount3"),
  bfc = recount3_cache(),
  available_homes = project_homes(organism = organism, recount3_url = recount3_url)
)

Arguments

`organism`	A `character(1)` specifying which organism you want to download data from. Supported options are `"human"` or `"mouse"`.
`recount3_url`	A `character(1)` specifying the home URL for `recount3` or a local directory where you have mirrored `recount3`. Defaults to the load balancer http://duffel.rail.bio/recount3. Alternatives are the AWS Open Data mirror https://recount-opendata.s3.amazonaws.com/recount3/release (described at https://registry.opendata.aws/recount/) and the SciServer datascope from IDIES at JHU, https://sciserver.org/public-data/recount3/data. If you have a preferred mirror, you can set the R option `recount3_url` (for example in your `.Rprofile`).
`bfc`	A BiocFileCache-class object where the files will be cached to, typically created by `recount3_cache()`.
`available_homes`	A `character()` vector with the available project homes for the given `recount3_url`. If you use a non-standard `recount3_url`, you will likely need to manually specify the valid values for `available_homes`.
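
As a minimal sketch of the `recount3_url` option described above, the following sets a preferred mirror for the current session and reads it back with the same `getOption()` lookup that appears in the Usage section. The choice of the AWS mirror and the names `old_opts` and `active_url` are illustrative assumptions, not part of the package.

```r
## Assumption: the AWS Open Data mirror listed under `recount3_url` is the
## preferred mirror. options() returns the previous values, so keep them
## around in order to restore the earlier setting later.
old_opts <- options(
    recount3_url = "https://recount-opendata.s3.amazonaws.com/recount3/release"
)

## Same lookup as the default of `recount3_url` in the Usage section: the
## option wins if it is set, otherwise the load balancer URL is used.
active_url <- getOption("recount3_url", "http://duffel.rail.bio/recount3")
active_url

## Restore whatever was configured before
options(old_opts)
```

Placing the `options()` call in your `.Rprofile` instead would apply it to every new R session.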

Value

A `data.frame()` with the following columns: the project ID (`project`), the `organism`, the `file_source` from which the data was accessed, the recount3 project home location (`project_home`), the `project_type`, which differentiates between `data_sources` and compilations, and `n_samples`, the number of samples in the given project.
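
Because the return value is a regular `data.frame()`, it can be filtered and sorted with base R. The sketch below uses a small hypothetical stand-in table so that it runs without network access: the column names follow the description above, but the rows, the project IDs, and the 40-sample cutoff are invented for illustration only.

```r
## Hypothetical stand-in for the output of available_projects(). Only the
## column names come from the Value description; every row is made up.
toy_projects <- data.frame(
    project = c("PROJ_A", "PROJ_B", "PROJ_C"),
    organism = "human",
    file_source = c("sra", "gtex", "sra"),
    project_home = c("data_sources/sra", "data_sources/gtex", "data_sources/sra"),
    project_type = "data_sources",
    n_samples = c(12L, 300L, 48L),
    stringsAsFactors = FALSE
)

## Keep data-source projects with at least 40 samples (arbitrary cutoff)
big <- subset(toy_projects, project_type == "data_sources" & n_samples >= 40)

## Sort the remaining projects from largest to smallest
big <- big[order(big$n_samples, decreasing = TRUE), ]
big$project
```

The same `subset()` and `order()` calls should apply unchanged to a real result such as `human_projects` from the Examples section.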

Examples


## Find all the human projects
human_projects <- available_projects()

## Explore the results
dim(human_projects)
head(human_projects)

## How many are from a data source vs a compilation?
table(human_projects$project_type, useNA = "ifany")

## What are the unique file sources?
table(
    human_projects$file_source[human_projects$project_type == "data_sources"]
)

## Note that big projects are broken up to make them easier to access
## For example, GTEx and TCGA are broken up by tissue
head(subset(human_projects, file_source == "gtex"))
head(subset(human_projects, file_source == "tcga"))

## Find all the mouse projects
mouse_projects <- available_projects(organism = "mouse")

## Explore the results
dim(mouse_projects)
head(mouse_projects)

## How many are from a data source vs a compilation?
table(mouse_projects$project_type, useNA = "ifany")

## What are the unique file sources?
table(
    mouse_projects$file_source[mouse_projects$project_type == "data_sources"]
)

## Not run: 
## Use with a custom recount3_url:
available_projects(
    recount3_url = "http://snaptron.cs.jhu.edu/data/temp/recount3test",
    available_homes = "data_sources/sra"
)

## You can also rely on project_homes() if the custom URL has a text file
## that can be read with readLines() at:
## <recount3_url>/<organism>/homes_index
available_projects(
    recount3_url = "http://snaptron.cs.jhu.edu/data/temp/recount3test"
)

## End(Not run)

LieberInstitute/recount3 documentation built on June 13, 2025, 6:19 a.m.