available_samples: List available samples in recount3


available_samples    R Documentation

List available samples in recount3

Description

This function returns a data.frame() with the samples that are available from recount3. Note that a given sample might be available from one data_source and from zero, one, or several collections.

Usage

available_samples(
  organism = c("human", "mouse"),
  recount3_url = getOption("recount3_url", "http://duffel.rail.bio/recount3"),
  bfc = recount3_cache(),
  verbose = getOption("recount3_verbose", TRUE),
  available_homes = project_homes(organism = organism, recount3_url = recount3_url)
)

Arguments

organism

A character(1) specifying which organism you want to download data from. Supported options are "human" and "mouse"; if left unspecified, "human" is used.

recount3_url

A character(1) specifying the home URL for recount3 or a local directory where you have mirrored recount3. Defaults to the load balancer http://duffel.rail.bio/recount3, but can also be https://recount-opendata.s3.amazonaws.com/recount3/release from https://registry.opendata.aws/recount/ or SciServer datascope from IDIES at JHU https://sciserver.org/public-data/recount3/data. You can set the R option recount3_url (for example in your .Rprofile) if you have a favorite mirror.

bfc

A BiocFileCache-class object where the files will be cached to, typically created by recount3_cache().

verbose

A logical(1) indicating whether to show messages with updates.

available_homes

A character() vector with the available project homes for the given recount3_url. If you use a non-standard recount3_url, you will likely need to specify the valid values for available_homes manually.
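As the Usage section shows, the default for recount3_url is looked up with getOption("recount3_url", ...). The following minimal sketch uses only base R to illustrate that lookup: it sets the option to the AWS mirror listed above, reads it back the same way the default does, and then restores the previous setting. The variable names (default_url, old_options, active_url, restored_url) are illustrative only and are not part of recount3.

```r
## Default load balancer URL, copied from the Usage section.
default_url <- "http://duffel.rail.bio/recount3"

## Point the session at the AWS mirror, keeping the previous setting
## so that it can be restored afterwards.
old_options <- options(
    recount3_url = "https://recount-opendata.s3.amazonaws.com/recount3/release"
)

## The same lookup that the recount3_url default argument performs.
active_url <- getOption("recount3_url", default_url)
print(active_url)

## Restore whatever was set before (or unset the option again).
options(old_options)
restored_url <- getOption("recount3_url", default_url)
```

Placing the options() call in your .Rprofile, as suggested for recount3_url, would make the mirror the default for every session without passing recount3_url explicitly.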

Value

A data.frame() with the sample ID used by the original source of the data (external_id), the project ID (project), the organism, the source from which the data was accessed (file_source), the date the sample was processed (date_processed) in YYYY-MM-DD format, the recount3 project home location (project_home), and the project type (project_type), which differentiates between data_sources and compilations.
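To show how these columns can be used without downloading anything, here is a small sketch on a toy data.frame that imitates the documented layout. Every value in it is made up for illustration (the IDs, dates, and project_home strings are assumptions, not real recount3 records), and the column types of the real output may differ.

```r
## Toy stand-in for the output of available_samples().
## All rows are hypothetical and only mimic the documented column names.
toy_samples <- data.frame(
    external_id = c("SAMPLE_A", "SAMPLE_B", "SAMPLE_C"),
    project = c("PROJECT_1", "PROJECT_1", "PROJECT_2"),
    organism = "human",
    file_source = c("sra", "sra", "gtex"),
    date_processed = c("2019-10-01", "2019-10-02", "2019-11-15"),
    project_home = c("data_sources/sra", "data_sources/sra", "data_sources/gtex"),
    project_type = "data_sources",
    stringsAsFactors = FALSE
)

## How many samples does each project have?
samples_per_project <- table(toy_samples$project)
print(samples_per_project)

## date_processed is documented as YYYY-MM-DD, which as.Date() parses
## directly, so the processing window can be computed with range().
toy_samples$date_processed <- as.Date(toy_samples$date_processed)
processing_window <- range(toy_samples$date_processed)
print(processing_window)

## Keep only the samples of a single project.
project_1_samples <- toy_samples[toy_samples$project == "PROJECT_1", ]
```

The same calls should work on the data.frame returned by available_samples() as long as it has the columns described above.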

Examples


## Find all the human samples available from recount3
human_samples <- available_samples()
dim(human_samples)
head(human_samples)

## How many are from a data source vs a compilation?
table(human_samples$project_type, useNA = "ifany")

## What are the unique file sources?
table(
    human_samples$file_source[human_samples$project_type == "data_sources"]
)

## Find all the mouse samples available from recount3
mouse_samples <- available_samples("mouse")
dim(mouse_samples)
head(mouse_samples)

## How many are from a data source vs a compilation?
table(mouse_samples$project_type, useNA = "ifany")

LieberInstitute/recount3 documentation built on May 4, 2024, 4:16 a.m.