projects: HCA Project Querying
In Bioconductor/hca: Exploring the Human Cell Atlas Data Coordinating Platform

Description

projects() queries the HCA API for information about available projects, using the filters and options supplied by the user.

projects_facets() summarizes facets and terms used by all records in the projects index.

*_default_columns() (here, projects_default_columns()) returns a tibble or named character vector describing the content of the tibble returned by projects(), files(), samples(), or bundles().

projects_detail() takes the unique identifier (uuid) of a project and the catalog containing it, and returns details about the specified project as a list-of-lists.

See project_information() and project_title() to easily summarize a project from its project id.

Usage

projects(
  filters = NULL,
  size = 1000L,
  sort = "projectTitle",
  order = c("asc", "desc"),
  catalog = NULL,
  as = c("tibble", "lol", "list", "tibble_expanded"),
  columns = projects_default_columns("character")
)

projects_facets(facet = character(), catalog = NULL)

projects_default_columns(as = c("tibble", "character"))

projects_detail(uuid, catalog = NULL)

Arguments

`filters`	filter object created by `filters()`, or `NULL` (default; all projects).
`size`	integer(1) maximum number of results to return. The default (1000) is meant to be large enough to return all projects matching `filters`.
`sort`	character(1) project facet (see `facet_options()`) by which to sort results; default: `"projectTitle"`.
`order`	character(1) sort order. One of `"asc"` (ascending) or `"desc"` (descending).
`catalog`	character(1) source of data. Use `catalogs()` for possible values.
`as`	character(1) return format. One of `"tibble"` (default), `"lol"`, `"list"`, or `"tibble_expanded"`, as described in the Details and Value sections of `?projects`.
`columns`	named character() indicating the paths to be used for parsing the 'lol' returned from the HCA to a tibble. The names of `columns` are used as column names in the returned tibble. If the elements of `columns` are unnamed, a name is derived from each path by removing `hits[]` and all `[]`, e.g., a path `hits[].donorOrganisms[].biologicalSex[*]` is given the name `donorOrganisms.biologicalSex`.
`facet`	character() of valid facet names. Summary results (see 'Value', below) are returned when `facet` is missing or has length greater than 1; details are returned when a single facet is specified.
`uuid`	character() unique identifier (e.g., `projectId`) of the object.
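
The naming rule described for unnamed `columns` can be sketched in base R. This is an illustration only, not the package's implementation: `derive_column_name()` is a hypothetical helper, and the regular expressions (which also drop `[*]` and the leading `.`, as the example path implies) are assumptions.

```r
# Hypothetical helper, not part of the hca package: derive a column name
# from a path by stripping 'hits[]' and the bracketed index markers.
derive_column_name <- function(path) {
    name <- sub("^hits\\[\\]", "", path)   # drop the leading 'hits[]'
    name <- gsub("\\[\\*?\\]", "", name)   # drop every '[]' and '[*]' (assumed)
    sub("^\\.", "", name)                  # drop the '.' left at the start
}

derive_column_name("hits[].donorOrganisms[].biologicalSex[*]")
```

Because `sub()` and `gsub()` are vectorized, the same helper can be applied to a whole character vector of paths at once.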

Details

The as argument determines the object returned by the function. Possible values are:

"tibble" (default) A tibble (data.frame) summarizing essential elements of projects, samples, bundles, or files.
"lol" A 'list-of-lists' representation of the JSON returned by the query, indexed and presented to enable convenient filtering, selection, and extraction. See ?lol.
"list" An R list (typically, highly recursive) containing detailed project information, constructed from the JSON response to the original query.
"tibble_expanded" A tibble (data.frame) containing (almost) all information for each project, sample, bundle, or file. The exception is user-contributed matrices present in projects() records; to access these, use the "lol" format and extract the specific paths of interest into a standard "tibble".

Value

When as = "tibble" or as = "tibble_expanded", a tibble with each row representing an HCA object (project, sample, bundle, or file, depending on the function invoked), and columns summarizing the object. "tibble_expanded" columns contain almost all information about the object, except as noted in the Details section.

When as = "lol", a list-of-lists data structure representing detailed information on each object.

When as = "list", projects() returns an R list, typically containing other lists or atomic vectors, representing detailed information on each project.
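
As a rough sketch of working with the "list" form, the snippet below pulls an identifier out of each record. It uses a toy stand-in rather than a real query result: the `hits` and `entryId` elements follow the access pattern in the Examples section, while the identifier values are placeholders and the rest of the real response structure is omitted.

```r
# Toy stand-in for the result of projects(as = "list"); the identifier
# values are placeholders, not real HCA project ids.
response <- list(
    hits = list(
        list(entryId = "placeholder-id-1"),
        list(entryId = "placeholder-id-2")
    )
)

# Extract one 'entryId' per hit as a character vector
entry_ids <- vapply(
    response[["hits"]],
    function(hit) hit[["entryId"]],
    character(1)
)
entry_ids
```

Using `vapply()` with `character(1)` makes the extraction fail loudly if a hit lacks a scalar `entryId`, which is usually preferable to silently returning a list.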

projects_facets() invoked with no facet= argument returns a tibble summarizing terms available as projects() return values, and for use in filters. The tibble contains columns

facet: the name of the facet.
n_terms: the number of distinct values the facet can take.
n_values: the number of occurrences of the facet term in the entire catalog.

projects_facets() invoked with a scalar value for facet= returns a tibble summarizing terms used in the facet, and the number of occurrences of the term in the entire catalog.

*_default_columns() returns a tibble with two columns: name, the column name used in the tibble returned by projects(), files(), samples(), or bundles(); and path, the path (see lol_hits()) to the corresponding data in the list-of-lists returned by the same functions when as = "lol". When as = "character", the return value is a named character vector with paths as elements and abbreviations (the column names) as names.

projects_detail() returns a list-of-lists containing relevant details about the project.
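
The two forms of the column description (a name/path table and a named vector of paths) carry the same information. A minimal base R sketch of that relationship follows; the data frame is a made-up stand-in for the tibble form, and its names and paths are illustrative assumptions rather than the package's actual defaults.

```r
# Made-up stand-in for the tibble form (one row per output column);
# the names and paths below are illustrative only.
columns_tbl <- data.frame(
    name = c("projectId", "projectTitle"),
    path = c("hits[].entryId", "hits[].projects[].projectTitle"),
    stringsAsFactors = FALSE
)

# Equivalent named-vector form: paths as elements, column names as names
columns_chr <- setNames(columns_tbl$path, columns_tbl$name)
columns_chr[["projectTitle"]]
```

A vector of this shape is what the `columns` argument of projects() expects, so a subset or renamed copy could be passed to select or relabel the columns of the returned tibble.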

Examples

projects(filters(), size = 100)

projects_facets()
projects_facets("genusSpecies")

projects_default_columns()

project <- projects(size = 1, as = "list")
project_uuid <- project[["hits"]][[1]][["entryId"]]
projects_detail(uuid = project_uuid)

Bioconductor/hca documentation built on June 11, 2025, 11:47 a.m.