knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)

Overview

The archs4 package provides utility functions to query and explore the expression profiling data made available through the [ARCHS4 project][archs4web], which is described in the following publication:

[Massive mining of publicly available RNA-seq data from human and mouse][archs4pub].

Because this package requires the user to download a number of data files that are external to the package, the installation instructions are a bit more involved than other R packages, and we leave them for the end of this document.

Usage

After successful installation of this package, you can query the series and samples included in the ARCHS4 repository, as well as materialize the expression data into well-known Bioconductor assay containers for downstream analysis.

To query GEO series and samples, you can use the sample_info function:

library(archs4)

a4 <- Archs4Repository()
ids <- c("GSE89189", "GSE29943", "GSM1095128", "GSM1095129", "GSM1095130")
sample.info <- sample_info(a4, ids)
head(sample.info)
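
The identifier vector above mixes series (GSE) and sample (GSM) accessions. As a minimal sketch in base R, here is one way to sort such a vector by accession type and catch malformed identifiers before querying. Note that `classify_geo_ids` is a hypothetical helper written for this illustration and is not part of archs4.

```r
# Hypothetical helper (NOT part of archs4): label GEO accessions by prefix.
# "GSE" + digits is treated as a series, "GSM" + digits as a sample,
# and anything else is reported as "unknown".
classify_geo_ids <- function(ids) {
  type <- ifelse(grepl("^GSE[0-9]+$", ids), "series",
          ifelse(grepl("^GSM[0-9]+$", ids), "sample", "unknown"))
  # Fixed factor levels so that every group is present, even when empty.
  split(ids, factor(type, levels = c("series", "sample", "unknown")))
}

ids <- c("GSE89189", "GSE29943", "GSM1095128", "GSM1095129", "GSM1095130")
id.groups <- classify_geo_ids(ids)

id.groups$series   # series accessions
id.groups$sample   # sample accessions
id.groups$unknown  # anything that does not look like a GEO accession
```

A non-empty `id.groups$unknown` is a cheap signal that an identifier was mistyped before any repository lookup happens.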

You can use the as.DGEList function to materialize an edgeR::DGEList from an arbitrary number of GEO sample and series identifiers. The only restriction is that the data from the series/samples must all be from the same species.
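
Because of the single-species restriction, it can be useful to check a sample table before materializing it. The following is only a sketch: `check_single_species` is a hypothetical helper, the `organism` column name is an assumption (inspect `names(sample.info)` for the actual column), and the table below is mock data standing in for real `sample_info()` output.

```r
# Hypothetical helper (NOT part of archs4): fail early if the samples in
# `info` span more than one species. `organism_col` is an ASSUMED column
# name; adjust it to whatever the sample table actually provides.
check_single_species <- function(info, organism_col = "organism") {
  stopifnot(organism_col %in% names(info))
  species <- unique(as.character(info[[organism_col]]))
  if (length(species) != 1L) {
    stop("Samples span multiple species: ", paste(species, collapse = ", "))
  }
  invisible(species)
}

# Mock table for illustration only; identifiers and values are made up.
mock.info <- data.frame(
  sample_id = c("GSM0000001", "GSM0000002"),
  organism = c("human", "human"),
  stringsAsFactors = FALSE
)

species <- check_single_species(mock.info)
```

Running the check on a table that mixes species raises an error, so the mismatch surfaces before any expression data are read.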

The most common use case will likely be to create a DGEList for a given study. For instance, the GEO series identifier ["GSE89189"][blurtongeo] refers to the expression data generated to support the [Abud et al. iPSC-Derived Human Microglia-like Cells ...][blurtonpub] paper.

Creating a DGEList from this study yields an object with 27,024 genes across 37 samples in about 1.5 seconds:

yg <- as.DGEList(a4, "GSE89189", feature_type = "gene")

The following command retrieves the 178,135 transcript-level counts for this experiment, also in about 1.5 seconds:

yt <- as.DGEList(a4, "GSE89189", feature_type = "transcript")

Installation





denalitherapeutics/archs4 documentation built on May 17, 2019, 1:29 p.m.