refinebioc

This package provides the bridge between Bioconductor and the vast, homogeneously-processed transcriptomic data from refine.bio.

Cite the refine.bio project as:

Casey S. Greene, Dongbo Hu, Richard W. W. Jones, Stephanie Liu, David S. Mejia, Rob Patro, Stephen R. Piccolo, Ariel Rodriguez Romero, Hirak Sarkar, Candace L. Savonen, Jaclyn N. Taroni, William E. Vauclain, Deepashree Venkatesh Prasad, Kurt G. Wheeler. refine.bio: a resource of uniformly processed publicly available gene expression datasets. URL: https://www.refine.bio

Status

This project is undergoing active development and not meant for general use.

Installation

install.packages("BiocManager")
BiocManager::install("seandavi/RefineBio")

Usage

Available Experiments

RefineBio has a large number of experiments available. You can get a list of all of them with:

library(RefineBio)
experiments <- experiment_listing()
head(experiments)
dim(experiments)

Refine.bio uses multiples sources for data. A simple breakdown can be obtained with:

sort(
  table(gsub("[0-9]", "", experiments$accession_code)),
  decreasing = TRUE
)

Create and download a dataset

An "experiment" for refine.bio is a collection of samples that were processed together. A "dataset" is a collection of experiments that you want to analyze together. You can create a dataset with:

dset <- rb_dataset_request("GSE1133")

This will create a dataset object that you can use to download and analyze the data. The "request" is just that. It does not download or process the data. To get to actual data loaded into R, you need to do the following:

rb_dataset_ensure_started(dset)
rb_wait_for_dataset(dset)
rb_dataset_download(dset)
rb_dataset_extract(dset)
gselist <- rb_dataset_load(dset)

And the resulting list of SummarizedExperiment objects is:

gselist
$GSE1133
class: SummarizedExperiment 
dim: 11868 158 
metadata(13): accession_code description ... technology title
assays(1): exprs
rownames: NULL
rowData names(1): Gene
colnames(158): GSM18865 GSM18866 ... GSM19021 GSM19022
colData names(57): refinebio_accession_code experiment_accession ... title type

Developer checklist

Notes



seandavi/RefineBio documentation built on June 1, 2025, 4:10 p.m.