#' Obtain human bulk RNA-seq data from Blueprint and ENCODE
#'
#' Download and cache the normalized expression values of 259 RNA-seq samples of
#' pure stroma and immune cells as generated and supplied by Blueprint and ENCODE.
#'
#' @inheritParams HumanPrimaryCellAtlasData
#' @param rm.NA String specifying how missing values should be handled.
#' \code{"rows"} will remove genes with at least one missing value,
#' \code{"cols"} will remove samples with at least one missing value,
#' \code{"both"} will remove any gene or sample with at least one missing value,
#' and \code{"none"} will not perform any removal.
#' Defaults to \code{"rows"}.
#'
#' @details
#' This function provides normalized expression values for 259 bulk RNA-seq samples
#' generated by Blueprint and ENCODE from pure populations of stroma and immune
#' cells (Martens and Stunnenberg, 2013; The ENCODE Consortium, 2012).
#' The samples were processed and normalized as described in Aran, Looney and
#' Liu et al. (2019), i.e., the raw RNA-seq counts were downloaded from
#' Blueprint and ENCODE in 2016 and normalized via edgeR (TPMs).
#'
#' Blueprint Epigenomics contains 144 RNA-seq pure immune samples annotated to 28 cell types.
#' ENCODE contains 115 RNA-seq pure stroma and immune samples annotated to 17 cell types.
#' Altogether, this reference contains 259 samples with 43 cell types (\code{"label.fine"}),
#' manually aggregated into 24 broad classes (\code{"label.main"}).
#' The fine labels have also been mapped to the Cell Ontology (\code{"label.ont"},
#' if \code{cell.ont} is not \code{"none"}), which can be used for further programmatic
#' queries.
#'
#' @return A \linkS4class{SummarizedExperiment} object with a \code{"logcounts"} assay
#' containing the log-normalized expression values, along with cell type labels in the
#' \code{\link{colData}}.
#'
#' @author Friederike Dündar
#'
#' @references
#' The ENCODE Project Consortium (2012).
#' An integrated encyclopedia of DNA elements in the human genome.
#' \emph{Nature} 489, 57–74.
#'
#' Martens JHA and Stunnenberg HG (2013).
#' BLUEPRINT: mapping human blood cell epigenomes.
#' \emph{Haematologica} 98, 1487–1489.
#'
#' Aran D, Looney AP, Liu L et al. (2019).
#' Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage.
#' \emph{Nat. Immunol.} 20, 163–172.
#'
#' @examples
#' ref.se <- BlueprintEncodeData(rm.NA = "rows")
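#'
#' # A minimal sketch of inspecting the returned object. It assumes that the
#' # SummarizedExperiment package is installed and that the download succeeded.
#' # Tabulate the broad cell type classes stored in the column metadata.
#' table(SummarizedExperiment::colData(ref.se)$label.main)
#'
#' # Extract the log-normalized expression matrix (genes in rows, samples in columns).
#' logexp <- SummarizedExperiment::assay(ref.se, "logcounts")
#' dim(logexp)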
#'
#' @export
BlueprintEncodeData <- function(rm.NA = c("rows","cols","both","none"), ensembl=FALSE, cell.ont=c("all", "nonna", "none"), legacy=FALSE) {
    rm.NA <- match.arg(rm.NA)
    cell.ont <- match.arg(cell.ont)

    if (!legacy && rm.NA == "rows" && cell.ont == "all") {
        se <- fetchReference("blueprint_encode", "2024-02-26", realize.assays=TRUE)
    } else {
        se <- .create_se("blueprint_encode",
            version = list(logcounts="1.0.0", coldata="1.2.0"),
            assays="logcounts", rm.NA = rm.NA,
            has.rowdata = FALSE, has.coldata = TRUE)
        se <- .add_ontology(se, "blueprint_encode", cell.ont)
    }

    .convert_to_ensembl(se, "Hs", ensembl)
}