cmap_expr: CMap Intensity Signature Database

cmap_exprR Documentation

CMap Intensity Signature Database

Description

To search CMap2 with signatureSearch's correlation based GESS methods (gess_cor), normalized gene expression values (here intensities) are required where the biological replicate information has been collapsed to mean values. For this, the cmap_expr database has been created from CEL files, which are the raw data of the Affymetrix technology. To obtain normalized expression data, the CEL files were downloaded from the CMap project site, and then processed with the MAS5 algorithm. Gene level expression data was generated the same way as described in help file of cmap_rank. Next, the gene expression values for different concentrations and treatment times of each compound and cell were averaged. Subsequently, the expression matrix was saved to an HDF5 file, referred to as the cmap_expr database. It represents mean expression values of 12,403 genes for 1,309 compound treatments in up to 5 cells (3,587 signatures in total).

Details

The cmap_expr database can be downloaded as HDF5 file from Bioconductor’s ExperimentHub as shown in the Example section.

The loaded cmap_expr data object is generated from the raw CEL files downloaded from the CMap project site. For documentation and code of generating the cmap_expr databases from sources, please refer to the vignette of this package by running browseVignettes("signatureSearchData") in R.

References

CMap project site: https://portals.broadinstitute.org/cmap

Examples

library(ExperimentHub)
# eh <- ExperimentHub()
# query(eh, c("signatureSearchData", "cmap_expr"))
# cmap_expr_path <- eh[["EH3224"]]
# rhdf5::h5ls(cmap_expr_path)

yduan004/signatureSearchData documentation built on April 7, 2023, 4:12 a.m.