loadCMap: Load Connectivity Map project data (1st and 2nd phase, Level...

View source: R/loadCMap.R

Description

Download Connectivity Map project data to a specified directory. ATTENTION: this is a 5-20 GB download. Details: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE70138, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE92742. A description of the latest file and a table listing the contents of the 'Broad_LINCS_auxiliary_datasets.tar.gz' file are kept up to date in the following document: https://docs.google.com/document/d/1q2gciWRhVCAAnlvF2iRLuJ7whrGP6QjpsCMq1yWz7dU/edit#heading=h.l6bq0r1aih50

unzipCMapData() unzips the downloaded Connectivity Map data.

writeLandMarkOnlyFile() generates and writes a .gctx file containing only the landmark genes from the corresponding Connectivity Map release. This requires about 32 GB of RAM.

Usage

loadCMap(directory = getwd(), level = 5, phase = 1,
  landmark_only = TRUE)

unzipCMapData(CMap_files)

writeLandMarkOnlyFile(CMap_files = loadCMap(directory = "./data/cmap/",
  level = 4, phase = 1))
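The calls above might be chained as in the following minimal sketch. It assumes the package is installed under the name regNETcmap (taken from the page footer) and that unzipping precedes writing the landmark-only file; that ordering is an assumption, not something this page states. The names run_download and cmap_dir are hypothetical, introduced only to keep the 5-20 GB download from starting by accident.

```r
# Hypothetical guard: set to TRUE only when a 5-20 GB download is intended.
run_download <- FALSE

# Hypothetical target directory; any writable path with enough free space works.
cmap_dir <- file.path(tempdir(), "cmap")

if (run_download) {
  library(regNETcmap)  # package name assumed from the page footer

  dir.create(cmap_dir, recursive = TRUE, showWarnings = FALSE)

  # Download Level 4 data from phase 1 (arguments as listed under Usage).
  CMap_files <- loadCMap(directory = cmap_dir, level = 4, phase = 1)

  # Unzip the downloaded archives (assumed to be needed before the next step).
  unzipCMapData(CMap_files)

  # Write a .gctx file restricted to landmark genes (needs about 32 GB of RAM).
  writeLandMarkOnlyFile(CMap_files = CMap_files)
}
```

With run_download left at FALSE, the block does nothing beyond defining the two variables, so it is safe to source as-is.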

Arguments

directory

directory in which to save Connectivity Map project data

CMap_files

a list of directories and URLs produced by loadCMap()

Details

LINCS aims to enable a functional understanding of biology by cataloging changes in gene expression and other cellular processes that occur when cells are exposed to a variety of perturbing agents. The Broad Institute LINCS Center for Transcriptomics contributes to this collaborative effort by applying the Connectivity Map concept. In brief, the study design involves generating a compendium of transcriptional expression data from cultured human cells treated with small-molecule and genetic loss/gain-of-function perturbagens. These measurements are made using the L1000 high-throughput gene-expression assay, which enables data generation at an unprecedented scale. The data are processed through a computational system that converts raw fluorescence intensities into differential gene expression signatures. The data at each stage of the pre-processing are available:

Level 1 (LXB) - raw, unprocessed flow cytometry data from Luminex scanners. One LXB file is generated for each well of a 384-well plate, and each file contains a fluorescence intensity value for every observed analyte in the well.

Level 2 (GEX) - gene expression values for about 1,000 landmark genes, obtained after deconvolution from Luminex beads.

Level 3 (Q2NORM) - gene expression profiles of both directly measured landmark transcripts and inferred genes, normalized using invariant set scaling followed by quantile normalization.

Level 4 (Z-SCORES) - signatures with differentially expressed genes computed by robust z-scores for each profile relative to control (PC relative to plate population as control; VC relative to vehicle control).

Level 5 (SIG) - replicates, usually 3 per treatment, aggregated into a single differential expression vector derived from the weighted averages of the individual replicates.
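The step from normalized profiles to Level 4 and Level 5 data can be illustrated on toy numbers. This is a minimal sketch, not the Connectivity Map pipeline itself: the matrix expr is simulated, the robust z-score is taken as (x - median) / MAD, and the replicate weights are assumed to come from Spearman correlations between replicates with a small positive floor. The page only says "weighted averages", so that weighting scheme is an assumption made for illustration.

```r
set.seed(1)

# Simulated Level 3-like matrix: rows are genes, columns are wells on one plate.
expr <- matrix(rnorm(5 * 8, mean = 8, sd = 1), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("well", 1:8)))

# Level 4-like step: robust z-score of each gene relative to the plate population.
# stats::mad() already applies the 1.4826 consistency constant by default.
robust_z <- function(x) (x - median(x)) / mad(x)
z <- t(apply(expr, 1, robust_z))  # apply() returns wells x genes, so transpose back

# Level 5-like step: aggregate three replicate wells into one signature.
replicates <- z[, c("well1", "well2", "well3")]

# Assumed weighting: each replicate is weighted by its summed Spearman
# correlation with the other replicates, floored at 0.01 and scaled to sum to 1.
rho <- cor(replicates, method = "spearman")
diag(rho) <- 0
w <- pmax(rowSums(rho), 0.01)
w <- w / sum(w)

# Weighted average across replicates gives one value per gene.
signature <- as.vector(replicates %*% w)
names(signature) <- rownames(replicates)
```

Replicates that agree with the others receive larger weights, so a single discordant well has less influence on the aggregated signature than it would in a plain mean.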

Value

list of paths to Connectivity Map project files

TRUE

Author(s)

Vitalii Kleshchevnikov


vitkl/regNETcmap documentation built on Feb. 18, 2020, 3:43 a.m.