cTRAP: Identification of candidate causal perturbations from differential gene expression data

Time and memory profiling of cTRAP

Nuno Agostinho, 27 November 2020

cTRAP is a multi-threaded R package composed of three modules. These scripts profile its most critical module, which ranks user-provided differential expression results against differential expression results from CMap perturbations.

They also benchmark the prediction of targeting drugs (using the NCI60 gene expression and drug sensitivity association, the most time-consuming option) and drug set enrichment analysis.

cTRAP performance milestones (dev versions)

- 1.8.0: release version for reference
- pre-1.10.0 (b13ee45): faster GSEA-based score calculation
- pre-1.10.0 (c34566c): avoid redundant loading of chunks from CMap perturbation data
- pre-1.10.0 (3e3720d): multi-thread support on systems that support forking (e.g. Linux and macOS, but not Windows); print times for measurable actions (for direct comparison with the memory profile)
- pre-1.10.0 (9b96229): fix issues with missing values when preparing drug descriptor sets
- pre-1.10.0 (9852a1a): improve drug set enrichment analysis (fix bugs and allow matching compounds as done by plotTargetingDrugsVSsimilarPerturbations())
- pre-1.10.0 (296f9b2): minimise RAM usage when predicting targeting drugs with the NCI60 gene expression and drug sensitivity correlation matrix

General instructions

- Run runRankCMapPerturbations.sh to profile time using Sys.time() (no debugger attached)
- Run runRankCMapPerturbations_heaptrack.sh to profile memory with the heaptrack memory profiler (also timed with Sys.time())
- Convert the heaptrack output to massif format (so it can be plotted in R) via convertHeaptrackToMassif.sh
- Plot the heap memory profile with R/memoryConsumptionPlot.R

Ranking CMap perturbations

Input

- User-provided data: named numeric vector of differential expression t-statistics (each name is a gene symbol)
- CMap perturbations: publicly available differential expression z-scores; ~21GB file, downloaded automatically

CMap perturbation data loading

CMap perturbation data is first filtered according to the available variables (cell lines, timepoints, drug dosage, perturbation types). Only the data matching the user criteria is loaded into memory.

CMap perturbation types tested:

- knockdown: consensus signature from shRNAs targeting the same gene
- overexpression: cDNA for overexpression of wild-type gene
- compound

Because the CMap perturbation data is too large for the RAM typically available, there are two options for loading it:

- On-demand (default): load ~1GB chunks of filtered z-scores while comparing data
- Pre-load: load all filtered z-scores into memory before comparing data
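The trade-off between the two loading options can be illustrated with a minimal sketch. It is written in Python purely for illustration (cTRAP itself is an R package), and every name in it (`ZSCORES_ON_DISK`, `iter_chunks`, `score_on_demand`, `score_preloaded`, the toy dot-product score) is hypothetical and not part of cTRAP. It assumes only what is stated above: on-demand loading holds one chunk in memory at a time, whereas pre-loading holds all filtered z-scores at once.

```python
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

# Toy stand-in for the filtered CMap z-scores stored on disk (hypothetical data):
# perturbation ID -> z-scores for the same ordered list of genes.
ZSCORES_ON_DISK: Dict[str, List[float]] = {
    "pert_A": [1.2, -0.4, 0.3, 2.0],
    "pert_B": [-1.0, 0.5, -0.2, -1.8],
    "pert_C": [0.1, 0.0, 0.4, 0.2],
    "pert_D": [2.1, -0.9, 0.8, 1.5],
    "pert_E": [-0.3, 0.7, -1.1, 0.0],
}


def iter_chunks(ids: Sequence[str], chunk_size: int) -> Iterator[Dict[str, List[float]]]:
    """Yield the z-scores of at most `chunk_size` perturbations at a time."""
    for start in range(0, len(ids), chunk_size):
        yield {pid: ZSCORES_ON_DISK[pid] for pid in ids[start:start + chunk_size]}


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Toy comparison score (a placeholder, not one of cTRAP's similarity scores)."""
    return sum(x * y for x, y in zip(a, b))


def score_on_demand(user: Sequence[float], ids: Sequence[str], chunk_size: int,
                    score: Callable = dot) -> Tuple[Dict[str, float], int]:
    """Compare chunk by chunk; only the scores outlive each chunk."""
    scores: Dict[str, float] = {}
    peak_loaded = 0  # largest number of perturbations held in memory at once
    for chunk in iter_chunks(ids, chunk_size):
        peak_loaded = max(peak_loaded, len(chunk))
        for pid, zscores in chunk.items():
            scores[pid] = score(user, zscores)
    return scores, peak_loaded


def score_preloaded(user: Sequence[float], ids: Sequence[str],
                    score: Callable = dot) -> Tuple[Dict[str, float], int]:
    """Load every filtered perturbation first, then compare."""
    loaded = {pid: ZSCORES_ON_DISK[pid] for pid in ids}
    return {pid: score(user, z) for pid, z in loaded.items()}, len(loaded)


user_stats = [2.0, -1.0, 0.5, 1.5]
all_ids = list(ZSCORES_ON_DISK)
on_demand_scores, on_demand_peak = score_on_demand(user_stats, all_ids, chunk_size=2)
preloaded_scores, preloaded_peak = score_preloaded(user_stats, all_ids)
```

Both strategies produce the same scores. What differs is how much data is resident at any moment, which is the quantity the memory profiles in this benchmark compare.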

Similarity ranking

CMap data is ranked against user-provided differential expression results. The less similar the data, the higher the final rank value, so the most similar perturbations receive the lowest rank values. Similarity is measured using:

- Spearman's correlation coefficient
- Pearson's correlation coefficient
- GSEA-based score (weighted connectivity score as described in the original CMap article)

The values of each score are ranked separately. These per-score ranks are then summarised as a rank product, and the rank of that rank product is the final rank.
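The ranking scheme described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not cTRAP's R implementation: the GSEA-based score is omitted for brevity, the rank product is assumed to be the geometric mean of the per-score ranks (which orders perturbations the same way as the plain product), and all function names and data are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def average_ranks(values: Sequence[float], descending: bool = False) -> List[float]:
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's correlation: Pearson's correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def final_ranks(user: Sequence[float],
                perturbations: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Rank each score, combine the ranks into a rank product, rank the product."""
    ids = list(perturbations)
    per_score_ranks = []
    for score in (spearman, pearson):  # GSEA-based score omitted in this sketch
        values = [score(user, perturbations[pid]) for pid in ids]
        # Highest similarity gets rank 1, so less similar data gets higher ranks
        per_score_ranks.append(average_ranks(values, descending=True))
    # Assumption: rank product taken as the geometric mean of the per-score ranks
    rank_product = [math.prod(r) ** (1 / len(per_score_ranks))
                    for r in zip(*per_score_ranks)]
    return dict(zip(ids, average_ranks(rank_product)))


# Hypothetical t-statistics and z-scores over the same five genes
user_stats = [2.0, -1.0, 0.5, 1.5, -0.5]
toy_perturbations = {
    "similar": [1.8, -0.9, 0.4, 1.6, -0.6],
    "unrelated": [0.1, 0.3, -0.2, 0.0, 0.2],
    "opposite": [-2.0, 1.0, -0.5, -1.5, 0.5],
}
ranks = final_ranks(user_stats, toy_perturbations)
```

With this toy data, the perturbation that mirrors the user-provided statistics receives the lowest final rank and the inverted one receives the highest, matching the convention that less similar data ends up with a higher final rank value.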

nuno-agostinho/cTRAP documentation built on Jan. 2, 2025, 12:11 a.m.
