query_combos: Get overlap between query and predicted drug combination...
In alexvpickering/ccmap: Combination Connectivity Mapping


View source: R/query_combos.R

Drug combinations with the largest positive and negative cosine similarities are predicted to mimic and reverse the query signature, respectively. Values range from -1 to +1.
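As a rough illustration of the score described above, the following base R sketch computes a cosine similarity between a query and a drug signature. The helper `cosine_sim` and the gene names are hypothetical and are not part of ccmap; restricting to shared genes is an assumption about how such a comparison would typically be done, not a description of the package internals.

```r
# Hypothetical helper (not a ccmap function): cosine similarity between
# two named numeric vectors, computed over the genes they share.
cosine_sim <- function(query, drug) {
  common <- intersect(names(query), names(drug))
  q <- query[common]
  d <- drug[common]
  sum(q * d) / (sqrt(sum(q^2)) * sqrt(sum(d^2)))
}

# Toy signatures (made-up values)
query   <- c(GENE1 = 2.0, GENE2 = -1.0, GENE3 = 0.5)
mimic   <- 2 * query   # same direction as the query
reverse <- -query      # opposite direction

cosine_sim(query, mimic)    # approximately +1: predicted to mimic the query
cosine_sim(query, reverse)  # approximately -1: predicted to reverse the query
```

The score depends only on direction, not magnitude, which is why scaling the query by 2 still gives a similarity of about +1.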

query_combos(query_genes, drug_info = c("cmap", "l1000"),
  method = c("average", "ml"), include = NULL,
  ncores = parallel::detectCores())

`query_genes`	Named numeric vector of differential expression values for query genes. Usually the 'meta' slot of the `get_dprimes` result.
`drug_info`	Character vector specifying which dataset to query (either 'cmap' or 'l1000'). Can also provide a matrix of differential expression values for drugs or drug combinations (rows are genes, columns are drugs).
`method`	One of 'average' (default) or 'ml' (machine learning - see details and vignette).
`include`	Character vector of drug names for which combinations with all other drugs will be predicted and queried. If `NULL` (default), all two-drug combinations will be predicted and queried.
`ncores`	Integer, number of cores to use for method 'average'. Default is to use all cores.

To predict and query all 856086 two-drug cmap combinations, the 'average' method can take as little as 10 minutes (Intel Core i7-6700). The 'ml' (machine learning) method takes two hours on the same hardware and requires ~10GB of RAM, but is slightly more accurate. Both methods run faster when only a subset of drugs is specified with the `include` parameter. The MRO+MKL distribution of R can substantially speed up the 'ml' method.

The combinations of LINCS l1000 signatures (~26 billion) can also be queried using the 'average' method. To compare l1000 results with those obtained from cmap, the same set of genes should be queried in both (see example).
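The combination counts above can be checked with a little arithmetic, and the idea behind an 'average'-style prediction can be sketched on toy data. In the sketch below, the drug count of 1309 is inferred from the 856086 figure rather than stated in this page, and treating a combination signature as the row-wise mean of two single-drug signatures is an assumption suggested by the method name, not a confirmed description of the ccmap implementation. The matrix `drug_es` and the " + " naming are hypothetical.

```r
# Number of two-drug combinations for n drugs is choose(n, 2).
n_drugs   <- 1309                 # inferred: choose(1309, 2) matches the count above
n_all     <- choose(n_drugs, 2)   # 856086 combinations when include = NULL
n_include <- n_drugs - 1          # 1308 combinations when include names one drug

# Toy drug matrix: rows are genes, columns are drugs (made-up values).
set.seed(1)
drug_es <- matrix(rnorm(12), nrow = 4,
                  dimnames = list(paste0("GENE", 1:4),
                                  c("drugA", "drugB", "drugC")))

# All unordered drug pairs, one pair per column.
pairs <- combn(colnames(drug_es), 2)

# Assumed 'average' prediction: mean of the two single-drug signatures.
combo_es <- apply(pairs, 2, function(p) rowMeans(drug_es[, p]))
colnames(combo_es) <- apply(pairs, 2, paste, collapse = " + ")

dim(combo_es)  # 4 genes by 3 combinations
```

Restricting to a single drug with `include` reduces the work from 856086 combinations to 1308, which is why a subset runs so much faster.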

Vector of cosine similarities between query and drug combination signatures.
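A short sketch of how such a vector might be inspected follows. The vector `res` is a made-up stand-in for a `query_combos` result; that the real result is named by combination, and the " + " naming format, are assumptions for illustration only.

```r
# Hypothetical stand-in for a query_combos() result (names and values made up).
res <- c("drugA + drugB" = 0.42,
         "drugA + drugC" = -0.37,
         "drugB + drugC" = 0.05)

# Order from most similar to most opposite.
res <- sort(res, decreasing = TRUE)

head(res, 1)  # combination predicted to best mimic the query
tail(res, 1)  # combination predicted to best reverse the query
```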

library(lydata)
library(crossmeta)

# location of data
data_dir <- system.file("extdata", package = "lydata")

# gather GSE names
gse_names  <- c("GSE9601", "GSE15069", "GSE50841", "GSE34817", "GSE29689")

# load previous analysis
anals <- load_diff(gse_names, data_dir)

# perform meta-analysis
es <- es_meta(anals)

# get dprimes
dprimes <- get_dprimes(es)

# query combinations of metformin and all other cmap drugs
top_met_combos <- query_combos(dprimes$all$meta, include = 'metformin', ncores = 1)

# previous query but with machine learning method
# top_met_combos <- query_combos(dprimes$all$meta, method = 'ml', include = 'metformin')

# query all cmap drug combinations
# top_combos <- query_combos(dprimes$all$meta)

# query all cmap drug combinations with machine learning method
# top_combos <- query_combos(dprimes$all$meta, method = 'ml')

# query l1000 and cmap using same genes
# library(ccdata)
# data(cmap_es)
# data(l1000_es)
# cmap_es <- cmap_es[row.names(l1000_es), ]

# met_cmap  <- query_combos(dprimes$all$meta, cmap_es,  include = 'metformin')
# met_l1000 <- query_combos(dprimes$all$meta, l1000_es, include = 'metformin')