gess_lincs: LINCS Search Method


Description

Implements the Gene Expression Signature Search (GESS) method from Subramanian et al. (2017), here referred to as LINCS. The query consists of two label sets, the most up- and down-regulated genes from a genome-wide expression experiment, while the reference database is composed of differential gene expression values (e.g. LFC or z-scores). Note that the related CMAP method uses ranks in the reference database instead.

Usage

gess_lincs(
  qSig,
  tau = FALSE,
  sortby = "NCS",
  chunk_size = 5000,
  ref_trts = NULL,
  workers = 1
)

Arguments

qSig

qSig object defining the query signature including the GESS method (should be 'LINCS') and the path to the reference database. For details see help of qSig and qSig-class.

tau

TRUE or FALSE, whether to compute the Tau score. Note that TRUE is only meaningful when the full LINCS database is searched, since an accurate Tau score depends on searching the exact same database from which its background values were derived.
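
The dependence on background values can be illustrated with a minimal sketch. It assumes the Tau definition described by Subramanian et al. (2017): the signed percentage of background scores whose absolute NCS is smaller than that of the query. The function name tau_score and the toy background vector are hypothetical and are not part of signatureSearch.

```r
# Hypothetical helper, not the package implementation.
# ncs:            NCS of the query against one reference signature
# background_ncs: NCS values of background queries against the same signature
tau_score <- function(ncs, background_ncs) {
  # Fraction of background scores with a smaller absolute NCS,
  # scaled to a percentage and given the sign of the query NCS
  sign(ncs) * 100 * mean(abs(background_ncs) < abs(ncs))
}

# Toy background; a real background comes from the full LINCS database
background <- c(-2, -1, 0.5, 1, 3)
tau_pos <- tau_score(1.5, background)
tau_neg <- tau_score(-1.5, background)
```

Because the result is a percentile within background_ncs, a background computed on a different or subsetted database would shift the score, which is why tau = TRUE is only meaningful for the full database.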

sortby

Sort the GESS result table by one of the following statistics: 'WTCS', 'NCS', 'Tau', 'NCSct' or 'NA'.

chunk_size

Number of database entries to process per iteration, which limits the memory usage of the search.
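
As a rough illustration of chunked processing, the sketch below splits the column indices of a reference database into blocks of at most chunk_size entries. The helper chunk_indices and the database size of 12000 are assumptions made for this example; the package's internal chunking may differ.

```r
# Hypothetical helper: group entry indices into consecutive chunks
chunk_indices <- function(n_entries, chunk_size = 5000) {
  idx <- seq_len(n_entries)
  # ceiling(idx / chunk_size) assigns each index its chunk number
  split(idx, ceiling(idx / chunk_size))
}

# A database with 12000 entries is processed in three iterations
chunks <- chunk_indices(12000, chunk_size = 5000)
chunk_sizes <- unname(lengths(chunks))
```

Only one chunk needs to be held in memory at a time, so a smaller chunk_size lowers peak memory at the cost of more iterations.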

ref_trts

character vector. To search against a subset of the reference database, set ref_trts to a character vector of the column names (treatments) of the reference database that should be included in the search.
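
A minimal sketch of building such a subset is given below. The treatment names are made up for illustration and assume a 'perturbagen__cell__type' naming pattern; the actual names should be taken from the column names of the reference database in use.

```r
# Hypothetical treatment names (assumed pattern: perturbagen__cell__type)
all_trts <- c("vorinostat__SKB__trt_cp",
              "trichostatin-a__SKB__trt_cp",
              "sirolimus__MCF7__trt_cp",
              "vorinostat__MCF7__trt_cp")

# Keep only treatments measured in the SKB cell type
skb_trts <- all_trts[grepl("__SKB__", all_trts, fixed = TRUE)]

# The subset would then be passed to the search, e.g.:
# gess_lincs(qsig_lincs, ref_trts = skb_trts)
```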

workers

integer(1), number of workers used to search the reference database in parallel. The default is 1.

Details

Subramanian et al. (2017) introduced a more complex GESS algorithm, here referred to as LINCS. While related to CMAP, there are several important differences between the two approaches. First, LINCS weights the query genes based on the corresponding differential expression scores of the gene expression signatures (GESs) in the reference database (e.g. LFC or z-scores). Thus, the reference database used by LINCS needs to store the actual score values rather than their ranks. Another relevant difference is that the LINCS algorithm uses a bi-directional weighted Kolmogorov-Smirnov enrichment statistic (ES) as similarity metric.
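
The bi-directional weighted statistic can be sketched as follows. This is a simplified illustration, not the package implementation: it assumes a GSEA-style running sum with weight exponent p = 1, and it combines the up and down enrichment scores into a weighted connectivity score (WTCS) as described by Subramanian et al. (2017). The function names weighted_es and wtcs and the toy signature are hypothetical.

```r
# Weighted Kolmogorov-Smirnov enrichment score of a gene set against one
# reference signature (named numeric vector, e.g. z-scores per gene).
weighted_es <- function(ref_scores, gene_set, p = 1) {
  ranked <- sort(ref_scores, decreasing = TRUE)
  in_set <- names(ranked) %in% gene_set
  n_miss <- sum(!in_set)
  hit_w <- ifelse(in_set, abs(ranked)^p, 0)
  if (sum(hit_w) == 0 || n_miss == 0) return(0)
  # Step up by the normalized score weight at hits, down uniformly at misses
  running <- cumsum(hit_w / sum(hit_w) - (!in_set) / n_miss)
  # The ES is the running-sum value with the largest absolute deviation from 0
  unname(running[which.max(abs(running))])
}

# Bi-directional score: zero unless the up and down sets are enriched
# at opposite ends of the reference signature
wtcs <- function(ref_scores, upset, downset) {
  es_up <- weighted_es(ref_scores, upset)
  es_down <- weighted_es(ref_scores, downset)
  if (sign(es_up) == sign(es_down)) 0 else (es_up - es_down) / 2
}

# Toy reference signature with six genes
ref <- c(g1 = 3, g2 = 2, g3 = 1, g4 = -1, g5 = -2, g6 = -3)
same_direction <- wtcs(ref, upset = c("g1", "g2"), downset = c("g5", "g6"))
reversed <- wtcs(ref, upset = c("g5", "g6"), downset = c("g1", "g2"))
```

In this toy case the first query matches the direction of the reference signature and yields a positive score, while the reversed query yields a negative one. The weighting uses abs(ranked), which is why the reference database has to store score values rather than ranks.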

Value

gessResult object. Its result table contains the search results for each perturbagen in the reference database, ranked by signature similarity to the query.

Column description

Descriptions of the columns specific to the LINCS method are given below. The additional columns that are common to all GESS methods are described in the help file of the gessResult object.

References

For a detailed description of the LINCS method and scores, please refer to: Subramanian, A., Narayan, R., Corsello, S. M., Peck, D. D., Natoli, T. E., Lu, X., ... Golub, T. R. (2017). A Next Generation Connectivity Map: L1000 Platform and the First 1,000,000 Profiles. Cell, 171(6), 1437-1452.e17. URL: https://doi.org/10.1016/j.cell.2017.10.049

See Also

qSig, gessResult, gess

Examples

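library(signatureSearch)

# Path to the small sample reference database shipped with the package
db_path <- system.file("extdata", "sample_db.h5", 
                       package = "signatureSearch")
# Query signature: Entrez IDs of the most up- and down-regulated genes
qsig_lincs <- qSig(query = list(
                   upset=c("230", "5357", "2015", "2542", "1759"), 
                   downset=c("22864", "9338", "54793", "10384", "27000")), 
                   gess_method = "LINCS", refdb = db_path)
# Run the LINCS search and inspect the result table
lincs <- gess_lincs(qsig_lincs, sortby="NCS", tau=FALSE)
result(lincs)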

signatureSearch documentation built on April 16, 2021, 6 p.m.