gess_cor: Correlation-based Search Method

Description

Correlation-based similarity metrics, such as Spearman or Pearson coefficients, can be used as Gene Expression Signature Search (GESS) methods. As non-set-based methods, they require quantitative gene expression values for both the query and the database entries, such as normalized intensities or read counts from microarrays or RNA-Seq experiments, respectively.

Usage

gess_cor(
  qSig,
  method = "spearman",
  chunk_size = 5000,
  ref_trts = NULL,
  workers = 1
)

Arguments

qSig

qSig object defining the query signature including the GESS method (should be 'Cor') and the path to the reference database. For details see help of qSig and qSig-class.

method

One of 'spearman' (default), 'kendall', or 'pearson', indicating which correlation coefficient to use.

chunk_size

Number of database entries to process per iteration. Processing the database in chunks limits the memory usage of the search.

ref_trts

Character vector. To search against a subset of the reference database, set ref_trts to the column names (treatments) of the reference database that should be included in the search.

workers

integer(1). Number of workers used to search the reference database in parallel. The default is 1.
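
The interplay of chunk_size and ref_trts can be illustrated with a minimal base R sketch. It is not the package's internal implementation: the toy matrix, the treatment names, and the variable names are all made up for illustration, and the reference data is held in memory rather than read from an HDF5 file. The sketch only shows that scoring the selected columns chunk by chunk gives the same result as scoring them all at once, while holding fewer columns in memory per iteration.

```r
# Toy in-memory reference database (hypothetical data): 50 genes x 12 treatments
set.seed(1)
ref_db <- matrix(rnorm(50 * 12), nrow = 50,
                 dimnames = list(paste0("gene", 1:50), paste0("trt", 1:12)))
query <- rnorm(50)                    # toy query signature over the same genes

ref_trts <- colnames(ref_db)[1:10]    # assumed subset of treatments to search
chunk_size <- 5                       # columns scored per iteration

# Split the selected treatment names into chunks of at most chunk_size
chunks <- split(ref_trts, ceiling(seq_along(ref_trts) / chunk_size))

# Score one chunk at a time and concatenate the named results
chunked <- unlist(lapply(unname(chunks), function(trts) {
  sub_db <- ref_db[, trts, drop = FALSE]
  setNames(as.vector(cor(query, sub_db, method = "spearman")), trts)
}))

# Reference result: all selected columns scored in a single call
full <- setNames(
  as.vector(cor(query, ref_db[, ref_trts, drop = FALSE], method = "spearman")),
  ref_trts
)
```

Here chunked and full hold the same scores. A smaller chunk_size lowers peak memory at the cost of more iterations.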

Details

For correlation searches to work, both the query and the reference database must use the same type of gene identifiers. The expected data structure of the query is a matrix with a single numeric column and the gene labels (e.g. Entrez Gene IDs) in the row name slot.

Correlation-based searches can be performed with either the full set of genes represented in the database or a subset of them. A subset is useful for restricting the correlation computation to genes of interest, such as a DEG set or the genes in a pathway. When comparing the performance of different GESS methods, it can also be advantageous to restrict a correlation-based search to the same gene set used in a set-based search, such as the up/down DEGs used in a LINCS GESS. The results of correlation- and set-based methods are then more comparable, because both are provided with equivalent information content.
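
The following base R sketch illustrates the idea described above under simplifying assumptions. It does not call signatureSearch and is not the package's implementation. The gene IDs, treatment names, and the genes_of_interest vector are hypothetical. It builds a single-column query matrix with gene IDs as row names, restricts the comparison to a gene subset shared by the query and the reference, and ranks the reference treatments by Spearman correlation.

```r
# Toy reference database (hypothetical): 6 genes (rows) x 3 treatments (columns)
ref_db <- matrix(
  c(1, 2, 3, 4, 5, 6,    # trtA
    6, 5, 4, 3, 2, 1,    # trtB
    1, 3, 2, 5, 4, 6),   # trtC
  nrow = 6,
  dimnames = list(paste0("gene", 1:6), c("trtA", "trtB", "trtC"))
)

# Query: one numeric column, gene IDs in the row names
query <- matrix(c(1.5, 2.5, 3.5, 4.5, 5.5, 6.5), ncol = 1,
                dimnames = list(paste0("gene", 1:6), "query"))

# Optional gene subset, e.g. a DEG set (made up here)
genes_of_interest <- c("gene1", "gene2", "gene3", "gene4")

# Keep only genes present in the query, the reference, and the subset
common <- Reduce(intersect,
                 list(rownames(query), rownames(ref_db), genes_of_interest))

# Correlate the query with every reference column on the shared genes
scores <- setNames(
  as.vector(cor(query[common, 1], ref_db[common, , drop = FALSE],
                method = "spearman")),
  colnames(ref_db)
)

# Rank treatments from most to least similar to the query
ranked <- sort(scores, decreasing = TRUE)
```

In this toy example trtA ranks first (its values rise with the query on the selected genes) and trtB ranks last (its values fall). Matching on row names is what makes a shared identifier type a requirement: genes that do not match by name simply drop out of the comparison.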

Value

A gessResult object. Its result table contains the search results for each perturbagen in the reference database, ranked by signature similarity to the query.

Column description

Descriptions of the columns specific to the correlation-based GESS method are given below. The additional columns, which are common to all GESS methods, are described in the help file of the gessResult object.

See Also

qSig, gessResult, gess

Examples

db_path <- system.file("extdata", "sample_db.h5", 
                       package = "signatureSearch")
# library(SummarizedExperiment); library(HDF5Array)
# sample_db <- SummarizedExperiment(HDF5Array(db_path, name="assay"))
# rownames(sample_db) <- HDF5Array(db_path, name="rownames")
# colnames(sample_db) <- HDF5Array(db_path, name="colnames")
## get "vorinostat__SKB__trt_cp" signature drawn from sample database
# query_mat <- as.matrix(assay(sample_db[,"vorinostat__SKB__trt_cp"]))
# qsig_sp <- qSig(query = query_mat, gess_method = "Cor", refdb = db_path)
# sp <- gess_cor(qSig=qsig_sp, method="spearman")
# result(sp)

signatureSearch documentation built on April 16, 2021, 6 p.m.