qSig: Helper Function to Construct a 'qSig' Object

View source: R/qSig-methods.R

qSig    R Documentation

Helper Function to Construct a qSig Object

Description

Builds a 'qSig' object that stores the query signature, the reference database and the method to be used in a gene expression signature search (GESS).

Usage

qSig(query, gess_method, refdb)

Arguments

query

If 'gess_method' is 'CMAP' or 'LINCS', the query should be a list with two character vectors named upset and downset for up- and down-regulated gene labels, respectively. The labels should be Entrez gene IDs if the reference database is a pre-built CMAP or LINCS database. If a custom database is used, the labels need to be of the same type as those in the reference database.

If 'gess_method' is 'gCMAP', the query is a matrix with a single column representing gene ranks from a biological state of interest. The corresponding gene labels are stored in the row name slot of the matrix. Instead of ranks one can provide scores (e.g. z-scores). In such a case, the scores will be internally transformed to ranks.

If 'gess_method' is 'Fisher', the query is expected to be a list with two character vectors named upset and downset for up- and down-regulated gene labels, respectively (same as for the 'CMAP' or 'LINCS' method). Internally, the up/down gene labels are combined into a single gene set when querying the reference database with Fisher's exact test. This means the query is performed with an unsigned set. The query can also be a matrix with a single numeric column and the gene labels (e.g. Entrez gene IDs) in the row name slot. The values in this matrix can be z-scores or log fold changes (LFCs). In this case, the actual query gene set is obtained by applying the upper and lower cutoffs that the user sets in the gess_fisher function.
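As a minimal sketch of the matrix form of a 'Fisher' query, the following builds a single-column z-score matrix in base R and shows how a pair of cutoffs would reduce it to an unsigned gene set. The scores, the cutoff values and the variable names are invented for illustration. The actual selection happens inside gess_fisher and may differ in detail (for example in whether the cutoffs are inclusive). The qSig call only runs if signatureSearch is installed.

```r
# Hypothetical z-scores for five Entrez gene IDs (values invented for illustration)
zscores <- c(2.4, -1.9, 0.3, -2.7, 1.1)
qmat_fisher <- matrix(zscores, ncol = 1,
                      dimnames = list(c("230", "5357", "2015", "2542", "1759"),
                                      "treatment"))

# Illustrative cutoffs; in practice they are chosen by the user in gess_fisher
upper <- 1
lower <- -1

# Genes above the upper or below the lower cutoff form one unsigned set
keep <- qmat_fisher[, 1] >= upper | qmat_fisher[, 1] <= lower
query_set <- rownames(qmat_fisher)[keep]
print(query_set)

# Construct the qSig object only when the package is available
if (requireNamespace("signatureSearch", quietly = TRUE)) {
  db_path <- system.file("extdata", "sample_db.h5", package = "signatureSearch")
  qsig_fisher <- signatureSearch::qSig(query = qmat_fisher,
                                       gess_method = "Fisher", refdb = db_path)
}
```

The gene with a score between the two cutoffs ("2015") is dropped, and the direction of regulation is no longer distinguished in the resulting set.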

If 'gess_method' is 'Cor', the query is a matrix with a single numeric column and the gene labels in the row name slot. The numeric column can contain z-scores, LFCs, (normalized) gene expression intensity values or read counts.
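A 'Cor' query could be assembled along the following lines. This is only a sketch: the log fold change values and the column name are made up, and the shape checks are an illustration of the requirements described above, not the validation that qSig itself performs. The qSig call is skipped when signatureSearch is not installed.

```r
# Hypothetical log fold changes for five Entrez gene IDs (values invented)
lfc <- c(1.8, -0.6, 0.2, -2.1, 0.9)
qmat_cor <- matrix(lfc, ncol = 1)
rownames(qmat_cor) <- c("230", "5357", "2015", "2542", "1759")
colnames(qmat_cor) <- "treatment"

# Shape checks matching the description: numeric, one column, labelled rows
stopifnot(is.numeric(qmat_cor),
          ncol(qmat_cor) == 1,
          !is.null(rownames(qmat_cor)))

# Construct the qSig object only when the package is available
if (requireNamespace("signatureSearch", quietly = TRUE)) {
  db_path <- system.file("extdata", "sample_db.h5", package = "signatureSearch")
  qsig_cor <- signatureSearch::qSig(query = qmat_cor,
                                    gess_method = "Cor", refdb = db_path)
}
```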

gess_method

one of 'CMAP', 'LINCS', 'gCMAP', 'Fisher' or 'Cor'

refdb

character(1), can be one of "cmap", "cmap_expr", "lincs", "lincs_expr", "lincs2" when using the CMAP/LINCS databases from the affiliated signatureSearchData package. With 'cmap' the database contains signatures of LFC scores obtained from DEG analysis routines; with 'cmap_expr' normalized gene expression values; with 'lincs' or 'lincs2' z-scores obtained from the DEG analysis methods of the LINCS project; and with 'lincs_expr' normalized expression values.

To use a custom database, refdb should be the file path to the HDF5 file generated with the build_custom_db function. The HDF5 file needs to have the .h5 extension.
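Before passing a custom path as refdb, it can be worth checking the two requirements stated above. The helper below is hypothetical and not part of signatureSearch. It only tests the file name and existence, not whether the file content was actually produced by build_custom_db.

```r
# Hypothetical helper (not part of signatureSearch): checks that a custom
# reference database path is a single string, ends in .h5 and exists on disk.
is_valid_custom_refdb <- function(path) {
  is.character(path) && length(path) == 1 &&
    grepl("\\.h5$", path) && file.exists(path)
}

# Empty placeholder files standing in for real database files
good <- tempfile(fileext = ".h5")
file.create(good)
bad <- tempfile(fileext = ".hdf5")
file.create(bad)

print(is_valid_custom_refdb(good))  # extension and existence both satisfied
print(is_valid_custom_refdb(bad))   # wrong extension, so the check fails
```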

When gess_method is set to 'gCMAP' or 'Fisher', refdb can also be the file path to an HDF5 file converted from a gmt file of gene sets with the gmt2h5 function. For example, the gmt files could come from MSigDB (https://www.gsea-msigdb.org/gsea/msigdb/index.jsp) or GSKB (http://ge-lab.org/#/data).
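To illustrate the gene set route, the sketch below writes a tiny gmt file in base R and then converts it. The two gene sets are invented. The gmt layout used here (set name, description, then gene labels, separated by tabs, one set per line) is the standard one. The argument names gmtfile, dest_h5 and overwrite in the gmt2h5 call are an assumption and should be checked against ?gmt2h5; the conversion is skipped when signatureSearch is not installed.

```r
# Two invented gene sets of Entrez IDs
gene_sets <- list(
  set_A = c("230", "5357", "2015"),
  set_B = c("22864", "9338", "54793", "10384")
)

# One tab-separated line per set: name, description, gene labels
gmt_lines <- vapply(names(gene_sets), function(nm) {
  paste(c(nm, "illustrative set", gene_sets[[nm]]), collapse = "\t")
}, character(1))

gmt_file <- tempfile(fileext = ".gmt")
writeLines(gmt_lines, gmt_file)

# Convert to HDF5 only when the package is available.
# Assumed argument names: gmtfile, dest_h5, overwrite.
if (requireNamespace("signatureSearch", quietly = TRUE)) {
  h5_file <- tempfile(fileext = ".h5")
  signatureSearch::gmt2h5(gmtfile = gmt_file, dest_h5 = h5_file,
                          overwrite = TRUE)
  # h5_file could then be passed as refdb with gess_method "gCMAP" or "Fisher"
}
```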

Value

qSig object

See Also

build_custom_db, signatureSearchData, gmt2h5, qSig-class

Examples

# Path to the small sample database shipped with the package
db_path <- system.file("extdata", "sample_db.h5", 
                       package = "signatureSearch")
# LINCS query: list of up- and down-regulated Entrez gene IDs
qsig_lincs <- qSig(query=list(
                     upset=c("230", "5357", "2015", "2542", "1759"), 
                     downset=c("22864", "9338", "54793", "10384", "27000")), 
                   gess_method="LINCS", refdb=db_path)
# gCMAP query: single-column matrix of scores with gene IDs as row names
qmat <- matrix(runif(5), nrow=5)
rownames(qmat) <- c("230", "5357", "2015", "2542", "1759")
colnames(qmat) <- "treatment"
qsig_gcmap <- qSig(query=qmat, gess_method="gCMAP", refdb=db_path)

yduan004/signatureSearch documentation built on Feb. 19, 2024, 9:30 a.m.