gess_fisher: Fisher Search Method


View source: R/gess_fisher.R

Description

In its iterative form, Fisher's exact test (Upton, 1992) can be used as a Gene Expression Signature (GES) Search method to scan GES databases for entries that are similar to a query GES.

Usage

gess_fisher(
  qSig,
  higher = NULL,
  lower = NULL,
  padj = NULL,
  chunk_size = 5000,
  ref_trts = NULL,
  workers = 1
)

Arguments

qSig

qSig object defining the query signature including the GESS method (should be 'Fisher') and the path to the reference database. For details see help of qSig and qSig-class.

higher

The 'upper' threshold. If not 'NULL', genes with a score larger than or equal to 'higher' will be included in the gene set with sign +1. At least one of 'lower' and 'higher' must be specified.

The higher argument needs to be set to 1 if the refdb in qSig is the path to an HDF5 file that was converted from a gmt file.

lower

The lower threshold. If not 'NULL', genes with a score smaller than or equal to 'lower' will be included in the gene set with sign -1. At least one of 'lower' and 'higher' must be specified.

The lower argument needs to be set to NULL if the refdb in qSig is the path to an HDF5 file that was converted from a gmt file.
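As a rough illustration of how the two thresholds turn a score vector into signed gene label sets, the following base R sketch applies the rules stated above to a made-up named score vector. The names scores, up_set and down_set are hypothetical, and this is not the package's internal code.

```r
# Hypothetical scores (for example log2 fold changes) for six genes
scores <- c(geneA = 2.3, geneB = -1.7, geneC = 0.4,
            geneD = 1.0, geneE = -1.0, geneF = -0.2)

higher <- 1
lower <- -1

# Genes with a score >= higher form the gene set with sign +1
up_set <- names(scores)[scores >= higher]

# Genes with a score <= lower form the gene set with sign -1
down_set <- names(scores)[scores <= lower]

up_set
down_set
```

Because both comparisons are inclusive, geneD (score 1) and geneE (score -1) fall into the up and down sets, respectively.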

padj

numeric(1), cutoff for the adjusted p-value or false discovery rate (FDR) used to define DEGs: genes with a value less than or equal to 'padj' are treated as DEGs. The 'padj' argument is valid only if the reference HDF5 file contains the p-value matrix stored in a dataset named 'padj'.

chunk_size

number of database entries to process per iteration to limit memory usage of search.
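The idea behind chunked processing can be sketched in base R as follows. The entry count n_entries is a made-up number, and the loop body is only a placeholder for the per-chunk search; how the package actually reads and processes each chunk is not shown here.

```r
# Hypothetical number of entries (columns) in a reference database
n_entries <- 12000
chunk_size <- 5000

# Split the column indices into consecutive blocks of at most chunk_size
idx <- seq_len(n_entries)
chunks <- split(idx, ceiling(idx / chunk_size))

# Only one block of columns would need to be held in memory at a time
for (chunk in chunks) {
  # placeholder: load the columns in 'chunk' and score them against the query
  message("processing entries ", min(chunk), " to ", max(chunk))
}

lengths(chunks)
```

A smaller chunk_size lowers peak memory use at the cost of more iterations.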

ref_trts

character vector. To search against a subset of the reference database, set ref_trts to a character vector of column names (treatments) that defines the subset of refdb.

workers

integer(1), number of workers for searching the reference database in parallel; the default is 1.

Details

When Fisher's exact test (Upton, 1992) is used as the GES Search (GESS) method, both the query and the database are composed of gene label sets, such as DEG sets.
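To make the set-based comparison concrete, the sketch below tests the overlap between one query gene set and one reference gene set with base R's fisher.test. The gene labels, the size of the gene universe and the one-sided alternative are assumptions chosen for illustration; the package's own implementation may build and evaluate the test differently.

```r
# Hypothetical gene universe and two gene label sets
universe  <- paste0("g", 1:20)
query_set <- paste0("g", 1:6)
ref_set   <- paste0("g", 4:10)

# 2x2 contingency table of set membership (matrix is filled column-wise)
tab <- matrix(
  c(length(intersect(query_set, ref_set)),                   # in both sets
    length(setdiff(query_set, ref_set)),                     # query only
    length(setdiff(ref_set, query_set)),                     # reference only
    length(universe) - length(union(query_set, ref_set))),   # in neither set
  nrow = 2,
  dimnames = list(ref = c("in", "out"), query = c("in", "out"))
)

# One-sided test for enrichment of the overlap
ft <- fisher.test(tab, alternative = "greater")
ft$p.value
log(ft$estimate)  # log odds ratio of the overlap
```

Repeating such a test for every entry in a reference database, and ranking the entries by the resulting statistics, corresponds to the iterative use described above.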

Value

gessResult object. Its result table contains the search results for each perturbagen in the reference database, ranked by signature similarity to the query.

Column description

Descriptions of the columns specific to the Fisher method are given below. Note that the additional columns, which are common to all GESS methods, are described in the help file of the gessResult object.

References

Graham J. G. Upton. 1992. Fisher's Exact Test. J. R. Stat. Soc. Ser. A Stat. Soc. 155 (3). [Wiley, Royal Statistical Society]: 395-402. URL: http://www.jstor.org/stable/2982890

See Also

qSig, gessResult, gess

Examples

db_path <- system.file("extdata", "sample_db.h5", 
                       package = "signatureSearch")
# library(SummarizedExperiment); library(HDF5Array)
# sample_db <- SummarizedExperiment(HDF5Array(db_path, name="assay"))
# rownames(sample_db) <- HDF5Array(db_path, name="rownames")
# colnames(sample_db) <- HDF5Array(db_path, name="colnames")
## get "vorinostat__SKB__trt_cp" signature drawn from sample database
# query_mat <- as.matrix(assay(sample_db[,"vorinostat__SKB__trt_cp"]))
# qsig_fisher <- qSig(query=query_mat, gess_method="Fisher", refdb=db_path)
# fisher <- gess_fisher(qSig=qsig_fisher, higher=1, lower=-1)
# result(fisher)

signatureSearch documentation built on April 16, 2021, 6 p.m.