runTBsigProfiler: Run TB gene signature profiling.

View source: R/profile.R

Description

This function runs gene signature profiling on an input gene expression dataset, using a chosen subset of the signatures listed in TBsignatures and one or more specified scoring algorithms. The resulting signature scores can be compared using the visualization tools also provided in the TBSignatureProfiler package.

Usage

runTBsigProfiler(
  input,
  useAssay = NULL,
  signatures = NULL,
  algorithm = c("GSVA", "ssGSEA", "ASSIGN", "PLAGE", "Zscore", "singscore"),
  combineSigAndAlgorithm = FALSE,
  assignDir = NULL,
  outputFormat = NULL,
  parallel.sz = 0,
  ASSIGNiter = 1e+05,
  ASSIGNburnin = 50000
)

Arguments

input

an input data object of the class SummarizedExperiment, data.frame, or matrix containing gene expression data. Required.

useAssay

a character string specifying the assay to use for signature profiling. Only relevant when input is a SummarizedExperiment; in that case, if useAssay is NULL, the "counts" assay is used. The default is NULL.

signatures

a list of signatures to run with their associated genes. This list should be in the same format as TBsignatures, which is included in the TBSignatureProfiler package. If signatures = NULL, the default TBsignatures list is used. For details, run ?TBsignatures. The default is NULL.

algorithm

a character vector of algorithms to run, or a single character string if only one is desired. The default is c("GSVA", "ssGSEA", "ASSIGN", "PLAGE", "Zscore", "singscore").

combineSigAndAlgorithm

logical. Not supported if input is a SummarizedExperiment object (in which case the default is TRUE). For a matrix or data frame: if TRUE, the row names will be of the form <algorithm>_<signature>; if FALSE, there will be a column named 'algorithm' that lists which algorithm was used and a column named 'pathway' that lists the signature profiled; if NULL and only one algorithm was used, the algorithm will not be listed. The default is FALSE.

assignDir

a character string naming a directory to save intermediate ASSIGN results if algorithm specifies "ASSIGN". The default is NULL, in which case intermediate results will not be saved.

outputFormat

a character string specifying the output data format. Possible values are "SummarizedExperiment", "matrix", or "data.frame". The default is to return the same type as the input object.

parallel.sz

an integer giving the number of processors to use when running the calculations in parallel for the GSVA and ssGSEA algorithms. If parallel.sz = 0, all cores are used. The default is 0.

ASSIGNiter

an integer indicating the number of iterations to use in the MCMC for the ASSIGN algorithm. The default is 100,000.

ASSIGNburnin

an integer indicating the number of burn-in iterations to use in the MCMC for the ASSIGN algorithm. These iterations are discarded when computing the posterior means of the model parameters. The default is 50,000.
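
The two ASSIGN settings above control an MCMC run, so the defaults can be slow on anything but small inputs. A minimal sketch of a quick smoke test follows, assuming the ASSIGN package is installed. The toy data mirror the Examples section, the seed is arbitrary, and the iteration counts of 100 and 50 are illustrative assumptions that are far too small for real inference.

```r
library(TBSignatureProfiler)

set.seed(1)  # assumption: arbitrary seed so the toy data are reproducible
samples <- paste0("sample", seq_len(10))
toy <- as.data.frame(rbind(
  # 16 signature genes: samples 1-5 low, samples 6-10 shifted up by 5
  matrix(c(rnorm(80), rnorm(80) + 5), 16, 10,
         dimnames = list(TBsignatures$Zak_RISK_16, samples)),
  # 100 background genes with no signal
  matrix(rnorm(1000), 100, 10,
         dimnames = list(paste0("gene", seq_len(100)), samples))
))

# Short chain for a smoke test only; keep the burn-in below ASSIGNiter.
# assignDir is left at NULL, so intermediate ASSIGN results are not saved.
res_assign <- runTBsigProfiler(input = toy,
                               signatures = TBsignatures["Zak_RISK_16"],
                               algorithm = "ASSIGN",
                               ASSIGNiter = 100,
                               ASSIGNburnin = 50)
res_assign
```

For a real analysis, the documented defaults (100,000 iterations with 50,000 burn-in) or other values chosen after checking convergence would be more appropriate.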

Value

A SummarizedExperiment object, data.frame, or matrix of signature profiling results. The returned object will be of the format specified in outputFormat. If input is a SummarizedExperiment and outputFormat = "SummarizedExperiment", then the output will retain any input information stored in the input colData. In general, if outputFormat = "SummarizedExperiment" then columns in the colData will include the scores for each desired signature with samples on the rows. If input is a data.frame or matrix, then the returned object will have signatures on the rows and samples on the columns.
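
As a hedged illustration of the return types described above, the sketch below passes a data.frame and requests a matrix back through outputFormat. The toy data follow the Examples section; the seed and the object names toy and res_mat are assumptions made for this sketch.

```r
library(TBSignatureProfiler)

set.seed(1)  # assumption: arbitrary seed so the toy data are reproducible
samples <- paste0("sample", seq_len(10))
toy <- as.data.frame(rbind(
  # 16 signature genes: samples 1-5 low, samples 6-10 shifted up by 5
  matrix(c(rnorm(80), rnorm(80) + 5), 16, 10,
         dimnames = list(TBsignatures$Zak_RISK_16, samples)),
  # 100 background genes with no signal
  matrix(rnorm(1000), 100, 10,
         dimnames = list(paste0("gene", seq_len(100)), samples))
))

# data.frame in, matrix out: outputFormat overrides the default of
# returning the same type as the input.
res_mat <- runTBsigProfiler(input = toy,
                            signatures = TBsignatures["Zak_RISK_16"],
                            algorithm = "GSVA",
                            outputFormat = "matrix",
                            parallel.sz = 1)

class(res_mat)  # inspect the returned type
dim(res_mat)    # signatures on the rows, samples on the columns
```

Only one algorithm is requested here so that the result is a plain score matrix; with several algorithms the layout also depends on combineSigAndAlgorithm.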

Source

Profiling for the Z-score, PLAGE, GSVA, and ssGSEA algorithms is conducted with the Bioconductor GSVA package. Profiling for the singscore algorithm is conducted with the Bioconductor singscore package.

References

Barbie, D.A., Tamayo, P., Boehm, J.S., Kim, S.Y., Moody, S.E., Dunn, I.F., Schinzel, A.C., Sandy, P., Meylan, E., Scholl, C., et al. (2009). Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature 462, 108-112. doi: 10.1038/nature08460.

Foroutan, M. et al. (2018). Single sample scoring of molecular phenotypes. BMC Bioinformatics, 19. doi: 10.1186/s12859-018-2435-4.

Lee, E. et al. (2008). Inferring pathway activity toward precise disease classification. PLoS Comp Biol, 4(11):e1000217. doi: 10.1371/journal.pcbi.1000217.

Shen, Y. et al. (2015). ASSIGN: context-specific genomic profiling of multiple heterogeneous biological pathways. Bioinformatics, 31, 1745-1753. doi: 10.1093/bioinformatics/btv031.

Subramanian, A. et al. (2005). Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. PNAS, 102, 15545-15550. doi: 10.1073/pnas.0506580102.

Tomfohr, J. et al. (2005). Pathway level analysis of gene expression using singular value decomposition. BMC Bioinformatics, 6:225. doi: 10.1186/1471-2105-6-225.

Examples

## Using a data.frame input/output
 # Create some toy data to test the Zak_RISK_16 signature, using five samples with
 # low expression and five samples with high expression of the signature's genes.
df_testdata <- as.data.frame(rbind(matrix(c(rnorm(80), rnorm(80) + 5), 16, 10,
                             dimnames = list(TBsignatures$Zak_RISK_16,
                             paste0("sample", seq_len(10)))),
                      matrix(rnorm(1000), 100, 10,
                             dimnames = list(paste0("gene", seq_len(100)),
                             paste0("sample", seq_len(10))))))
res <- runTBsigProfiler(input = df_testdata,
                        signatures = TBsignatures["Zak_RISK_16"],
                        algorithm = c("GSVA", "ssGSEA"),
                        combineSigAndAlgorithm = FALSE,
                        parallel.sz = 1)
subset(res, res$pathway == "Zak_RISK_16")

## Using a SummarizedExperiment input/output
 # The TB_indian SummarizedExperiment data is included in the package.
GSVA_res <- runTBsigProfiler(input = TB_indian,
                             useAssay = "logcounts",
                             signatures = TBsignatures["Zak_RISK_16"],
                             algorithm = c("GSVA"),
                             combineSigAndAlgorithm = FALSE,
                             parallel.sz = 1)
GSVA_res$Zak_RISK_16
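
## Using a custom signature list and both combineSigAndAlgorithm settings

The signatures and combineSigAndAlgorithm arguments described under Arguments can be sketched together as follows. This is an illustration under stated assumptions, not part of the package's own examples: toy_sig_10 is a hypothetical signature made of ten background genes from the toy data, and the seed and object names are arbitrary.

```r
library(TBSignatureProfiler)

set.seed(42)  # assumption: arbitrary seed so the toy data are reproducible
samples <- paste0("sample", seq_len(10))
background <- paste0("gene", seq_len(100))
toy <- as.data.frame(rbind(
  # 16 signature genes: samples 1-5 low, samples 6-10 shifted up by 5
  matrix(c(rnorm(80), rnorm(80) + 5), 16, 10,
         dimnames = list(TBsignatures$Zak_RISK_16, samples)),
  # 100 background genes with no signal
  matrix(rnorm(1000), 100, 10, dimnames = list(background, samples))
))

# Same format as TBsignatures: a named list of character vectors of genes.
# toy_sig_10 is hypothetical and carries no biological meaning.
my_signatures <- c(TBsignatures["Zak_RISK_16"],
                   list(toy_sig_10 = background[1:10]))

# TRUE: one row per algorithm/signature pair, identified by the row names.
res_combined <- runTBsigProfiler(input = toy,
                                 signatures = my_signatures,
                                 algorithm = c("GSVA", "ssGSEA"),
                                 combineSigAndAlgorithm = TRUE,
                                 parallel.sz = 1)
rownames(res_combined)

# FALSE: the algorithm and signature are listed in their own columns.
res_long <- runTBsigProfiler(input = toy,
                             signatures = my_signatures,
                             algorithm = c("GSVA", "ssGSEA"),
                             combineSigAndAlgorithm = FALSE,
                             parallel.sz = 1)
res_long[, c("algorithm", "pathway")]
```

The combined form is convenient when every row needs a unique identifier, while the separate 'algorithm' and 'pathway' columns make it easier to filter with subset(), as in the first example.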

TBSignatureProfiler documentation built on Nov. 8, 2020, 6:56 p.m.