compDrugMethods: Function to compare drug repurposing techniques.


View source: R/compDrugMethods.R

Description

This function operates like the rankDrugsGwc function but allows users to compare multiple drug repurposing methods. Users can compare different methods, compare the results of the same method run with different parameters, or run the package's main function on different data sets. This is accomplished by passing in lists instead of vectors or single values, where the nth list element is used in the nth analysis. Note that if one is interested in keeping the variables the same but trying different methods, only the drugScoreMeth variable needs to be in list format. If some variables are lists and others are not, or are shorter lists, the supplied value or the last list element is reused in the remaining analyses defined by the longer lists. A data frame is returned that contains the rank of each drug under each method and the average rank of each drug across the methods.
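
The recycling rule described above can be illustrated with a minimal sketch. The helper pickArg is hypothetical and is not part of CMapBox; it only mirrors the documented behaviour, assuming that a non-list argument is reused unchanged and that a list shorter than numAnalyses falls back to its last element.

```r
# Hypothetical helper (not a CMapBox function): choose the value of an
# argument for analysis number i, following the documented recycling rule.
pickArg <- function(arg, i) {
  if (!is.list(arg)) {
    return(arg)                  # plain value or vector: reused in every analysis
  }
  arg[[min(i, length(arg))]]     # list: i-th element, or the last one if the list is shorter
}

drugScoreMeth <- list("gwc", "fgsea")   # differs between analyses
cutOff        <- 0.05                   # shared by all analyses
genesToStart  <- list(0.20, 0.10)

# Three analyses: the third reuses the last element of each shorter list.
for (i in 1:3) {
  cat(i, pickArg(drugScoreMeth, i), pickArg(cutOff, i),
      pickArg(genesToStart, i), "\n")
}
```

Under these assumptions the third analysis would use "fgsea", 0.05, and 0.10.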

Usage

compDrugMethods(numAnalyses, geneIds, geneEsts, drugPert, pharmSet, pvals,
  drugScoreMeth, pCut = TRUE, cutOff = 0.05, genesToStart = 0.2,
  numbIters = 10, gwcMethod = "spearman", numbPerms = 1000,
  volcPlotEsts = NULL, drugEst = TRUE, extraData = NA,
  extraCut = NA, extraDirec = NA, drugNameVec = NULL)

Arguments

numAnalyses

an integer specifying how many analyses are intended/the max size of one of the lists passed to the function. Note that not providing

geneIds

a list of character vectors containing the gene symbols, Ensembl IDs, or Entrez IDs for the genes to be analyzed in each analysis

geneEsts

a list of numeric vectors containing estimates for the difference in expression between the two phenotypes in each analysis. Note that positive values of this estimate (t-stat, logFC, etc.) must correspond to higher gene expression in the phenotype one would like to reverse

drugPert

a drug perturbation signature (object of class PharmacoSig) with rownames that correspond to the Ensembl IDs of the genes

pharmSet

the PharmacoSet used to generate the drug perturbation signature. Supplying this will add additional info about the drugs in the ranking tables.

pvals

a list of numeric vectors containing p values from a t-test assessing the differential expression of the genes between the two phenotypes in each analysis

drugScoreMeth

a list of strings specifying which drug repurposing technique to use to score the drugs during each trial in an analysis. The options for this parameter are currently "gwc" and "fgsea". Default is "gwc"

pCut

a list of booleans for each analysis specifying whether to use p values to remove insignificant genes from the analysis (TRUE) or to remove genes from the analysis according to their supplied gene estimates (FALSE)

cutOff

if pCut is TRUE then this value represents the p value threshold used to filter out genes. If pCut is FALSE then this value represents the fraction of genes present in both the data and drug perturbation signature with the top absolute value of gene estimates that will be left in the analysis. cutOff should be between 0 and 1.

genesToStart

a list of values between 0 and 1 representing the fraction of significant genes present in the data and drug perturbation signature to use in the first iteration of each analysis. Recommended to be at least 0.15 to avoid large changes during the early iterations due to having too few genes present.

numbIters

a list of integers specifying the number of iterations that will occur in each analysis. numbIters will set the rate at which genes are added to the analysis.

gwcMethod

a list of strings specifying which method to use when computing correlations in the gwc function in each analysis (if gwc is being used). The options are spearman (default) or pearson.

numbPerms

a list of integers specifying the number of permutations to be used to compute the p value in the drug repurposing methods which use permutation tests to calculate p values.

volcPlotEsts

a list of numeric vectors containing gene estimates to be used for volcano plots. If not provided, geneEsts will be used. However, if one uses t-stats for geneEsts in the gwc analysis and wants volcano plots, it is recommended to supply another estimate (e.g. logFC) here because of the high correlation between t-stats and p values.

drugEst

a list of booleans for each analysis specifying whether to use the estimates for each gene of the drug perturbation signature in the gwc calculation (TRUE) or to use the t-stats for each gene in the drug perturbation signature in the calculations (FALSE). Default is TRUE

extraData

a list of data.frames to be used in each analysis, where each column holds values one would like to inspect to decide whether the corresponding genes should be removed when they meet the conditions specified in extraCut and extraDirec. Useful if one would like to remove genes based on logFC or other values. extraData must have the same number of rows as the length of the vectors geneIds, geneEsts, and pvals.

extraCut

a list of numeric vectors, with the nth value corresponding to the nth column of the m by n data.frame supplied in extraData. Each value is the threshold that the corresponding column of the data frame is compared against to decide whether an id is kept or removed. Whether removal occurs when a value is greater or less than the threshold is specified in the extraDirec variable.

extraDirec

a list of boolean vectors, where TRUE means that ids whose value is greater than the corresponding threshold in extraCut will be removed and FALSE means that ids whose value is less than that threshold will be removed.

drugNameVec

a character vector specifying the names of the drugs to run through the pipeline. Useful for saving time: one can first analyze all the drugs with a small number of permutations, then rerun only the top drugs with more permutations to get more reliable p values. Default is NULL and all drugs are tested.
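
The gene filtering arguments (pCut, cutOff, extraData, extraCut, extraDirec) can be read as the rough sketch below. This is not the package's implementation. The functions filterGenes and extraKeep and the columns absLogFC and aveExpr are hypothetical, the use of strict inequalities is an assumption, and the genes are assumed to be restricted already to those present in the drug perturbation signature.

```r
# Hypothetical sketch of the pCut / cutOff rule (not CMapBox code).
# Returns the indices of the genes that stay in the analysis.
filterGenes <- function(geneEsts, pvals, pCut = TRUE, cutOff = 0.05) {
  stopifnot(length(geneEsts) == length(pvals), cutOff >= 0, cutOff <= 1)
  if (pCut) {
    keep <- which(pvals < cutOff)                # p value threshold
  } else {
    nKeep <- ceiling(cutOff * length(geneEsts))  # top fraction by |estimate|
    keep <- order(abs(geneEsts), decreasing = TRUE)[seq_len(nKeep)]
  }
  sort(keep)
}

# Hypothetical sketch of the extraData / extraCut / extraDirec rule.
# Returns a logical vector, TRUE for rows (genes) that are kept.
extraKeep <- function(extraData, extraCut, extraDirec) {
  stopifnot(ncol(extraData) == length(extraCut),
            ncol(extraData) == length(extraDirec))
  keep <- rep(TRUE, nrow(extraData))
  for (j in seq_len(ncol(extraData))) {
    vals <- extraData[[j]]
    # TRUE in extraDirec: remove values above the threshold; FALSE: below it
    remove <- if (extraDirec[j]) vals > extraCut[j] else vals < extraCut[j]
    keep <- keep & !remove
  }
  keep
}

ests <- c(2.5, -0.3, -3.1, 0.8, 1.2)
p    <- c(0.01, 0.70, 0.001, 0.30, 0.04)
filterGenes(ests, p, pCut = TRUE,  cutOff = 0.05)  # 1 3 5
filterGenes(ests, p, pCut = FALSE, cutOff = 0.4)   # 1 3

extra <- data.frame(absLogFC = c(1.8, 0.2, 2.4, 0.9),
                    aveExpr  = c(7.1, 6.5, 12.3, 3.0))
# Drop genes with absLogFC below 1 or aveExpr above 10.
extraKeep(extra, extraCut = c(1, 10), extraDirec = c(FALSE, TRUE))
# TRUE FALSE FALSE FALSE
```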

Value

a data frame with information about the drugs in each analysis, including each drug's final scores and its average final score across the different methods.
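
As a rough illustration of how per-method ranks and an average rank could be combined into one table, consider the sketch below. The scores, drug names, and column names are made up, and it assumes that a higher score means a better (lower) rank, which may not match how each method in the package orders its scores.

```r
# Made-up final scores for three drugs under two methods (illustration only).
scores <- data.frame(
  drug  = c("drugA", "drugB", "drugC"),
  gwc   = c(0.42, -0.10, 0.31),
  fgsea = c(1.9, 0.4, 2.3)
)
methodCols <- c("gwc", "fgsea")

# Rank within each method; assumption: the highest score gets rank 1.
ranks <- sapply(scores[methodCols],
                function(s) rank(-s, ties.method = "average"))
colnames(ranks) <- paste0(methodCols, "Rank")

# One row per drug: rank under each method plus the average rank.
result <- data.frame(drug = scores$drug, ranks, avgRank = rowMeans(ranks))
result[order(result$avgRank), ]
```

Here drugA and drugC tie with an average rank of 1.5, and drugB has an average rank of 3.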

Examples

data("geneDataGwc")
data("drugPertEx")
data("psetSub")
# Compare gwcCmapBox, fgsea, and xsum analyses that share the same parameters,
# plus a second gwcCmapBox analysis with different parameters
compareResults <- compDrugMethods(
  4,
  geneIds = geneDataGwc$symbol,
  geneEsts = geneDataGwc$t,
  drugPert = drugPertEx,
  pharmSet = psetSub,
  pvals = geneDataGwc$P.Value,
  drugScoreMeth = list("gwcCmapBox", "fgsea", "xsum", "gwcCmapBox"),
  pCut = list(TRUE, TRUE, TRUE, FALSE),
  cutOff = list(0.05, 0.05, 0.05, 0.10),
  genesToStart = list(0.20, 0.20, 0.20, 0.10),
  numbIters = 5,
  gwcMethod = "spearman",
  numbPerms = 1000,
  volcPlotEsts = NULL,
  drugEst = TRUE,
  extraData = NA,
  extraCut = NA,
  extraDirec = NA,
  drugNameVec = NULL
)

bhklab/CMapBox documentation built on Nov. 6, 2019, 8:07 p.m.