library(BenCScore)
This is a quick start guide to using the Bayesian EstimatioN and Compound Scoring (BenCScore) package for high-throughput screening data analysis. Practical use of the package is demonstrated by analyzing a fluorescence polarization screen for inhibitors of the p53-MDMX high-throughput screen conducted at the Broad Institute 2016.
data(mdmx)
This package requires data with columns:
Checking the example mdmx data
head(mdmx)
The screen can be checked for valid names with
checkScreen(mdmx)
The entire screening campaing can be visualized by column-wise plots. The value of the y-axis in the plot is mili-fluorescence polarization. Across the screen there is good separation between the positive and neutral controls (except in the case of the confirmation-in-dose, 7208-01-W01-02-08). A positional effect is observable with mili-fluorescence polarization generally increasing with column number.
columnPlot(mdmx)
The campaign can also be visualized by a scatter plot of replicates. In this plot the positional effect is much more obvious as spurious correlations between in the positive controls and the inactive compounds.
primary <- mdmx[mdmx$Run_Intent == 'PRIMARYSCREEN', ] scatterPlot(primary)
In general normalization must be applied to plates to compare between them and across runs. On 384-well plates there are too few neutral control wells (typically 32) to accurately estimate the standard deviation. For this reason we prefer normalizing plates to percent effect of controls. This function overwrites the Value column stored in the screen.
mdmx <- percentEffect(mdmx)
Plate patterns can be visualized using the wellHeatMap function and positional effects can be fixed using the fixPattern function.
wellHeatMap(mdmx)
#obviously the concentration-response experiments will have positional patterns #7208-01-W01-02-08 should not be altered. primary <- mdmx[mdmx$Run_Intent !='CONFIRMATION_IN_DOSE', ] confirmation <- mdmx[mdmx$Run_Intent == 'CONFIRMATION_IN_DOSE', ] primary <- fixPattern(primary)
Again the replicate scatter plot can be visualized. Note the change in axes. In this plot the positive controls are gaussain scattered on both replicates as are the inactive compounds.
scatterPlot(primary)
The Bayesian model, robust z-score, and RVM t-test can be added to the screening data by simply calling the applyModels function. This step can be time consuming (~12 hours on a 2GHz i7 ).
primary <- addModels(primary, alternative='less', ROPE=-14.28, c=-14.28)
To decide what value to use for the ROPE and c we propose.
For example the p53-MDMX screen was run at 12.5 uM. We decided to call the maximum acceptable EC$_{50}$ 75 uM, based on the maximum concentration we were going to use in our secondary concentration-response screen. From those metrics the autoRope function preforms steps 1-3 to calculate a ROPE value for the Bayesian method.
autoRope(assay.conc = 12.5, max.conc = 75)
Pay attention: the units of autoRope are percent effect and the sign is wrong! Currently the user must provide the correct ROPE to the addModels function.
primary <- fread('~/Documents/Projects/Bayesian Score/7208/Compound Selection/primary.mdmx.analysis.csv')
Hits can be selected from any available model. To select hits from the Bayesian model rank order by the Alternate_Probability_t. These hits can be visualized by their primary screen replicate scatter again by calling the scatterPlot function. Notice how the compounds in the "wings" are not selected.
compounds <- primary[primary$Well_Type == 'compound', ] compounds <- compounds[order(compounds$Alternate_Probability_t, decreasing=TRUE), ] #selecting the top 100 hits hit_ids <- unique(compounds$Compound_ID)[1:100] #plot selection scatterPlot(compounds[compounds$Compound_ID %in% hit_ids, ])
The Bayesian method is able to provide an estimate of how many false negatives a selection leaves behind. The number of false negatives can be found by summing the alternate probability of the compounds not selected, as shown below.
#deduplicate compounds compounds <- compounds[!duplicated(compounds$Compound_ID), ] #The estimated number of false negatives in the current hit selection sum(compounds$Alternate_Probability_t[!(compounds$Compound_ID %in% hit_ids)], na.rm=TRUE)
After hits have been selected and retested in concentration-response curves can be fit using the screenFit function.
confirmation <- screenFit(confirmation)
cpd <- confirmation[confirmation$Well_Type == 'compound', ] cpd <- cpd[cpd$Compound_ID != '', ] cpd <- cpd[order(cpd$IC50), ] id <- unique(cpd$Compound_ID) cpd <- split(cpd, cpd$Compound_ID) curves <- lapply(id, function(x) plotCurves(cpd[[x]]))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.