GRfit: Extract GR parameters from a dataset

Description Usage Arguments Details Value Note Author(s) References See Also Examples

View source: R/GRfit.R

Description

This function takes in a dataset with information about concentration, cell counts over time, and additional grouping variables for a dose-response assay and calculates growth-rate inhibition (GR) metrics as well as traditional metrics (IC50, Emax, etc.) for each experiment in the dataset. The data must be in a specific format: either that specified by case "A" or case "C" described in the details below.

Usage

1
GRfit(inputData, groupingVariables, case = "A", force = FALSE, cap = FALSE)

Arguments

inputData

a data table in one of the specified formats (Case A or Case C). See details below for description. See data(inputCaseA) or data(inputCaseC) for example input data frames. See help files for inputCaseA and inputCaseC for description of these examples.

groupingVariables

a vector of column names from inputData. All of the columns in inputData except for those identified here will be averaged over.

case

either "A" or "C", indicating the format of the input data. See below for descriptions of these formats.

force

a logical value indicating whether to attempt to "force" a sigmoidal fit, i.e. whether to allow fits with F-test p-values greater than .05

cap

a logical value indicating whether to cap GR values (or relative cell counts) at 1. If true, all values greater than 1 will be set to 1.

Details

Calculation of GR values is performed by the function .GRcalculate according to the "Online Methods" section of Hafner and Niepel et al. (2016, http://dx.doi.org/10.1038/nmeth.3853).

The fitting of the logistic curve is performed by the .GRlogisticFit function, which calls the drm function from the drc package to solve for the curve parameters. The GR curve fit function is given by f(c) = GRinf + (1 - GRinf)/(1 + (c/GEC50)^h_GR) where c is concentration. The fit is performed under following constraints: h_GR in [.1, 5], GRinf in [-1, 1], and GEC50 in [min(c)*1e-2, max(c)*1e2] (c is concentration). The initial conditions for the fitting algorithm are h_GR = 2, GRinf = 0.1 and GEC50 = median(c). The fitting of the traditional dose response curve is done using the same formula, replacing GRinf with Einf, GEC50 with EC50, and h_GR with h. The fit is performed on the relative cell counts instead of GR values. Also, since the traditional dose response curve is bounded between 0 and 1 whereas the GR dose response curve is bounded between -1 and 1, we restrict Einf to the range [0, 1].

The parameters of the GR dose response curves (and traditional dose response curves) for each experiment are fitted separately. An F-test is used to compare the sigmoidal fit to a flat line fit. If the p-value of the F-test is less than .05, the sigmoidal fit is accepted. If the p-value is greater than or equal to .05, a flat horizontal line fit is given, with y equal to the mean of the GR values (or relative cell counts in the case of the traditional dose response curve). For each flat fit, GEC50 (or EC50) is set to 0, h_GR (or h) is set to 0.01, GRinf (or Einf) is set to the y value of the flat fit, and GR50 (or IC50) is set to +/-Inf depending on whether GRinf (or Einf) is greater or less than .5.

The mandatory columns for inputData for Case "A" are the following as well as other grouping columns.

1. concentration - column with concentration values (not log transformed) of the perturbagen on which dose-response curves will be evaluated

2. cell_count - column with the measure of cell number (or a surrogate of cell number) after treatment

3. cell_count__time0 - column with initial (Time 0) cell counts - the measure of cell number in untreated wells grown in parallel until the time of treatment

4. cell_count__ctrl - column with the Control cell count: the measure of cell number in control (e.g. untreated or DMSO-treated) wells from the same plate

All other columns will be treated as additional keys on which the data will be grouped (e.g. cell_line, drug, time, replicate)

The mandatory columns for inputData for Case "C" are the following as well as other grouping columns.

1. concentration - column with concentration values (not log transformed) of the perturbagen on which dose-response curves will be evaluated

2. cell_count - column with the measure of cell number (or a surrogate of cell number)

3. time - column with the time at which a cell count is observed

All other columns will be treated as additional keys on which the data will be grouped (e.g. cell_line, drug, replicate)

GR values and dose-response curves/metrics can also be computed using division times for (untreated) cell lines in the place of time zero cell counts, using the first formula in the Supplement of Hafner et al. (2017, http://dx.doi.org/10.1038/nbt.3882).

To use division rate instead of initial cell count, inputData should not have any initial cell counts (i.e. For Case "A", no "cell_count__time0" column. For Case "C", no values of 0 in the "time" column) and should instead have two columns "treatment_duration" and "division_time".

In the first column, "treatment duration", one should have the duration of the assay between time of treatment and the final cell counts (e.g. 72 for hours in a typical 3-day assay). In the second column, "division_time", one should have the time it takes for one cell doubling to occur in each (untreated) cell line used under the conditions of the experiment. These two columns must contain numbers (no units), but need to refer to the same units (e.g. hours). In most cases, all experiments of a particular cell line would have the same "division_time", however if the division rate of untreated cells varied on another parameter, for example seeding density, it would be appropriate to measure and input division times based on cell line/seeding density pairs.

Value

A SummarizedExperiment object containing GR metrics (GR50, GRmax, etc.) and traditional metrics (IC50, Emax, etc.) as well as goodness of fit measures is returned. The object also contains, in its metadata, a table of the original data converted to the style of "Case A" (with calculated GR values and relative cell counts for each row) and a vector of the grouping variables used for the calculation.

Note

To see the underlying code, use (getAnywhere(.GRlogistic_3u)), (getAnywhere(.rel_cell_logistic_3u)), (getAnywhere(.GRcalculate)), and (getAnywhere(.GRlogisticFit))

Author(s)

Nicholas Clark

References

Hafner, M., Niepel, M., Chung, M., and Sorger, P.K., "Growth Rate Inhibition Metrics Correct For Confounders In Measuring Sensitivity To Cancer Drugs". Nature Methods 13.6 (2016): 521-527. http://dx.doi.org/10.1038/nmeth.3853

Hafner, M., Niepel, M., Sorger, P.K., "Alternative drug sensitivity metrics improve preclinical cancer pharmacogenomics". Nature Biotechnology 35.6 (2017): 500-502. http://dx.doi.org/10.1038/nbt.3882

See Also

See drm for the general logistic fit function that solves for the parameters GRinf, GEC50, and h_GR. See drmc for options of this function. Use the functions GRdrawDRC, GRbox, and GRscatter to create visualizations using the output from this function. For online GR calculator and browser, see http://www.grcalculator.org.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
# Load Case A (example 1) input
data("inputCaseA")
head(inputCaseA)
# Run GRfit function with case = "A"
output1 = GRfit(inputData = inputCaseA,
groupingVariables = c('cell_line','treatment','replicate',
'time'))
# Overview of SummarizedExperiment output data
output1
## Not run: 
# View GR metrics table
View(GRgetMetrics(output1))
# View descriptions of each metric (or goodness of fit measure)
View(GRgetDefs(output1))
# View table of original data (converted to style of Case A) with GR values
# and relative cell counts
View(GRgetValues(output1))
# View vector of grouping variables used for calculation
GRgetGroupVars(output1)

## End(Not run)
# Load Case C (example 4) input
# Same data, different format
data("inputCaseC")
head(inputCaseC)
output4 = GRfit(inputData = inputCaseC,
groupingVariables = c('cell_line','treatment', 'replicate',
'time'),
case = "C")
# Extract data tables and export to .tsv or .csv
## Not run: 
# Write GR metrics parameter table to tab-separated text file
write.table(GRgetMetrics(output1), file = "filename.tsv", quote = FALSE,
sep = "\t", row.names = FALSE)
# Write original data plus GR values to comma-separated file
write.table(GRgetValues(output1), file = "filename.csv", quote = FALSE,
sep = ",", row.names = FALSE)

## End(Not run)

uc-bd2k/GRmetrics documentation built on Nov. 11, 2020, 4:10 p.m.