seqBoundariesGrid: Evaluate expected utility for parametric sequential stopping...
In gaga: GaGa hierarchical model for high-throughput data analysis


Estimate the expected utility for sequential boundaries parameterized by (b0,b1). Expected utility is estimated on a grid of (b0,b1) values based on a forward simulation output such as that generated by the function forwsimDiffExpr.

seqBoundariesGrid(b0, b1, forwsim, samplingCost, powmin = 0, f = "linear", ineq = "less")

`b0`	Vector with b0 values. Expected utility is evaluated for a grid defined by all combinations of (b0,b1) values.
`b1`	Vector with b1 values.
`forwsim`	`data.frame` with forward simulation output, such as that returned by the function `forwsimDiffExpr`. It must have columns named `simid`, `time`, `u`, `fdr`, `fnr`, `power` and `summary`. See `forwsimDiffExpr` for details on the meaning of each column.
`samplingCost`	Cost of obtaining one more data batch, in terms of the number of new truly differentially expressed discoveries that would make it worthwhile to obtain one more data batch.
`powmin`	Constraint on power. Optimization chooses the optimal `b0`, `b1` satisfying power >= `powmin` (if such a pair exists).
`f`	Parametric form for the stopping boundary. Currently only 'linear' and 'invsqrt' are implemented. For 'linear', the boundary is `b0+b1*time`. For 'invsqrt', the boundary is `b0+b1/sqrt(time)`, where time is the sample size measured as number of batches.
`ineq`	For `ineq=='less'` the trial stops when `summary` is below the stopping boundary. This is appropriate whenever `summary` measures the potential benefit of obtaining one more data batch. For `ineq=='greater'` the trial stops when `summary` is above the stopping boundary. This is appropriate whenever `summary` measures the potential costs of obtaining one more data batch.
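
The two boundary forms and the two stopping rules described for `f` and `ineq` can be sketched in base R as follows. The helpers `boundary` and `stopTime` are hypothetical and are not part of gaga. The sketch assumes a strict inequality and assumes that a trial which never crosses the boundary runs to its last recorded time point; the package may treat either case differently.

```r
# Hypothetical helper (not a gaga function): value of the stopping boundary
# at each time point, for the two parametric forms described above.
boundary <- function(time, b0, b1, f = c("linear", "invsqrt")) {
  f <- match.arg(f)
  switch(f,
         linear  = b0 + b1 * time,
         invsqrt = b0 + b1 / sqrt(time))
}

# Hypothetical helper: first time at which `summary` crosses the boundary.
# Assumption: strict inequality; if the boundary is never crossed, the
# trial is taken to run until the last available time point.
stopTime <- function(summary, time, b0, b1, f = "linear",
                     ineq = c("less", "greater")) {
  ineq <- match.arg(ineq)
  bound <- boundary(time, b0, b1, f)
  crossed <- if (ineq == "less") summary < bound else summary > bound
  if (any(crossed)) time[which(crossed)[1]] else max(time)
}

# Made-up summary values over three batches
s <- c(5, 3, 1)
stopTime(s, time = 1:3, b0 = 4, b1 = 0)                    # ineq = "less"
stopTime(s, time = 1:3, b0 = 4, b1 = 0, ineq = "greater")
```

With `ineq = "less"` a decreasing `summary` stops the trial once it drops under the boundary, whereas `ineq = "greater"` stops it as soon as `summary` exceeds the boundary.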

Intuitively, the goal is to stop collecting new data when the expected benefit of obtaining one more data batch is small, i.e. below a certain boundary. We consider two simple parametric forms for such a boundary (linear and inverse square root), which makes it easy to evaluate the expected utility of each boundary within a grid of parameter values. The optimal boundary is defined by the parameter values achieving the largest expected utility, restricted to parameter values with an estimated power greater than or equal to `powmin`. Here power is defined as the expected number of true discoveries divided by the expected number of differentially expressed entities.
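
As an illustration of the constrained choice of the optimal boundary, the following base R sketch picks the grid point with the largest expected utility among those meeting the power constraint. The utility and power values are made up, and the sketch does not address what happens when no grid point satisfies the constraint.

```r
# Grid of candidate boundaries with made-up expected utility and power
grid <- expand.grid(b0 = c(0, 1), b1 = c(-0.5, 0))
grid$u     <- c(3.0, 4.5, 5.0, 4.0)
grid$power <- c(0.6, 0.8, 0.4, 0.9)

powmin <- 0.5

# Keep only boundaries meeting the power constraint, then maximize utility
feasible <- grid[grid$power >= powmin, ]
opt <- feasible[which.max(feasible$u), ]
opt
```

Here the boundary with the highest unconstrained utility (`u = 5.0`) is excluded because its power is below `powmin`, so the selected boundary is the best of the remaining candidates.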

The routine evaluates the expected utility, as well as expected FDR, FNR, power and sample size for each specified boundary, and also reports the optimal boundary.
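
A minimal sketch of this kind of grid evaluation, written in base R on a toy forward simulation, is given below. It is not the package's implementation. The helper `evalBoundary` is hypothetical, only the linear boundary with `ineq = "less"` is covered, and the net utility is assumed to be `u` at the stopping time minus `samplingCost` times the number of batches, which may differ from how gaga accounts for the sampling cost.

```r
# Toy forward simulation: 2 simulated trials, 3 batches each (made-up numbers).
# Only the columns needed by this sketch are included.
forwsim <- data.frame(
  simid   = rep(1:2, each = 3),
  time    = rep(1:3, times = 2),
  u       = c(10, 14, 15, 8, 13, 16),
  summary = c(4, 1, 0.5, 5, 3, 1)
)

# Hypothetical helper: average net utility and stopping time for one (b0, b1).
evalBoundary <- function(b0, b1, forwsim, samplingCost) {
  perSim <- lapply(split(forwsim, forwsim$simid), function(d) {
    d <- d[order(d$time), ]
    crossed <- d$summary < b0 + b1 * d$time      # linear boundary, "less"
    # Assumption: trials that never cross stop at the last time point
    i <- if (any(crossed)) which(crossed)[1] else nrow(d)
    # Assumption: net utility = u at stopping minus cost of batches used
    c(u = d$u[i] - samplingCost * d$time[i], time = d$time[i])
  })
  colMeans(do.call(rbind, perSim))
}

# Evaluate every (b0, b1) combination and collect the results
grid <- expand.grid(b0 = c(2, 4), b1 = 0)
res <- t(mapply(evalBoundary, grid$b0, grid$b1,
                MoreArgs = list(forwsim = forwsim, samplingCost = 1)))
grid <- cbind(grid, res)
grid
```

Each row of the resulting `grid` holds one boundary together with its simulation-averaged utility and expected number of batches, which mirrors the structure described for the returned `grid` component.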

A list with two components:

`opt`	Vector with optimal stopping boundary (`b`), estimated expected utility (`u`), false discovery rate (`fdr`), false negative rate (`fnr`), power (`power`) and the expected sample size measured as the number of batches (`time`).
`grid`	`data.frame` with all evaluated boundaries (columns `b0` and `b1`) and their respective estimated expected utility, false discovery rate, false negative rate, power and expected sample size (measured as the number of batches).