pimaAnalysis: True discovery guarantee in multiverse analysis
In annavesely/sumSome: True Discovery Guarantee by Sum-Based Tests

pimaAnalysis

R Documentation

True discovery guarantee in multiverse analysis

Description

This function uses permutation statistics or p-values to provide a true discovery guarantee in multiverse analysis, where one or more parameters of interest are studied across a multiverse of models. It computes lower confidence bounds for the number of true discoveries and for the true discovery proportion, either overall or within groups. The bounds are simultaneous over all sets and remain valid under post-hoc selection.

Usage

pimaAnalysis(obj, by = NULL, type = "sum", r = 0, alpha = 0.05, ...)

Arguments

`obj`	an object of class `jointest`, as obtained from the functions `pima` (package pima) or `join_flipscores` (jointest).
`by`	name of the grouping element, either `Coeff` or `Model`. If not specified, all coefficients of interest in all models are considered together.
`type`	combining function: `sum` uses the sum of test statistics as in `sumStats`, while different p-value combinations are defined as in `sumPvals` (`edgington`, `fisher`, `pearson`, `liptak`, `cauchy`, `harmonic`, `vovk.wang`).
`r`	parameter for Vovk and Wang's p-value combination.
`alpha`	significance level.
`...`	further parameters of `sumStats` or `sumPvals` (truncation parameters and maximum number of iterations of the algorithm).
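
To make the `type` options more concrete, the sketch below evaluates a few p-value combinations in their textbook form on a hypothetical vector of p-values. The vector `p` and the variable names are illustrative assumptions, and the package may use a different sign or scaling internally; `sumStats` and `sumPvals` remain the reference for the definitions actually applied.

```r
# Hypothetical p-values for three coefficients (illustrative only)
p <- c(0.01, 0.20, 0.50)

# Textbook forms of some p-value combinations (not the package internals)
fisher    <- -2 * sum(log(p))    # Fisher: large values indicate evidence
liptak    <- sum(qnorm(1 - p))   # Liptak/Stouffer, unweighted and unscaled
edgington <- sum(p)              # Edgington: small values indicate evidence

print(c(fisher = fisher, liptak = liptak, edgington = edgington))
```

Each combination weighs small p-values differently, which is one reason the choice of `type` matters in practice.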

Details

With the default by = NULL, the procedure computes lower confidence bounds for the number and proportion of significant effects (non-null coefficients) among all coefficients considered. Other values of by return analogous bounds within each group, defined by coefficient ("Coeff") or by model ("Model"). While the bounds are simultaneous over all possible groupings, the combining function type should be fixed in advance, since simultaneity does not extend over different choices of type.

If no truncation parameters are passed through the further parameters (...), statistics/p-values are not truncated.

More generally, obj can be any list containing:

Tspace: data frame of statistics, where columns correspond to variables, and rows to data transformations (e.g. permutations). The first transformation is the identity.
summary_table: summary data frame where rows correspond to variables.

In this framework, the grouping element by is the name of a column of summary_table.
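
As a minimal sketch of this generic input, the code below builds such a list by hand from toy data. The random statistics, the column names, and the extra grouping column `Family` are hypothetical and serve only to show the expected layout: one column of Tspace per row of summary_table.

```r
set.seed(1)  # reproducible toy data

# Toy statistics: 100 transformations (rows) x 4 variables (columns).
# Row 1 plays the role of the identity transformation (observed data).
Tspace <- as.data.frame(matrix(rnorm(100 * 4), nrow = 100))
names(Tspace) <- c("X_mod1", "Z_mod1", "X_mod2", "Z_mod2")

# One row per variable, in the same order as the columns of Tspace
summary_table <- data.frame(
  Model  = rep(c("mod1", "mod2"), each = 2),
  Coeff  = rep(c("X", "Z"), times = 2),
  Family = c("A", "B", "B", "A")  # hypothetical extra grouping column
)

# Basic consistency check before passing the list on
stopifnot(ncol(Tspace) == nrow(summary_table))

obj <- list(Tspace = Tspace, summary_table = summary_table)

# With the sumSome package loaded, the list could then be analysed as:
# pimaAnalysis(obj, by = "Family")
```

Under the description above, grouping by `Family` should yield one summary row per level of that column.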

Value

pimaAnalysis returns a data frame containing a summary for each subset:

size: number of considered coefficients
TD: lower (1-alpha)-confidence bound for the number of significant effects
TDP: lower (1-alpha)-confidence bound for the proportion of significant effects
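
The sketch below shows how the three columns relate, assuming that TDP is the TD bound divided by size. The numbers are hypothetical and are not the result of any analysis; `max_nulls` is an illustrative name, not a column returned by pimaAnalysis.

```r
# Hypothetical output row (illustrative numbers, not computed from data)
res <- data.frame(size = 6, TD = 2)

# Proportion bound implied by the count bound
res$TDP <- res$TD / res$size

# Equivalent reading: with confidence 1 - alpha, at most this many
# of the considered coefficients are truly null
res$max_nulls <- res$size - res$TD

print(res)
```

Read this way, a row states that at least TD of the size coefficients in the subset are non-null, with confidence 1-alpha simultaneously over all subsets.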

References

Girardi P., Vesely A., Lakens D., Altoè G., Pastore M., Calcagnì A., and Finos L. (2024). Post-selection Inference in Multiverse Analysis (PIMA): An Inferential Framework Based on the Sign Flipping Score Test. Psychometrika, doi: 10.1007/s11336-024-09973-6.

Vesely A., Finos L., and Goeman J. J. (2023). Permutation-based true discovery guarantee by sum tests. Journal of the Royal Statistical Society, Series B (Statistical Methodology), doi: 10.1093/jrsssb/qkad019.

Examples

library(sumSome)

# generate matrix of statistics for 2 coefficients X and Z within 3 models
# (6 variables, 50 permutations)
G <- simData(prop = 0.6, m = 6, B = 50, alpha = 0.4, p = FALSE, seed = 42)
colnames(G) <- rep(c("X", "Z"), 3)

# summary table: one row per column of G
summary_table <- data.frame(
  Model = rep(c("mod1", "mod2", "mod3"), each = 2),
  Coeff = colnames(G)
)

# list of Tspace and summary_table
obj <- list(Tspace = as.data.frame(G), summary_table = summary_table)

# significant effects overall (sum of test statistics)
pimaAnalysis(obj, alpha = 0.4)

# significant effects by coefficient (sum of test statistics)
pimaAnalysis(obj, by = "Coeff", alpha = 0.4)

# significant effects by model (Fisher's combination of p-values)
pimaAnalysis(obj, by = "Model", type = "fisher", alpha = 0.4)

annavesely/sumSome documentation built on Jan. 28, 2025, 8:15 a.m.