benchmark_functions | R Documentation |
The 'benchmark' functions perform benchmarking for models using the Generalized Order-Restricted Information Criterion (Approximation) (GORIC(A)).
benchmark(object, model_type = c("asymp", "means"), ...)
benchmark_means(object, pop_es = NULL, ratio_pop_means = NULL,
group_size = NULL, alt_group_size = NULL,
quant = NULL, iter = 1000,
control = list(),
ncpus = 1, seed = NULL, ...)
benchmark_asymp(object, pop_est = NULL, sample_size = NULL,
alt_sample_size = NULL, quant = NULL, iter = 1000,
control = list(),
ncpus = 1, seed = NULL, ...)
## S3 method for class 'benchmark'
print(x, output_type = c("rgw", "gw", "rlw", "ld", "all"),
hypo_rate_threshold = 1, color = TRUE, ...)
## S3 method for class 'benchmark'
plot(x, output_type = c("rgw", "rlw", "gw", "ld"),
percentiles = NULL, x_lim = c(), log_scale = FALSE,
alpha = 0.50, nrow_grid = NULL, ncol_grid = 1,
distr_grid = FALSE, ...)
object |
An object of class |
model_type |
If "means", the model parameters reflect (adjusted) means, else
model_type = "asymp" (default). See details for more information about |
x |
An object of class |
pop_es |
A scalar or a vector of population Cohen's f (effect-size) values. By default, it benchmarks ES = 0 (no-effect) and the observed Cohen's f. |
pop_est |
A 1 x k vector or an n x k matrix of population estimates to benchmark. By default, two sets are benchmarked: all estimates set to zero (no-effect) and the observed estimates from the sample. |
ratio_pop_means |
A 1 x k vector denoting the relative difference
between the k group means. Note that a ratio of |
group_size |
If the GORICA object is based on estimates and their covariance matrix (instead of on a model/fit object), this should be a 1 x k vector or a scalar to denote the group sizes. If a scalar is specified, it is assumed that each group is of that size. |
alt_group_size |
A 1 x k vector or a scalar to denote alternative group sizes, if you want to use sizes different from those in the data. This can be used, for example, to see the values to which the GORIC(A) weights will converge (and thus to see the maximum value of the weights). If a scalar is specified, it is assumed that each group is of that size. By default, the group sizes from the data are used. |
sample_size |
A scalar to denote the (total) sample size. Only used if
the GORIC object is based on estimates and their covariance matrix (instead
of on a model/fit object) or |
alt_sample_size |
A scalar to denote an alternative sample size if you want to use a different sample size from the one in the data. This can be used, for example, to see the values to which the GORIC(A) weights will converge (and thus to see the maximum value of the weights). |
quant |
Quantiles for benchmarking results. Defaults to 5%, 35%, 50%, 65%, 95%. |
iter |
The number of iterations for benchmarking. Defaults to |
hypo_rate_threshold |
A numeric value specifying the threshold for the hypothesis rate. The function calculates the proportion of ratios of GORIC(A) weights that exceed this threshold. Defaults to 1. |
control |
A list of control parameters. For more information, see the details of goric. |
ncpus |
Number of CPUs to use for parallel processing. Defaults to |
seed |
A seed for random number generation. |
output_type |
A character vector specifying the type of output to print
or plot. Options are |
color |
If TRUE, the output will include ANSI color coding. Set |
alpha |
Alpha refers to the opacity of a geom. Values of alpha range from 0 to 1, with lower values corresponding to more transparent colors. |
nrow_grid |
An integer value representing the number of rows in the grid layout. |
ncol_grid |
An integer value representing the number of columns in the grid layout. |
distr_grid |
If TRUE, the facet_grid function is used to create a grid of separate plots for each effect-size (estimates). |
percentiles |
A numeric vector specifying the percentiles to be shown. By default
the percentiles are inherited from the quantiles used for benchmarking, see |
x_lim |
A numeric vector of length 2 specifying the x-axis limits. Defaults to |
log_scale |
Logical. If TRUE, the x-axis is transformed using a base-10 logarithmic scale. This transformation changes how the data are displayed on the x-axis, but does not alter the underlying data values. |
... |
See goric. |
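To make the quant and hypo_rate_threshold arguments concrete, the following is a minimal sketch of the two summaries they control. It does not call the package: the vector ratio_gw is a hypothetical stand-in for simulated ratios of GORIC(A) weights, and the variable names are chosen for illustration only.

```r
# Hypothetical simulated ratios of GORIC(A) weights (preferred hypothesis
# versus a competitor). These are illustrative values, not package output.
set.seed(123)
ratio_gw <- exp(rnorm(1000, mean = 1, sd = 1))

hypo_rate_threshold <- 1
quant <- c(0.05, 0.35, 0.50, 0.65, 0.95)

# Proportion of simulated ratios that exceed the threshold
hypo_rate <- mean(ratio_gw > hypo_rate_threshold)

# Quantiles of the simulated ratios at the requested probabilities
ratio_quantiles <- quantile(ratio_gw, probs = quant)

print(hypo_rate)
print(ratio_quantiles)
```

A ratio above 1 means the preferred hypothesis received more weight than its competitor in that iteration, so with the default threshold the rate is the share of iterations in which the preferred hypothesis came out ahead.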
The function benchmark_asymp
is named as such because it generates data from a
multivariate normal distribution with means equal to the population parameter
estimates and a covariance matrix derived from the original data. This is based
on the assumption that parameter estimates are asymptotically normally distributed.
This assumption is valid for many statistical models, including parameters from
a generalized linear model (GLM). In such models, as the sample size increases,
the distribution of the parameter estimates tends to a normal distribution,
allowing us to utilize the multivariate normal distribution for benchmarking.
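The sampling idea described above can be sketched in base R. This is an illustration of drawing from a multivariate normal distribution, not the package's internal code; pop_est and vcov_est are hypothetical values standing in for the population estimates and the covariance matrix derived from the data.

```r
set.seed(42)

# Hypothetical population estimates and covariance matrix (assumptions)
pop_est  <- c(group1 = 5.0, group2 = 5.5, group3 = 6.0)
vcov_est <- diag(c(0.10, 0.05, 0.03))
iter     <- 1000

k <- length(pop_est)

# Standard normal draws, one row per iteration
z <- matrix(rnorm(iter * k), nrow = iter, ncol = k)

# chol() returns an upper-triangular R with t(R) %*% R equal to vcov_est,
# so the rows of z %*% R have covariance vcov_est; sweep() adds the means.
draws <- sweep(z %*% chol(vcov_est), 2, pop_est, "+")
colnames(draws) <- names(pop_est)

# The column means should be close to pop_est
print(round(colMeans(draws), 2))
```

Each row of draws plays the role of one simulated set of parameter estimates, to which the hypotheses could then be applied.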
benchmark_means
benchmarks the group means of a given GORIC(A) object
by evaluating various population effect sizes and comparing the observed
group means against these benchmarks.
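As an illustration of how a population effect size and a ratio of means can jointly determine a set of group means, the sketch below uses the textbook definition of Cohen's f with group-size weights. The helper names cohens_f and scale_means_to_f are hypothetical, and the formula the package applies internally is not stated in this documentation, so treat this as an assumption.

```r
# Textbook Cohen's f: weighted standard deviation of the group means
# divided by the common within-group standard deviation (sigma).
cohens_f <- function(means, group_size, sigma = 1) {
  p <- group_size / sum(group_size)
  grand_mean <- sum(p * means)
  sqrt(sum(p * (means - grand_mean)^2)) / sigma
}

# Hypothetical helper: rescale a ratio of means so that the resulting
# population means have Cohen's f equal to pop_es.
scale_means_to_f <- function(ratio_pop_means, pop_es, group_size, sigma = 1) {
  f_ratio <- cohens_f(ratio_pop_means, group_size, sigma)
  # Equal ratios imply no differences between means; return zeros
  if (f_ratio == 0) return(rep(0, length(ratio_pop_means)))
  ratio_pop_means * pop_es / f_ratio
}

group_size <- c(10, 20, 30, 40)
ratio      <- c(1, 2, 3, 4)
means      <- scale_means_to_f(ratio, pop_es = 0.25, group_size = group_size)

# Equals the target pop_es up to floating-point error
print(cohens_f(means, group_size))
```

Because Cohen's f scales linearly when all means are multiplied by a positive constant, a single rescaling is enough to hit the target value.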
benchmark_asymp
benchmarks the population estimates of a given
GORIC(A) object by evaluating various population estimates and comparing them
against the observed estimates.
print.benchmark prints the results of benchmark analyses performed on
objects of class benchmark.
plot.benchmark generates density plots for benchmark analyses of objects
of class benchmark.
The benchmark function leverages the future package for parallel processing,
allowing users to speed up computations by distributing tasks across multiple
cores or machines. If the user does not specify a parallelization plan using
future::plan(), the package will choose an appropriate strategy based
on the user's operating system. Specifically, on Windows, the package defaults
to using multisession, which creates separate R sessions for each
parallel task. On Unix-like systems (such as Linux and macOS), the package
defaults to multicore, which uses forked R processes to avoid the
overhead of setting up separate R sessions.
To use a different strategy, the plan must be specified before running the benchmark function, e.g.,
future::plan(future::multisession, workers = ncpus)
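The operating-system rule and the explicit plan can be sketched as follows. The helper default_strategy is hypothetical and only mirrors the behaviour described above; it is not a function of the package. The second part assumes the future package is installed and is skipped otherwise.

```r
# Hypothetical helper mirroring the documented default: multisession on
# Windows, multicore on Unix-like systems.
default_strategy <- function(os_type = .Platform$OS.type) {
  if (identical(os_type, "windows")) "multisession" else "multicore"
}
print(default_strategy())

# Setting a plan explicitly and restoring the previous one afterwards.
ncpus <- 2
if (requireNamespace("future", quietly = TRUE)) {
  # plan() returns the previous plan (invisibly) when a new one is set
  old_plan <- future::plan(future::multisession, workers = ncpus)
  # ... call benchmark(...) here ...
  future::plan(old_plan)
}
```

Restoring the previous plan keeps the change local to the benchmark run, which is useful when other code in the same session relies on its own plan.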
benchmark_means and benchmark_asymp return a list of
class benchmark_means, benchmark, and list, or of class
benchmark_asymp, benchmark, and list, respectively, containing the
results of the benchmark analysis.
print.benchmark
does not return a value. It prints formatted benchmark
analysis results to the console.
plot.benchmark
returns a gtable object that can be displayed or further
customized using various functions from the gridExtra and grid packages. This
allows for flexible and detailed adjustments to the appearance and layout of the plot.
Leonard Vanbrabant and Rebecca Kuiper
set.seed(1234)
# Generate data for 4 groups with different group sizes
group1 <- rnorm(10, mean = 5, sd = 0.1)
group2 <- rnorm(20, mean = 5.5, sd = 1)
group3 <- rnorm(30, mean = 6, sd = 0.5)
group4 <- rnorm(40, mean = 6.5, sd = 0.8)
# Combine data into a data frame
data <- data.frame(
value = c(group1, group2, group3, group4),
group = factor(rep(1:4, times = c(10,20,30,40)))
)
# Perform ANOVA
anova_result <- aov(value ~ -1 + group, data = data)
# model/hypothesis
h1 <- 'group1 < group2 < group3 < group4'
h2 <- 'group1 > group2 < group3 < group4'
# fit h1 and h2 model against the unconstrained model (i.e., failsafe to avoid
# selecting a weak hypothesis)
fit_goric <- goric(anova_result, hypotheses = list(H1 = h1, H2 = h2),
comparison = "unconstrained", type = "goric")
# by default: ES = 0 & ES = observed ES
# In practice you want to increase the number of iterations (default = 1000).
# multisession also works on Windows machines
# future::plan(future::multisession, workers = ncpus)
benchmark_results_mean <- benchmark(fit_goric, iter = 10, model_type = "means")
print(benchmark_results_mean)
# by default the ratio of GORIC weights for the preferred hypothesis (here h1) is
# plotted against its competitors (i.e., h2 and the unconstrained). To improve
# the readability of the plot, the argument hypothesis_comparison can be used to
# focus on a specific competitor. Further readability can be achieved by setting
# the x_lim option.
plot(benchmark_results_mean, output_type = "rgw")
# specify custom effect-sizes
benchmark_results_mean_es <- benchmark(fit_goric, iter = 10,
pop_es = c(0, 0.1),
model_type = "means")
print(benchmark_results_mean_es)
# Benchmark asymptotic estimates
fit_gorica <- goric(anova_result, hypotheses = list(h1=h1),
comparison = "complement", type = "gorica")
# by default: no-effect & estimates from the sample are used
benchmark_results_asymp <- benchmark(fit_gorica, sample_size = 30, iter = 5,
model_type = "asymp")
print(benchmark_results_asymp)
# specify custom population estimates
my_pop_est <- rbind("no" = c(0,0,0,0), "observed"= coef(anova_result))
benchmark_results_asymp <- benchmark(fit_gorica, sample_size = 30,
iter = 5, pop_est = my_pop_est,
model_type = "asymp")
print(benchmark_results_asymp)
plot(benchmark_results_asymp, x_lim = c(0, 75))