options(width = 150) knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.align = "center", fig.height = 6, fig.width = 6, out.width = "75%" )
rsimsum
supports custom input values for the true value of the estimand and for confidence intervals limits (used to calculate coverage probability).
To illustrate this feature, we can use the tt
dataset (bundled with rsimsum
):
library(rsimsum) data("tt", package = "rsimsum") head(tt)
This includes the results of a simulation study assessing robustness of the t-test when estimating the difference between means.
The t-test assumes a t distribution, hence confidence intervals for the estimated mean are generally based on the t distribution.
See for instance the example from the t-test documentation (?t.test
):
t.test(extra ~ group, data = sleep)
We can incorporate custom confidence intervals by passing the name of two columns in data
as the ci.limits
argument:
s1 <- simsum(data = tt, estvarname = "diff", true = -1, se = "se", ci.limits = c("conf.low", "conf.high"), methodvar = "method", by = "dgm") summary(s1, stats = "cover")
By doing so, we can incorporate different types of confidence intervals in the analysis of Monte Carlo simulation studies. Compare with the default setting:
s2 <- simsum(data = tt, estvarname = "diff", true = -1, se = "se", methodvar = "method", by = "dgm") summary(s2, stats = "cover")
The ci.limits
is also useful when using non-symmetrical confidence intervals, e.g. when using bootstrapped confidence intervals.
A pair of values can also be passed to rsimsum
as the ci.limits
argument:
s3 <- simsum(data = tt, estvarname = "diff", true = -1, se = "se", ci.limits = c(-1.5, -0.5), methodvar = "method", by = "dgm") summary(s3, stats = "cover")
If you have a better example of the utility of this method please get in touch: I'd love to hear from you!
By default, simsum
will calculate confidence intervals using normal-theory, Wald-type intervals.
It is possible to use t-based critical values by providing a column for the (replication-specific) degrees of freedom (analogously as passing confidence bounds to ci.limits
):
s4 <- simsum(data = tt, estvarname = "diff", true = -1, se = "se", df = "df", methodvar = "method", by = "dgm")
Given that the confidence intervals in (conf.low
, conf.high
) are obtained by using critical values from a t distribution, the results of s4
will be equivalent to the results of s1
:
all.equal(tidy(s1), tidy(s4))
We can pass a column of values for true
as well:
tt$true <- -1 s5 <- simsum(data = tt, estvarname = "diff", true = "true", se = "se", ci.limits = c("conf.low", "conf.high"), methodvar = "method", by = "dgm") summary(s5, stats = "cover")
Compare with the default settings:
summary(s2, stats = "cover")
Finally, we could have multiple columns identifying methods as well.
This uses the MIsim
and MIsim2
datasets, which are bundled with {rsimsum}:
data("MIsim", package = "rsimsum") data("MIsim2", package = "rsimsum") head(MIsim) head(MIsim2)
The syntax when calling simsum()
is pretty much the same:
s6 <- simsum(data = MIsim, estvarname = "b", true = 0.50, se = "se", methodvar = "method") s7 <- simsum(data = MIsim2, estvarname = "b", true = 0.50, se = "se", methodvar = c("m1", "m2"))
See the inferred methods:
print(s6) print(s7)
And of course, the estimated performance measures are the same:
all.equal(tidy(s6)$est, tidy(s7)$est)
multisimsum
can be as flexible as simsum
.
Remember the default behaviour:
data("frailty", package = "rsimsum") ms1 <- multisimsum( data = frailty, par = "par", true = c(trt = -0.50, fv = 0.75), estvarname = "b", se = "se", methodvar = "model", by = "fv_dist" ) summary(ms1, stats = "bias")
In this example, we pass the true values of each estimand as the named vector c(trt = -0.50, fv = 0.75)
.
Say instead we stored the true value of each estimand as a column in our dataset:
frailty$true <- ifelse(frailty$par == "trt", -0.50, 0.75) head(frailty)
With this data structure, we can pass a string value to multisimsum
that will identify the true
column in our dataset:
ms2 <- multisimsum( data = frailty, par = "par", true = "true", estvarname = "b", se = "se", methodvar = "model", by = "fv_dist" ) summary(ms2, stats = "bias")
We can confirm that we obtain the same results with the two approaches:
identical(tidy(ms1), tidy(ms2))
This approach is particularly useful when the true value might vary across replications (e.g. when it depends on the simulated dataset).
Of course, it can be combined with custom confidence interval limits for coverage as well:
frailty$conf.low <- frailty$b - qt(1 - 0.05 / 2, df = 10) * frailty$se frailty$conf.high <- frailty$b + qt(1 - 0.05 / 2, df = 10) * frailty$se ms3 <- multisimsum( data = frailty, par = "par", true = "true", estvarname = "b", se = "se", methodvar = "model", by = "fv_dist", ci.limits = c("conf.low", "conf.high") ) summary(ms3, stats = "cover")
This will be completely different than before:
summary(ms2, stats = "cover")
Multiple columns identifying methods are supported with multisimsum()
as well; examples are omitted here, but it works analogously as with simsum()
.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.