interval_statistics: Interval statistics
In mosaicCore: Common Utilities for Other MOSAIC-Family Packages

coverage

R Documentation

Interval statistics

Description

Calculate coverage intervals and confidence intervals for the sample mean, median, sd, proportion, ... Typically, these will be used within df_stats(). For the mean, median, and sd, the variable x must be quantitative. For proportions, x can be anything; use the success argument to specify which value you want the proportion of. The default for success is TRUE when x is logical, or the first level returned by unique() for categorical or numerical variables.

Usage

coverage(x, level = 0.95, na.rm = TRUE)

ci.mean(x, level = 0.95, na.rm = TRUE)

ci.median(x, level = 0.9, na.rm = TRUE)

ci.sd(x, level = 0.95, na.rm = TRUE)

ci.prop(
  x,
  success = NULL,
  level = 0.95,
  method = c("Clopper-Pearson", "binom.test", "Score", "Wilson", "prop.test", "Wald",
    "Agresti-Coull", "Plus4")
)

Arguments

`x`	a variable.
`level`	a number between 0 and 1 specifying the confidence level for the interval. (Default: 0.95, except 0.9 for `ci.median()`; see Usage.)
`na.rm`	if `TRUE`, disregard missing data.
`success`	for proportions, the value of `x` whose proportion is to be calculated. Defaults to `TRUE` for logicals, or to the first level returned by `unique()` for categorical or numerical variables.
`method`	for `ci.prop()`, the method to use in calculating the confidence interval. See `mosaic::binom.test()` for details.

Details

Methods: ci.mean() uses the standard t confidence interval. ci.median() uses the normal approximation method. ci.sd() uses the chi-squared method. ci.prop() uses the binomial method.

In the usual situation where the mosaic package is available, ci.prop() uses mosaic::binom.test() internally, which provides several methods for the calculation. See the documentation for binom.test() for details about the available methods. Clopper-Pearson is the default method.

When used with df_stats(), the confidence interval is calculated for each group separately. For "pooled" confidence intervals, see methods such as lm() or glm().
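
As a rough illustration of the methods named above, the following sketch computes textbook versions of the intervals with base R only. It is not taken from the mosaicCore source: the variable names are hypothetical, the use of quantile() for the coverage interval is an assumption, and the package's own results may differ in detail (for example, in how missing values or quantile types are handled).

```r
# Textbook interval calculations in base R (a sketch, not mosaicCore's code).
x     <- mtcars$hp
level <- 0.95
alpha <- 1 - level
n     <- length(x)

# Coverage interval: the central `level` fraction of the observed values.
# Assumption: plain sample quantiles with R's default quantile type.
coverage_by_hand <- quantile(x, c(alpha / 2, 1 - alpha / 2), names = FALSE)

# Standard t confidence interval for the mean:
#   mean +/- t* x s / sqrt(n), with n - 1 degrees of freedom.
t_crit  <- qt(1 - alpha / 2, df = n - 1)
mean_ci <- mean(x) + c(-1, 1) * t_crit * sd(x) / sqrt(n)

# Chi-squared confidence interval for the standard deviation:
#   sqrt((n - 1) s^2 / chi-squared quantile); the upper quantile gives the lower limit.
sd_ci <- sqrt((n - 1) * var(x) / qchisq(c(1 - alpha / 2, alpha / 2), df = n - 1))

# Clopper-Pearson (exact binomial) interval for the proportion of
# 6-cylinder cars, via stats::binom.test().
prop_ci <- as.numeric(
  binom.test(sum(mtcars$cyl == 6), nrow(mtcars), conf.level = level)$conf.int
)

mean_ci
sd_ci
prop_ci
```

The t interval computed this way should agree with t.test(x)$conf.int, which offers a quick cross-check when comparing against ci.mean().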

Value

A named numerical vector with components lower and upper, and, in the case of ci.prop(), center. When used with df_stats(), these components are formed into a data frame.

Note

When using these functions with df_stats(), omit the x argument, which will be supplied automatically by df_stats(). See examples.

Examples

# The central 95% interval
df_stats(hp ~ cyl, data = mtcars, c95 = coverage(0.95))
# The confidence interval on the mean
df_stats(hp ~ cyl, data = mtcars, mean, ci.mean)
# What fraction of cars have 6 cylinders?
df_stats(mtcars, ~ cyl, six_cyl_prop = ci.prop(success = 6, level = 0.90))
# Use without `df_stats()` (rare)
ci.mean(mtcars$hp)

mosaicCore documentation built on Nov. 5, 2023, 9:06 a.m.