CoefVar | R Documentation |
Calculates the coefficient of variation and its confidence limits using various methods.
CoefVar(x, ...)
## S3 method for class 'lm'
CoefVar(x, unbiased = FALSE, na.rm = FALSE, ...)
## S3 method for class 'aov'
CoefVar(x, unbiased = FALSE, na.rm = FALSE, ...)
## Default S3 method:
CoefVar(x, weights = NULL, unbiased = FALSE,
na.rm = FALSE, ...)
CoefVarCI(K, n, conf.level = 0.95,
sides = c("two.sided", "left", "right"),
method = c("nct","vangel","mckay","verrill","naive"))
x |
a (non-empty) numeric vector of data values. |
weights |
a numerical vector of weights the same length as |
unbiased |
logical value determining, if a bias correction should be used (see. details). Default is FALSE. |
K |
the coefficient of variation as calculated by |
n |
the number of observations used for calculating the coefficient of variation. |
conf.level |
confidence level of the interval. Defaults to 0.95. |
sides |
a character string specifying the side of the confidence interval, must be one of |
method |
character string specifing the method to use for calculating the confidence intervals, can be one out of:
|
na.rm |
logical. Should missing values be removed? Defaults to FALSE. |
... |
further arguments (not used here). |
In order for the coefficient of variation to be an unbiased estimate of the true population value, the coefficient of variation is corrected as:
CV_{korr} = CV \cdot \left( 1 - \frac{1}{4\cdot(n-1)} + \frac{1}{n} \cdot CV^2 + \frac{1}{2 \cdot (n-1)^2} \right)
For determining
the confidence intervals
for the coefficient of variation a number of methods have been proposed. CoefVarCI()
currently supports five different methods.
The details for the methods are given in the specific references.
The "naive" method
is based on dividing the standard confidence limit for the standard deviation by the sample mean.
McKay's
approximation is asymptotically exact as n goes to infinity. McKay recommends this approximation only if the coefficient of variation is less than 0.33. Note that if the coefficient of variation is greater than 0.33, either the normality of the data is suspect or the probability of negative values in the data is non-neglible. In this case, McKay's approximation may not be valid. Also, it is generally recommended that the sample size should be at least 10 before using McKay's approximation.
Vangel's modified McKay method
is more accurate than the McKay in most cases, particilarly for small samples.. According to Vangel, the unmodified McKay is only more accurate when both the coefficient of variation and alpha are large. However, if the coefficient of variation is large, then this implies either that the data contains negative values or the data does not follow a normal distribution. In this case, neither the McKay or the modified McKay should be used.
In general, the Vangel's modified McKay method is recommended over the McKay method. It generally provides good approximations as long as the data is approximately normal and the coefficient of variation is less than 0.33. This is the default method.
See also: https://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/coefvacl.htm
nct
uses the noncentral t-distribution to calculate the confidence intervals. See Smithson (2003).
if no confidence intervals are requested:
the estimate as numeric value (without any name)
else a named numeric vector with 3 elements
est |
estimate |
lwr.ci |
lower confidence interval |
upr.ci |
upper confidence interval |
Andri Signorell <andri@signorell.net>,
Michael Smithson <michael.smithson@anu.edu.au> (noncentral-t)
McKay, A. T. (1932). Distribution of the coefficient of variation and the extended t distribution, Journal of the Royal Statistical Society, 95, 695–698.
Johnson, B. L., Welch, B. L. (1940). Applications of the non-central t-distribution. Biometrika, 31, 362–389.
Mark Vangel (1996) Confidence Intervals for a Normal Coefficient of Variation, American Statistician, Vol. 15, No. 1, pp. 21-26.
Kelley, K. (2007). Sample size planning for the coefcient of variation from the accuracy in parameter estimation approach. Behavior Research Methods, 39 (4), 755-766
Kelley, K. (2007). Constructing confidence intervals for standardized effect sizes: Theory, application, and implementation. Journal of Statistical Software, 20 (8), 1-24
Smithson, M.J. (2003) Confidence Intervals, Quantitative Applications in the Social Sciences Series, No. 140. Thousand Oaks, CA: Sage. pp. 39-41
Steve Verrill (2003) Confidence Bounds for Normal and Lognormal Distribution Coefficients of Variation, Research Paper 609, USDA Forest Products Laboratory, Madison, Wisconsin.
Verrill, S. and Johnson, R.A. (2007) Confidence Bounds and Hypothesis Tests for Normal Distribution Coefficients of Variation, Communications in Statistics Theory and Methods, Volume 36, No. 12, pp 2187-2206.
Mean
, SD
, (both supporting weights)
set.seed(15)
x <- runif(100)
CoefVar(x, conf.level=0.95)
# est low.ci upr.ci
# 0.5092566 0.4351644 0.6151409
# Coefficient of variation for a linear model
r.lm <- lm(Fertility ~ ., swiss)
CoefVar(r.lm)
# the function is vectorized, so arguments are recyled...
# https://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/coefvacl.htm
CoefVarCI(K = 0.00246, n = 195, method="vangel",
sides="two.sided", conf.level = c(.5,.8,.9,.95,.99,.999))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.