adjboxStats: Statistics for Skewness-adjusted Boxplots

Description

Computes the “statistics” for producing boxplots adjusted for skewed distributions, as proposed in Hubert and Vandervieren (2008); see adjbox.

Usage

adjboxStats(x, coef = 1.5, a = -4, b = 3, do.conf = TRUE, do.out = TRUE,
            ...)

Arguments

x

a numeric vector for which adjusted boxplot statistics are computed.

coef

number determining how far ‘whiskers’ extend out from the box, see boxplot.stats.

a, b

scaling factors multiplied by the medcouple mc() to determine outlier boundaries; see the references.

do.conf, do.out

logicals; if FALSE, the conf or out component respectively will be empty in the result.

...

further optional arguments to be passed to mc(), such as doReflect.

Details

Given the quartiles Q1, Q3, the interquartile range IQR := Q3 - Q1, the medcouple M := mc(x), and c := coef, the “fence” is defined, for M ≥ 0, as

[Q1 - c*exp(a * M)*IQR, Q3 + c*exp(b * M)*IQR],

and for M < 0 as

[Q1 - c*exp(-b * M)*IQR, Q3 + c*exp(-a * M)*IQR],

and all observations x outside the fence, the “potential outliers”, are returned in out.
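The two cases above can be sketched in a few lines of R. This is only an illustration of the formulas, not the package's own code: the helper name adj_fence is hypothetical, and the medcouple M is passed in directly (in practice it would come from mc(x)), so the sketch runs without robustbase.

```r
## Hypothetical helper, not part of robustbase: evaluates the fence
## formulas above for given quartiles q1, q3 and medcouple M.
adj_fence <- function(q1, q3, M, coef = 1.5, a = -4, b = 3) {
    iqr <- q3 - q1
    if (M >= 0) {
        c(q1 - coef * exp( a * M) * iqr, q3 + coef * exp( b * M) * iqr)
    } else {
        c(q1 - coef * exp(-b * M) * iqr, q3 + coef * exp(-a * M) * iqr)
    }
}

adj_fence(q1 = 2, q3 = 6, M = 0)    # M = 0: the classical fence, c(-4, 12)
adj_fence(q1 = 2, q3 = 6, M = 0.3)  # right skew: lower fence moves in, upper moves out
```

With M = 0 both exponential factors equal 1, so the fence reduces to the usual [Q1 - c*IQR, Q3 + c*IQR] of boxplot.stats. Reflecting the data (swapping and negating the quartiles, negating M) gives the mirrored fence, which is the invariance checked in the Examples.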

Note that a typo in robustbase versions up to 0.7-8 led, in the (rare, left-skewed) case where mc(x) < 0, to a “fence” not wide enough in the upper part, and hence fewer outliers there.

Value

A list with the components

stats

a vector of length 5, containing the extreme of the lower whisker, the lower hinge, the median, the upper hinge and the extreme of the upper whisker.

n

the number of observations

conf

the lower and upper extremes of the ‘notch’ (if(do.conf)). See boxplot.stats.

fence

length 2 vector of interval boundaries which define the non-outliers, and hence the whiskers of the plot.

out

the values of any data points which lie beyond the fence, and hence beyond the extremes of the whiskers.
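How fence, out and the whisker extremes in stats relate can be illustrated with a small sketch. The helper split_by_fence below is hypothetical and only mirrors the descriptions above; it assumes that the whisker extremes are the most extreme data points lying inside the fence, as for boxplot.stats.

```r
## Hypothetical helper, not part of robustbase: splits the data at a
## given fence into whisker extremes and potential outliers.
split_by_fence <- function(x, fence) {
    x <- x[!is.na(x)]                          # ignore missing values
    inside <- x >= fence[1] & x <= fence[2]    # non-outliers
    list(whiskers = range(x[inside]),          # stats[c(1, 5)] in the result
         out      = x[!inside])                # the 'out' component
}

x <- c(1, 2, 3, 4, 5, 30)
split_by_fence(x, fence = c(-2, 10))  # whiskers c(1, 5), out 30
```

The whiskers therefore stop at observed data values and usually lie strictly inside the fence, which itself need not coincide with any observation.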

Note

The code only slightly modifies the code of R's boxplot.stats.

Author(s)

R Core Development Team (boxplot.stats); adapted by Tobias Verbeke and Martin Maechler.

See Also

adjbox(), the function which mainly uses this one, also for references; further, boxplot.stats.

Examples

library(robustbase)  # provides adjboxStats(), mc() and the condroz data
data(condroz)
adjboxStats(ccA <- condroz[,"Ca"])
adjboxStats(ccA, doReflect = TRUE)  # small difference in fence

## Test reflection invariance [was not ok, up to and including robustbase_0.7-8]
a1 <- adjboxStats( ccA, doReflect = TRUE)
a2 <- adjboxStats(-ccA, doReflect = TRUE)

nm1 <- c("stats", "conf", "fence")
stopifnot(all.equal(       a1[nm1],
                    lapply(a2[nm1], function(u) rev(-u))),
          all.equal(a1[["out"]], -a2[["out"]]))

robustbase documentation built on April 3, 2022, 1:05 a.m.