svymean_winsorized: Weighted Winsorized Mean and Total

View source: R/svymean_winsorized.R

svymean_winsorized    R Documentation

Weighted Winsorized Mean and Total

Description

Weighted winsorized mean and total

Usage

svymean_winsorized(x, design, LB = 0.05, UB = 1 - LB, na.rm = FALSE,
    trim_var = FALSE)
svymean_k_winsorized(x, design, k, na.rm = FALSE, trim_var = FALSE)
svytotal_winsorized(x, design, LB = 0.05, UB = 1 - LB, na.rm = FALSE,
    trim_var = FALSE)
svytotal_k_winsorized(x, design, k, na.rm = FALSE, trim_var = FALSE)

Arguments

x

a one-sided [formula], e.g., ~myVariable.

design

an object of class survey.design; see svydesign.

LB

[double] lower bound of winsorization such that 0 ≤ LB < UB ≤ 1.

UB

[double] upper bound of winsorization such that 0 ≤ LB < UB ≤ 1.

na.rm

[logical] indicating whether NA values should be removed before the computation proceeds (default: FALSE).

trim_var

[logical] indicating whether the variance should be approximated by the variance estimator of the trimmed mean/total (default: FALSE).

k

[integer] number of observations to be winsorized at the top of the distribution.

Details

Package survey must be loaded in order to use the functions.

Characteristic.

Population mean or total. Let μ denote the estimated winsorized population mean; then, the estimated winsorized total is given by Nhat μ with Nhat = sum(w[i]), where summation is over all observations in the sample.

Modes of winsorization.

The amount of winsorization can be specified in relative or absolute terms:

  • Relative: By specifying LB and UB, the method winsorizes the LB · 100% smallest observations and the (1 - UB) · 100% largest observations.

  • Absolute: By specifying argument k in the functions with the "infix" _k_ in their name (e.g., svymean_k_winsorized), the largest k observations are winsorized, 0 < k < n, where n denotes the sample size. E.g., k = 2 implies that the largest and the second largest observations are winsorized.
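
The two modes can be illustrated with a minimal base R sketch. The helpers wquantile, winsorize_relative and winsorize_k are hypothetical names introduced here for illustration only; they are not part of robsurvey. In particular, the weighted quantile definition used below (smallest value whose cumulative weight share reaches the given probability) is an assumption and may differ from the one the package uses.

```r
# Hypothetical helper (not part of robsurvey): weighted quantile defined as
# the smallest x whose cumulative share of the weights reaches p.
wquantile <- function(x, w, p) {
    ord <- order(x)
    cum <- cumsum(w[ord]) / sum(w)
    x[ord][which(cum >= p)[1]]
}

# Relative mode: values below the LB-quantile are raised to it, values
# above the UB-quantile are lowered to it.
winsorize_relative <- function(x, w, LB = 0.05, UB = 1 - LB) {
    stopifnot(0 <= LB, LB < UB, UB <= 1)
    lo <- wquantile(x, w, LB)
    hi <- wquantile(x, w, UB)
    pmin(pmax(x, lo), hi)
}

# Absolute mode: the k largest values are replaced by the (k + 1)-th largest.
winsorize_k <- function(x, k) {
    n <- length(x)
    stopifnot(k > 0, k < n)
    cutoff <- sort(x, decreasing = TRUE)[k + 1]
    pmin(x, cutoff)
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)   # toy data with one outlier
w <- rep(2, length(x))                   # toy sampling weights

xr <- winsorize_relative(x, w, LB = 0.15)
xk <- winsorize_k(x, k = 2)

# Winsorized mean (relative mode)
weighted.mean(xr, w)             # 5.5
# Winsorized total (absolute mode): Nhat times the winsorized mean
sum(w) * weighted.mean(xk, w)    # 104
```

For real survey data, the package functions should be preferred, since they also provide variance estimates that account for the sampling design.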

Variance estimation.

Large-sample approximation based on the influence function; see Huber and Ronchetti (2009, Chap. 3.3) and Shao (1994). Two estimators are available:

trim_var = FALSE

Variance estimator of the winsorized mean/total. The estimator depends on the estimated probability density function evaluated at the winsorization thresholds, which can be numerically unstable in some settings. As a remedy, a simplified variance estimator is available by setting trim_var = TRUE.

trim_var = TRUE

Variance is approximated using the variance estimator of the trimmed mean/total.

Utility functions.

summary, coef, SE, vcov, residuals, fitted and robweights.
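
As a usage sketch, the two variance estimators can be compared with some of the accessors listed above. It assumes that the packages survey and robsurvey are installed and uses the workplace data and design from the Examples section; only summary, coef, SE and vcov are exercised here.

```r
library(survey)
library(robsurvey)
data(workplace, package = "robsurvey")

# Stratified design, as in the Examples section
dn <- svydesign(ids = ~ID, strata = ~strat, fpc = ~fpc, weights = ~weight,
    data = workplace)

# Default variance estimator of the winsorized mean
m_default <- svymean_winsorized(~employment, dn, LB = 0.05)
# Variance approximated by the estimator of the trimmed mean
m_trim <- svymean_winsorized(~employment, dn, LB = 0.05, trim_var = TRUE)

# The point estimate does not depend on trim_var
coef(m_default)
coef(m_trim)

# The standard errors may differ between the two estimators
SE(m_default)
SE(m_trim)

# Estimated variance and a printed summary of the estimate
vcov(m_default)
summary(m_default)
```

If the two standard errors differ markedly, the density estimate at the winsorization thresholds is a plausible source of the discrepancy, as described under variance estimation.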

Bare-bone functions.

See:

  • weighted_mean_winsorized,

  • weighted_mean_k_winsorized,

  • weighted_total_winsorized,

  • weighted_total_k_winsorized.

Value

Object of class svystat_rob

References

Huber, P. J. and Ronchetti, E. (2009). Robust Statistics, New York: John Wiley and Sons, 2nd edition. doi: 10.1002/9780470434697

Shao, J. (1994). L-Statistics in Complex Survey Problems. The Annals of Statistics 22, 946–967. doi: 10.1214/aos/1176325505

See Also

Overview (of all implemented functions)

weighted_mean_winsorized, weighted_mean_k_winsorized, weighted_total_winsorized and weighted_total_k_winsorized

Examples

library(survey)
library(robsurvey)
data(workplace)

# Survey design for stratified simple random sampling without replacement
dn <- svydesign(ids = ~ID, strata = ~strat, fpc = ~fpc, weights = ~weight,
    data = workplace)

# Estimated winsorized population mean (5% symmetric winsorization)
svymean_winsorized(~employment, dn, LB = 0.05)

# Estimated one-sided k winsorized population total (2 observations are
# winsorized at the top of the distribution)
svytotal_k_winsorized(~employment, dn, k = 2)

robsurvey documentation built on Jan. 6, 2023, 5:09 p.m.