
robsurvey

The robsurvey package provides functions for computing robust survey statistics. It supports robust estimation of means, totals, and ratios. The available methods are Huber M-estimators, trimming, and winsorization. robsurvey complements the widely used survey package.

Overview

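Functions provided in robsurvey include svymean_huber(), svymean_trimmed(), svymean_winsorized(), and weighted_mean_huber(); each is demonstrated in the example below.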

Installation

You can install robsurvey from CRAN or from GitHub:

# install the latest version from CRAN
install.packages("robsurvey")

# or, you can install the latest development version from GitHub
devtools::install_github("martinSter/robsurvey")

Example

The following example shows a typical use of robsurvey. The data come from the survey package and describe student performance in California schools. We show several ways to compute a robust mean of the Academic Performance Index (API) in 2000, stored in the variable api00. The first code chunk loads the data and defines the survey design using the survey package.

# load and attach packages
library(robsurvey)
library(survey)

# load the api dataset
data(api)

# define survey design
dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw, 
                    data = apistrat, fpc = ~fpc)

In the following code chunk we first compute the robust Horvitz-Thompson M-estimator of the mean. We then compute the trimmed and the winsorized (robust) Horvitz-Thompson means. Note how the estimates and the corresponding standard errors vary across methods. The scale estimate used in the Huber-type robust M-estimator is the MAD, rescaled to be consistent at the normal distribution, i.e. multiplied by the constant 1.4826. The default tuning constant of the Huber-type Horvitz-Thompson M-estimator is k = 1.5; the calls below pass k = 2 explicitly.

# compute the robust Horvitz-Thompson M-estimator of the mean
svymean_huber(~api00, dstrat, k = 2)
#>          mean    SE
#> api00 662.907 8.926

# compute the robust trimmed Horvitz-Thompson mean
svymean_trimmed(~api00, dstrat, k = 2)
#>          mean     SE
#> api00 655.362 11.568

# compute the robust winsorized Horvitz-Thompson mean
svymean_winsorized(~api00, dstrat, k = 2)
#>          mean     SE
#> api00 640.599 11.568
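
To illustrate how trimming and winsorization differ, here is a minimal base-R sketch on toy data. The helpers wquantile(), wmean_trimmed(), and wmean_winsorized() and the bounds lo and hi are hypothetical names introduced only for this illustration; they are not robsurvey functions, and the package's own definitions (including how its arguments control the bounds) may differ.

```r
# Hypothetical helper (not part of robsurvey): smallest value of x whose
# cumulative share of the total weight reaches p
wquantile <- function(x, w, p) {
  ord <- order(x)
  cum_share <- cumsum(w[ord]) / sum(w)
  x[ord][which(cum_share >= p)[1]]
}

# Trimming: drop observations outside the weighted quantile bounds
wmean_trimmed <- function(x, w, lo = 0.05, hi = 0.85) {
  keep <- x >= wquantile(x, w, lo) & x <= wquantile(x, w, hi)
  sum(w[keep] * x[keep]) / sum(w[keep])
}

# Winsorization: keep all observations, but pull extremes back to the bounds
wmean_winsorized <- function(x, w, lo = 0.05, hi = 0.85) {
  x_clamped <- pmin(pmax(x, wquantile(x, w, lo)), wquantile(x, w, hi))
  sum(w * x_clamped) / sum(w)
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)  # toy data with one gross outlier
w <- c(rep(1, 5), rep(2, 5))            # toy sampling weights

plain      <- sum(w * x) / sum(w)       # ordinary weighted mean
trimmed    <- wmean_trimmed(x, w)       # outlier removed
winsorized <- wmean_winsorized(x, w)    # outlier replaced by the upper bound
```

In this toy setting the ordinary weighted mean is dominated by the outlier, the trimmed mean discards it together with its weight, and the winsorized mean keeps its weight but caps its value at the upper bound.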

It is also possible to use svymean_huber() in combination with svyby() from the survey package. The variable stype denotes the school level: elementary, middle, and high school.

# Domain estimates
svyby(~api00, by = ~stype, design = dstrat, svymean_huber, k = 1.34)
#>   stype    api00       se
#> E     E 675.8203  9.44767
#> H     H 629.1850 10.88119
#> M     M 635.1765 12.63996

For simulations, or when only an intermediate result is needed, bare-bone versions of the above functions can be used without the survey package. They take plain vectors of observations and weights and return only the bare estimate.

# bare-bone function to compute robust Horvitz-Thompson M-estimator of the mean
weighted_mean_huber(apistrat$api00, weights(dstrat), k = 2)
#> [1] 662.9068
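
To give a feel for what a weighted Huber M-estimate of the mean involves, the following base-R sketch uses iteratively reweighted least squares with a fixed MAD scale rescaled by 1.4826, as described above. It is an illustration under simplifying assumptions, not the robsurvey implementation: wmedian() and huber_wmean() are hypothetical helpers, and the package may use a different algorithm, starting value, or scale update.

```r
# Hypothetical helper (not part of robsurvey): weighted median
wmedian <- function(x, w) {
  ord <- order(x)
  x[ord][which(cumsum(w[ord]) / sum(w) >= 0.5)[1]]
}

# Hypothetical sketch of a weighted Huber M-estimator of location
huber_wmean <- function(x, w, k = 1.5, tol = 1e-8, maxit = 50) {
  mu <- wmedian(x, w)                     # robust starting value
  s  <- 1.4826 * wmedian(abs(x - mu), w)  # MAD, consistent at the normal
  if (s == 0) return(mu)                  # degenerate scale: nothing to iterate
  mu_new <- mu
  for (i in seq_len(maxit)) {
    r <- abs(x - mu) / s                  # standardized absolute residuals
    u <- pmin(1, k / r)                   # Huber weights: 1 if r <= k, else k / r
    mu_new <- sum(w * u * x) / sum(w * u) # reweighted mean
    if (abs(mu_new - mu) < tol * s) break
    mu <- mu_new
  }
  mu_new
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)    # toy data with one gross outlier
w <- rep(2, 10)                           # toy sampling weights

est <- huber_wmean(x, w, k = 1.5)         # close to 5.5 for these data
```

A small k downweights more observations and pulls the estimate toward the median; as k grows, every Huber weight becomes 1 and the estimate reduces to the ordinary weighted mean (14.5 for these toy data).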

Acknowledgement

The implementation of this R package was supported by the Hasler Foundation.



martinSter/robsurvey documentation built on Oct. 11, 2019, 4:45 p.m.