# define_statistic_wrapper: Define a statistic wrapper in gustave: A User-Oriented Statistical Toolkit for Analytical Variance Estimation


## Define a statistic wrapper

### Description

`define_statistic_wrapper` defines statistic wrappers to be used together with `variance estimation wrappers`. A statistic wrapper produces both the point estimator and the linearized variable associated with the statistic whose variance is to be estimated (Deville, 1999). `define_statistic_wrapper` is intended for advanced use only: standard statistic wrappers are already included in the gustave package (see `standard statistic wrappers`).

### Usage

```
define_statistic_wrapper(
  statistic_function,
  arg_type,
  arg_not_affected_by_domain = NULL,
  display_function = standard_display
)
```

### Arguments

• `statistic_function`: An R function specific to the statistic to calculate. It should produce at least the point estimator and the linearized variable associated with the statistic (see Details).

• `arg_type`: A named list with three character vectors describing the type of each argument of `statistic_function` (see Details).

• `arg_not_affected_by_domain`: A character vector indicating the arguments which should not be affected by domain-splitting. Such parameters may appear in some complex linearization formulae, for instance when the At-Risk of Poverty Rate (ARPR) is estimated by region but with a poverty line calculated at the national level.

• `display_function`: An R function which produces, for each variance estimation, the data.frame to be displayed by the variance estimation wrapper. The default display function (`standard_display`) uses standard metadata to display the usual variance indicators (point estimate, variance, standard deviation, coefficient of variation, confidence interval) broken down by statistic wrapper, domain (if any) and level (if the variable is a factor).

### Details

When the statistic to estimate is not a total, the application of analytical variance estimation formulae developed for the estimator of a total is not straightforward (Deville, 1999). An asymptotically unbiased variance estimator can nonetheless be obtained if the estimation of variance is performed on a variable obtained from the original data through a linearization step.
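As an illustration of the linearization step, consider the weighted mean (the notation below is an assumption introduced here: $s$ is the sample, $w_k$ the weight and $y_k$ the value of unit $k$). The point estimator, its linearized variable $z_k$, and the resulting variance approximation can be written as:

$$
\hat{\bar{Y}} = \frac{\sum_{k \in s} w_k y_k}{\sum_{k \in s} w_k},
\qquad
z_k = \frac{y_k - \hat{\bar{Y}}}{\sum_{l \in s} w_l},
\qquad
\widehat{V}\big(\hat{\bar{Y}}\big) \approx \widehat{V}\Big(\sum_{k \in s} w_k z_k\Big)
$$

In other words, the variance formula designed for a total is applied to the total of $z_k$ rather than to the total of $y_k$.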

`define_statistic_wrapper` is the function used to create, for a given statistic, an easy-to-use function which calculates both the point estimator and the linearized variable associated with the statistic. These operations are implemented by the `statistic_function`, which can have any needed input (for example `num` and `denom` for a ratio estimator) and should output a list with at least two named elements:

• `point`: the point estimator of the statistic

• `lin`: the linearized variable to be passed on to the variance estimation formula. If several variables are to be associated with the statistics, `lin` can be a list itself.

All other named elements in the output of `statistic_function` are treated as metadata (that may be used later on by `display_function`).
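The expected output can be sketched with a plain R function for a ratio of two weighted totals. This is a minimal illustration in base R, not necessarily the exact function shipped with gustave: the name `ratio_stat` and the toy vectors are hypothetical, and the `metadata` element mirrors the shape used in the Examples section of this page.

```r
# Hypothetical statistic function for a ratio of two weighted totals.
# It returns the point estimator, the linearized variable and some metadata.
ratio_stat <- function(num, denom, weight) {
  est_num <- sum(num * weight)      # estimated total of the numerator
  est_denom <- sum(denom * weight)  # estimated total of the denominator
  point <- est_num / est_denom
  # Linearized variable of a ratio: one value per observation
  lin <- (num - point * denom) / est_denom
  list(point = point, lin = lin, metadata = list(n = length(num)))
}

# Toy data (made up for the illustration)
num <- c(2, 4, 6, 8)
denom <- c(1, 2, 2, 5)
weight <- c(10, 10, 20, 20)

res <- ratio_stat(num, denom, weight)
res$point               # point estimate of the ratio
length(res$lin)         # same length as the input data
sum(weight * res$lin)   # the weighted total of lin is zero for a ratio
```

A function of this shape could then be passed as `statistic_function`, with `num` and `denom` declared as data arguments and `weight` as the weight argument.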

`arg_type` is a named list of three elements that describes the nature of each argument of `statistic_function`:

• `data`: data argument(s), numerical vector(s) to be used to calculate the point estimate and the linearized variable associated with the statistic

• `weight`: weight argument, numerical vector to be used as row weights

• `param`: parameters, non-data arguments to be used to control some aspect of the computation
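The three kinds of arguments can be illustrated with a small base R sketch. The function `share_below` and its `threshold` argument are hypothetical names chosen for this example, and the threshold is assumed to be a fixed, known constant (an estimated threshold would require an additional term in the linearized variable). The last lines simply check that every argument of the function is described exactly once in `arg_type`.

```r
# Hypothetical statistic function: weighted share of units below a fixed threshold.
share_below <- function(y, weight, threshold) {
  below <- as.numeric(y < threshold)        # indicator variable
  point <- sum(below * weight) / sum(weight)
  lin <- (below - point) / sum(weight)      # linearized variable of a mean
  list(point = point, lin = lin, metadata = list(n = length(y)))
}

# Description of the arguments, following the data / weight / param split
arg_type <- list(data = "y", weight = "weight", param = "threshold")

# Sanity check: arg_type should cover all formals of the function, and only them
all_args <- unlist(arg_type, use.names = FALSE)
covered <- setequal(all_args, names(formals(share_below)))
covered
```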

Statistic wrappers are flexible tools for applying a variance function to an estimator that requires a linearization step (i.e. any estimator other than the estimator of a total), with virtually no additional complexity for the end-user.

`standard statistic wrappers` are included within the gustave package and automatically added to the variance estimation wrappers. New statistic wrappers can be defined using `define_statistic_wrapper` and then explicitly added to the variance estimation wrappers using the `objects_to_include` argument.

Note: To some extent, statistic wrappers can be compared to ggplot2 `geom_` and `stat_` functions: they help end-users write down what they want without having to go too deep into the details of the corresponding layers.

### Value

A function to be used within a variance estimation wrapper to estimate a specific statistic (see examples). Its formals are the ones of `statistic_function` with the addition of `by` and `where` (for domain estimation, set to `NULL` by default).

### Author(s)

Martin Chevalier

### References

Deville J.-C. (1999), "Variance estimation for complex statistics and estimators: linearization and residual techniques", Survey Methodology, 25:193–203

### See Also

`standard statistic wrappers`, `define_variance_wrapper`

### Examples

```
### Example from the Information and communication technologies (ICT) survey

library(gustave)

# Define a variance wrapper for the ICT survey,
# as in the examples of the qvar function:
precision_ict <- qvar(
  data = ict_sample,
  dissemination_dummy = "dissemination",
  dissemination_weight = "w_calib",
  id = "firm_id",
  scope_dummy = "scope",
  sampling_weight = "w_sample",
  strata = "strata",
  nrc_weight = "w_nrc",
  response_dummy = "resp",
  hrg = "hrg",
  calibration_weight = "w_calib",
  calibration_var = c(paste0("N_", 58:63), paste0("turnover_", 58:63)),
  define = TRUE
)
precision_ict(ict_survey, mean(speed_quanti))

# Redefine the mean statistic wrapper
mean2 <- define_statistic_wrapper(
  statistic_function = function(y, weight){
    point <- sum(y * weight) / sum(weight)  # weighted mean
    lin <- (y - point) / sum(weight)        # linearized variable of the mean
    list(point = point, lin = lin, metadata = list(n = length(y)))
  },
  arg_type = list(data = "y", weight = "weight")
)

# mean2 can now be used inside precision_ict (and yields
# the same results as the mean statistic wrapper)
precision_ict(ict_survey, mean(speed_quanti), mean2(speed_quanti))
```

gustave documentation built on Sept. 19, 2022, 9:06 a.m.