direct: Direct Estimation of Disaggregated Indicators
In SoerenPannier/emdi: Estimating and Mapping Disaggregated Indicators

Description

The function `direct` estimates indicators based only on sample information. The variance is estimated via a naive or calibrated bootstrap. The estimation is adapted from the estimation of direct indicators in the package laeken.
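To make the idea of a naive bootstrap variance concrete, the following base R sketch resamples the observations of a single domain with replacement and takes the variance of the recomputed indicator. It is an illustration under simplifying assumptions, not the code used by emdi or laeken: `naive_boot_var` is a hypothetical helper, the toy data are made up, and strata and calibration are ignored.

```r
# Hypothetical helper (not part of emdi): naive bootstrap variance of an
# indicator for a single domain.
naive_boot_var <- function(y, weights, indicator, B = 50, seed = 123) {
  set.seed(seed)
  n <- length(y)
  # Recompute the indicator on B resamples drawn with replacement
  reps <- vapply(seq_len(B), function(b) {
    idx <- sample.int(n, size = n, replace = TRUE)
    indicator(y[idx], weights[idx])
  }, numeric(1))
  # The bootstrap variance is the variance across the B replicates
  var(reps)
}

# Toy data for one domain (made up for illustration)
y <- c(12000, 8500, 23000, 15500, 9800, 31000, 11200, 17400)
w <- c(1.2, 0.8, 1.0, 1.5, 0.9, 1.1, 1.3, 0.7)

# Bootstrap variance of the weighted mean
v <- naive_boot_var(y, w, indicator = function(y, w) weighted.mean(y, w))
v
```

Fixing the seed makes the result reproducible, which mirrors the role of the `seed` argument of `direct`.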

Usage

direct(
  y,
  smp_data,
  smp_domains,
  weights = NULL,
  design = NULL,
  threshold = NULL,
  var = FALSE,
  boot_type = "naive",
  B = 50,
  seed = 123,
  X_calib = NULL,
  totals = NULL,
  custom_indicator = NULL,
  na.rm = FALSE
)

Arguments

`y`	a character string indicating the variable that is used for estimating the indicators. The variable must be contained in the sample data.
`smp_data`	survey data containing the variable `y` as well as the sampling domains and, if selected, the weights.
`smp_domains`	a character string containing the name of the variable that indicates the domains in the sample data. The variable must be numeric or a factor.
`weights`	a character string containing the name of a variable for the sampling weights in the sample data. This argument is optional and defaults to `NULL`.
`design`	a character string containing the name of a variable for different strata for stratified sampling designs. This argument is optional and defaults to `NULL`.
`threshold`	a number defining a threshold. Alternatively, a threshold may be defined as a `function` of `y` and `weights` returning a numeric value. Such a function is evaluated once for the point estimation and in each iteration of the bootstrap. See Example 2 for using a function as threshold. A threshold is needed to calculate, for example, head count ratios and poverty gaps. The argument defaults to `NULL`. In this case, the threshold is set to 60% of the median of the variable selected as `y`, similar to the at-risk-of-poverty rate used in the EU (see also Social Protection Committee 2001). However, any desired threshold can be chosen.
`var`	if `TRUE`, estimates for the variance are calculated using a naive or calibrated bootstrap. Defaults to `FALSE`.
`boot_type`	a character string containing the name of the bootstrap specification. Either a `"naive"` or a `"calibrate"` bootstrap can be used. See also `bootVar`. Defaults to `"naive"`.
`B`	a number determining the number of bootstrap populations for the bootstrap variance. Defaults to `50`.
`seed`	an integer to set the seed for the random number generator. Random number generation is used in the bootstrap approach. If `seed` is set to `NULL`, the seed is chosen randomly. Defaults to `123`.
`X_calib`	a numeric matrix including calibration variables if the calibrated bootstrap is chosen. Defaults to `NULL`.
`totals`	a numeric vector providing the population totals if the calibrated bootstrap is chosen. If a vector is supplied, its length needs to equal the number of columns in `X_calib`. Defaults to `NULL`. In this case, the sampling weights are used to calculate the totals.
`custom_indicator`	a list of functions containing the indicators to be calculated additionally. Each function must depend on the target variable `y` and may optionally depend on `weights` and the `threshold` (numeric value); it must not depend on anything else (see Example 3). Defaults to `NULL`.
`na.rm`	if `TRUE`, observations with `NA` values are deleted from the sample data. Defaults to `FALSE`.
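The function signatures described for `threshold` and `custom_indicator` can be sketched in base R as follows. This is only an illustration of the argument shapes: `weighted_median`, `relative_threshold` and `share_below` are hypothetical names, the weighted median is a simple stand-in for `laeken::weightedMedian` and may differ from it in how ties and interpolation are handled, and the toy data are made up.

```r
# Hypothetical helper: a simple weighted median (smallest value whose
# cumulative weight share reaches 50%)
weighted_median <- function(y, weights) {
  ord <- order(y)
  cum_share <- cumsum(weights[ord]) / sum(weights)
  y[ord][which(cum_share >= 0.5)[1]]
}

# A threshold given as a function of `y` and `weights`,
# returning a single numeric value
relative_threshold <- function(y, weights) {
  0.6 * weighted_median(y, weights)
}

# A custom indicator that depends on `y`, `weights` and a numeric `threshold`
share_below <- function(y, weights, threshold) {
  sum(weights[y < threshold]) / sum(weights)
}

# Toy data (made up for illustration)
y <- c(100, 200, 300, 400, 500)
w <- c(1, 1, 1, 1, 1)

thr <- relative_threshold(y, w)
share <- share_below(y, w, thr)
```

Functions of this shape could then be passed as `threshold = relative_threshold` and `custom_indicator = list(share_below = share_below)`, assuming the package passes `weights` and `threshold` to custom indicators as described above.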

Details

The set of predefined indicators includes the mean, median, four further quantiles (10%, 25%, 75% and 90%), head count ratio, poverty gap, Gini coefficient and the quintile share ratio.
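As an illustration of two of these indicators, the following base R sketch computes a weighted head count ratio and a weighted poverty gap using the common definitions (share of units below the threshold, and mean relative shortfall from the threshold). The helper names `head_count` and `poverty_gap` are hypothetical, the data are made up, and the package's internal implementation may differ in details such as the treatment of values exactly at the threshold.

```r
# Weighted head count ratio: weighted share of units below the threshold
head_count <- function(y, weights, threshold) {
  sum(weights * (y < threshold)) / sum(weights)
}

# Weighted poverty gap: weighted mean of the relative shortfall,
# which is zero for units at or above the threshold
poverty_gap <- function(y, weights, threshold) {
  shortfall <- pmax(threshold - y, 0) / threshold
  sum(weights * shortfall) / sum(weights)
}

# Toy data (made up for illustration)
y <- c(50, 150, 250, 350)
w <- c(2, 1, 1, 1)

hcr <- head_count(y, w, threshold = 200)
pgap <- poverty_gap(y, w, threshold = 200)
```

With these toy values, three of the five weighted units fall below the threshold, so the head count ratio is 0.6, and the weighted mean shortfall is (2 * 0.75 + 1 * 0.25) / 5 = 0.35.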

Value

An object of class "direct", "emdi" that provides direct estimators for regional disaggregated indicators and optionally corresponding variance estimates. Several generic functions have methods for the returned object. For a full list and descriptions of the components of objects of class "emdi", see emdiObject.

References

Alfons, A. and Templ, M. (2013). Estimation of Social Exclusion Indicators from Complex Surveys: The R Package laeken. Journal of Statistical Software, 54(15), 1-25.

Social Protection Committee (2001). Report on Indicators in the Field of Poverty and Social Exclusion, Technical Report, European Union.

Examples


# Loading the package and the sample data
library(emdi)
data("eusilcA_smp")

# Example 1: With weights and naive bootstrap
emdi_direct <- direct(
  y = "eqIncome", smp_data = eusilcA_smp,
  smp_domains = "district", weights = "weight", threshold = 11064.82,
  var = TRUE, boot_type = "naive", B = 50, seed = 123, X_calib = NULL,
  totals = NULL, na.rm = TRUE
)

# Example 2: With function as threshold
emdi_direct <- direct(
  y = "eqIncome", smp_data = eusilcA_smp,
  smp_domains = "district", weights = "weight", threshold =
    function(y, weights) {
      0.6 * laeken::weightedMedian(y, weights)
    }, na.rm = TRUE
)

# Example 3: With custom indicators
emdi_direct <- direct(
  y = "eqIncome", smp_data = eusilcA_smp,
  smp_domains = "district", weights = "weight", threshold = 10859.24,
  var = TRUE, boot_type = "naive", B = 50, seed = 123, X_calib = NULL,
  totals = NULL, custom_indicator = list(my_max = function(y) {
    max(y)
  }, my_min = function(y) {
    min(y)
  }),
  na.rm = TRUE
)

SoerenPannier/emdi documentation built on Nov. 2, 2023, 7:54 p.m.