direct: Direct Estimation of Disaggregated Indicators

View source: R/direct_estimation.R

direct    R Documentation

Description

The function direct estimates indicators based on sample information only. The variance is estimated via a naive or calibrated bootstrap. The estimation is adapted from the estimation of direct indicators in the package laeken.

Usage

direct(
  y,
  smp_data,
  smp_domains,
  weights = NULL,
  design = NULL,
  threshold = NULL,
  var = FALSE,
  boot_type = "naive",
  B = 50,
  seed = 123,
  X_calib = NULL,
  totals = NULL,
  custom_indicator = NULL,
  na.rm = FALSE
)

Arguments

y

a character string indicating the variable that is used for estimating the indicators. The variable must be contained in the sample data.

smp_data

survey data containing variable y as well as sampling domains, and weights if selected.

smp_domains

a character string containing the name of a variable that indicates domains in the sample data. The variable must be numeric or a factor.

weights

a character string containing the name of a variable for the sampling weights in the sample data. This argument is optional and defaults to NULL.

design

a character string containing the name of a variable for different strata for stratified sampling designs. This argument is optional and defaults to NULL.

threshold

a number defining a threshold. Alternatively, a threshold may be defined as a function of y and weights returning a numeric value. Such a function is evaluated once for the point estimation and once in each bootstrap iteration. See Example 2 for a function used as threshold. A threshold is needed, for example, to calculate head count ratios and poverty gaps. The argument defaults to NULL, in which case the threshold is set to 60% of the median of the variable selected as y, similar to the at-risk-of-poverty rate used in the EU (see also Social Protection Committee 2001). Any other threshold can be chosen.

var

if TRUE, estimates for the variance are calculated using a naive or calibrated bootstrap. Defaults to FALSE.

boot_type

a character string containing the name of the bootstrap specification. Either a "naive" or a "calibrate" bootstrap can be used. See also bootVar. Defaults to "naive".

B

a number determining the number of bootstrap populations for the bootstrap variance. Defaults to 50.

seed

an integer to set the seed for the random number generator. Random number generation is used in the bootstrap approach. If seed is NULL, the seed is chosen randomly. Defaults to 123.

X_calib

a numeric matrix including calibration variables if the calibrated bootstrap is chosen. Defaults to NULL.

totals

a numeric vector providing the population totals if the calibrated bootstrap is chosen. If a vector is supplied, its length must equal the number of columns in X_calib. Defaults to NULL, in which case the sampling weights are used to calculate the totals.

custom_indicator

a list of functions containing the indicators to be calculated additionally. Each function must depend on the target variable y and may optionally depend on weights and the threshold (a numeric value); it must not depend on any other argument (see Example 3). Defaults to NULL.

na.rm

if TRUE, observations with NA values are deleted from the sample data. Defaults to FALSE.
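The arguments var, boot_type, B and seed control a bootstrap variance estimate. The idea of a naive bootstrap can be sketched in base R as below. This is only an illustration under simplifying assumptions, not the package's implementation: the toy data, the helper names weighted_mean_domain and naive_boot_var, and the choice to resample rows within each domain are all hypothetical.

```r
# Toy sample: two domains, a target variable, and sampling weights.
# All names here are illustrative and not taken from the emdi package.
set.seed(123)
toy <- data.frame(
  domain = rep(c("A", "B"), each = 30),
  y      = c(rlnorm(30, meanlog = 9), rlnorm(30, meanlog = 9.5)),
  w      = runif(60, min = 1, max = 3)
)

# Point estimate: weighted mean of y within one domain.
weighted_mean_domain <- function(d) sum(d$w * d$y) / sum(d$w)

# Naive bootstrap (assumed scheme): resample rows with replacement within
# the domain, recompute the estimate B times, and take the variance of the
# B replicates.
naive_boot_var <- function(d, B = 50) {
  reps <- vapply(seq_len(B), function(b) {
    idx <- sample.int(nrow(d), size = nrow(d), replace = TRUE)
    weighted_mean_domain(d[idx, , drop = FALSE])
  }, numeric(1))
  var(reps)
}

by_domain <- split(toy, toy$domain)
point_est <- vapply(by_domain, weighted_mean_domain, numeric(1))
boot_var  <- vapply(by_domain, naive_boot_var, numeric(1), B = 50)

result <- data.frame(domain = names(point_est), mean = point_est, var = boot_var)
print(result)
```

Fixing the seed, as the seed argument does, makes the bootstrap replicates and hence the variance estimates reproducible.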

Details

The set of predefined indicators includes the mean, median, four further quantiles (10%, 25%, 75% and 90%), head count ratio, poverty gap, Gini coefficient and the quintile share ratio.
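To make some of these indicators concrete, the following base R sketch computes a head count ratio, a poverty gap and a quintile share ratio for a small made-up vector. It assumes the standard textbook definitions and ignores sampling weights and domains, so it is not the package's code; the variable names are hypothetical.

```r
# Illustrative data; the names below are not part of the emdi package.
y <- c(4000, 8000, 12000, 16000, 20000, 24000, 28000, 32000, 36000, 40000)

# Default-style threshold: 60% of the median of y.
threshold <- 0.6 * median(y)

# Head count ratio: share of observations below the threshold.
head_count <- mean(y < threshold)

# Poverty gap: mean relative shortfall below the threshold; observations
# at or above the threshold contribute zero.
poverty_gap <- mean(pmax(threshold - y, 0) / threshold)

# Quintile share ratio: total of y in the top 20% divided by the total
# of y in the bottom 20% (unweighted, using base R sample quantiles).
q20 <- quantile(y, probs = 0.2, names = FALSE)
q80 <- quantile(y, probs = 0.8, names = FALSE)
qsr <- sum(y[y > q80]) / sum(y[y <= q20])

print(c(head_count = head_count, poverty_gap = poverty_gap, qsr = qsr))
```

A weighted version would replace the plain means, sums and quantiles with their weighted counterparts, which is what supplying the weights argument is meant to achieve.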

Value

An object of class "direct", "emdi" that provides direct estimators for regional disaggregated indicators and optionally corresponding variance estimates. Several generic functions have methods for the returned object. For a full list and descriptions of the components of objects of class "emdi", see emdiObject.

References

Alfons, A. and Templ, M. (2013). Estimation of Social Exclusion Indicators from Complex Surveys: The R Package laeken. Journal of Statistical Software, 54(15), 1-25.

Social Protection Committee (2001). Report on Indicators in the Field of Poverty and Social Exclusions, Technical Report, European Union.

See Also

emdiObject, lme, estimators.emdi, emdi_summaries

Examples


# Loading sample data
data("eusilcA_smp")

# Example 1: With weights and naive bootstrap
emdi_direct <- direct(
  y = "eqIncome", smp_data = eusilcA_smp,
  smp_domains = "district", weights = "weight", threshold = 11064.82,
  var = TRUE, boot_type = "naive", B = 50, seed = 123, X_calib = NULL,
  totals = NULL, na.rm = TRUE
)

# Example 2: With function as threshold
emdi_direct <- direct(
  y = "eqIncome", smp_data = eusilcA_smp,
  smp_domains = "district", weights = "weight", threshold =
    function(y, weights) {
      0.6 * laeken::weightedMedian(y, weights)
    }, na.rm = TRUE
)

# Example 3: With custom indicators
emdi_direct <- direct(
  y = "eqIncome", smp_data = eusilcA_smp,
  smp_domains = "district", weights = "weight", threshold = 10859.24,
  var = TRUE, boot_type = "naive", B = 50, seed = 123, X_calib = NULL,
  totals = NULL,
  custom_indicator = list(
    my_max = function(y) {
      max(y)
    },
    my_min = function(y) {
      min(y)
    }
  ),
  na.rm = TRUE
)


emdi documentation built on Nov. 5, 2023, 5:07 p.m.