jackknife_variance: Calculates Jackknife variance with reweighting for PSA

Description

Calculates the variance of Propensity Score Adjustment (PSA) estimators by leave-one-out jackknife (Quenouille, 1956), recomputing the weights in each iteration.

Usage

jackknife_variance(
  estimated_vars,
  convenience_sample,
  reference_sample,
  covariates,
  N = NULL,
  algorithm = "glm",
  smooth = FALSE,
  proc = NULL,
  trControl = trainControl(classProbs = TRUE),
  weighting.func = "sc",
  g = 5,
  calib = FALSE,
  calib_vars = NULL,
  totals = NULL,
  args.calib = NULL,
  ...
)

Arguments

estimated_vars

A string vector specifying the variables whose estimators' variances are to be estimated.

convenience_sample

Data frame containing the non-probabilistic sample.

reference_sample

Data frame containing the probabilistic sample.

covariates

String vector specifying the common variables to use for training.

N

Integer indicating the population size. Optional.

algorithm

A string specifying which classification or regression model to use (same as caret's method). By default, its value is "glm" (logistic regression).

smooth

A logical value; if TRUE, the propensity estimates pi_i are smoothed by applying the formula (1000*pi_i + 0.5)/1001. By default, its value is FALSE.

proc

A string or vector of strings specifying which of the data preprocessing techniques available in the train function of the 'caret' package, if any, should be applied to the data prior to propensity estimation. By default, its value is NULL and no preprocessing is applied.

trControl

A trainControl object specifying the computational nuances of the train function.

weighting.func

A string specifying which function should be used to compute weights from propensity scores. Available functions are the following:

  • sc calls sc_weights.

  • valliant calls valliant_weights.

  • lee calls lee_weights.

  • vd calls vd_weights.

g

If weighting.func = "lee" or weighting.func = "vd", this element specifies the number of strata to use; by default, its value is 5.

calib

A logical value; if TRUE, PSA weights are used as initial weights for calibration. By default, its value is FALSE.

calib_vars

A string or vector of strings specifying the variables to be used for calibration. By default, its value is NULL.

totals

A vector containing population totals for each column (class) of the calibration variables matrix. Ignored if calib is set to FALSE.

args.calib

A list containing further arguments to be passed to the calib_weights function.

...

Further parameters to be passed to the train function.
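The smoothing formula given for the smooth argument can be checked by hand. The following is a minimal sketch in base R; smooth_propensity is a hypothetical helper written for illustration and is not a function exported by the package.

```r
# Hypothetical helper reproducing the formula stated for smooth = TRUE:
# (1000 * pi_i + 0.5) / 1001
smooth_propensity <- function(pi_i) {
  (1000 * pi_i + 0.5) / 1001
}

# Propensities of exactly 0 or 1 are pulled slightly inside (0, 1),
# while 0.5 is left unchanged.
smooth_propensity(c(0, 0.5, 1))
```

Keeping the estimates strictly inside (0, 1) matters whenever the weights divide by the propensity or by its complement.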

Details

Estimating the variance requires recalculating the estimates in each iteration, which may involve weighting adjustments and therefore increases computation time. The estimated variance is expected to capture both the variability of the weighting adjustments and the variability of the estimator.
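The idea of recomputing the weights inside every jackknife iteration can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the data are simulated, psa_mean is a hypothetical helper, the weights are taken as the inverse of the estimated propensity, only convenience-sample units are deleted, and the variance uses the textbook delete-one formula (n - 1)/n * sum((theta_(i) - mean(theta))^2), which may differ from the exact expression used by jackknife_variance.

```r
set.seed(1)

# Simulated stand-ins for the two samples (hypothetical data)
conv <- data.frame(x = rnorm(40, mean = 0.5))
conv$y <- 2 + conv$x + rnorm(40)
ref <- data.frame(x = rnorm(60))

# Hypothetical helper: weighted mean of y in the convenience sample, with
# propensities re-estimated by logistic regression on the stacked samples
psa_mean <- function(conv, ref) {
  stacked <- rbind(
    data.frame(x = conv$x, z = 1),  # z = 1: convenience sample
    data.frame(x = ref$x, z = 0)    # z = 0: reference sample
  )
  fit <- glm(z ~ x, family = binomial, data = stacked)
  p <- predict(fit, newdata = conv, type = "response")
  w <- 1 / p  # assumed weighting rule for this sketch
  sum(w * conv$y) / sum(w)
}

# Leave-one-out replicates: drop one convenience unit, refit, re-estimate
n <- nrow(conv)
loo <- vapply(
  seq_len(n),
  function(i) psa_mean(conv[-i, , drop = FALSE], ref),
  numeric(1)
)

# Textbook delete-one jackknife variance of the replicates
jk_var <- (n - 1) / n * sum((loo - mean(loo))^2)
jk_var
```

Because the propensity model is refitted n times, the cost grows linearly with the size of the convenience sample, which is the increase in computation time noted above.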

Value

The resulting variance.

References

Quenouille, M. H. (1956). Notes on bias in estimation. Biometrika, 43(3/4), 353-360.

Examples

# A simple example without calibration and default parameters
covariates <- c("education_primaria", "education_secundaria")
jackknife_variance("vote_pens", sampleNP, sampleP, covariates)

# An example with linear calibration and default parameters
covariates <- c("education_primaria", "education_secundaria")
calib_vars <- c("age", "sex")
totals <- c(2544377, 24284)

# calib_vars and totals are passed by name so that they are not matched
# positionally to N and algorithm
jackknife_variance("vote_pens", sampleNP, sampleP, covariates,
  calib = TRUE, calib_vars = calib_vars, totals = totals,
  args.calib = list(method = "linear"))

NonProbEst documentation built on July 1, 2020, 6:08 p.m.