PEL: Pseudo empirical likelihood estimator
In Frames2: Estimation in Dual Frame Surveys

Produces estimates for population totals using the pseudo empirical likelihood estimator from survey data obtained from a dual frame sampling design. Confidence intervals for the population total are also computed, if required.

PEL(ysA, ysB, pi_A, pi_B, domains_A, domains_B, N_A = NULL, N_B = NULL,
    N_ab = NULL, xsAFrameA = NULL, xsBFrameA = NULL, xsAFrameB = NULL,
    xsBFrameB = NULL, XA = NULL, XB = NULL, conf_level = NULL)

`ysA`	A numeric vector of length n_A or a numeric matrix or data frame of dimensions n_A x c containing information about variable(s) of interest from s_A.
`ysB`	A numeric vector of length n_B or a numeric matrix or data frame of dimensions n_B x c containing information about variable(s) of interest from s_B.
`pi_A`	A numeric vector of length n_A or a square numeric matrix of dimension n_A containing first order or first and second order inclusion probabilities for units included in s_A.
`pi_B`	A numeric vector of length n_B or a square numeric matrix of dimension n_B containing first order or first and second order inclusion probabilities for units included in s_B.
`domains_A`	A character vector of size n_A indicating the domain each unit from s_A belongs to. Possible values are "a" and "ab".
`domains_B`	A character vector of size n_B indicating the domain each unit from s_B belongs to. Possible values are "b" and "ba".
`N_A`	(Optional) A numeric value indicating the size of frame A.
`N_B`	(Optional) A numeric value indicating the size of frame B.
`N_ab`	(Optional) A numeric value indicating the size of the overlap domain.
`xsAFrameA`	(Optional) A numeric vector of length n_A or a numeric matrix or data frame of dimensions n_A x m_A, with m_A the number of auxiliary variables in frame A, containing auxiliary information in frame A for units included in s_A.
`xsBFrameA`	(Optional) A numeric vector of length n_B or a numeric matrix or data frame of dimensions n_B x m_A, with m_A the number of auxiliary variables in frame A, containing auxiliary information in frame A for units included in s_B. For units in domain b, these values are 0.
`xsAFrameB`	(Optional) A numeric vector of length n_A or a numeric matrix or data frame of dimensions n_A x m_B, with m_B the number of auxiliary variables in frame B, containing auxiliary information in frame B for units included in s_A. For units in domain a, these values are 0.
`xsBFrameB`	(Optional) A numeric vector of length n_B or a numeric matrix or data frame of dimensions n_B x m_B, with m_B the number of auxiliary variables in frame B, containing auxiliary information in frame B for units included in s_B.
`XA`	(Optional) A numeric value or vector of length m_A, with m_A the number of auxiliary variables in frame A, indicating the population totals for the auxiliary variables considered in frame A.
`XB`	(Optional) A numeric value or vector of length m_B, with m_B the number of auxiliary variables in frame B, indicating the population totals for the auxiliary variables considered in frame B.
`conf_level`	(Optional) A numeric value indicating the confidence level for the confidence intervals, if desired.
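
The zero-filling convention for auxiliary variables observed in the other frame can be illustrated with a small sketch. The data frame `sB` and its values are hypothetical and only show how an `xsBFrameA` vector might be assembled when the frame A variable is unavailable for units in domain b.

```r
# Hypothetical sample from frame B: three units, one of them also in frame A.
sB <- data.frame(
  Domain = c("b", "ba", "b"),
  Inc    = c(NA, 1500, NA),   # frame A auxiliary variable, unknown outside A
  M2     = c(80, 95, 60)      # frame B auxiliary variable, known for all of s_B
)

# Frame A auxiliary values for units in s_B: the observed value in the
# overlap domain "ba", and 0 in domain "b".
xsBFrameA <- ifelse(sB$Domain == "ba", sB$Inc, 0)

# Frame B auxiliary values for units in s_B need no adjustment.
xsBFrameB <- sB$M2
```

The same pattern, with the roles of the frames swapped, would apply to `xsAFrameB` for units in domain a.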

The pseudo empirical likelihood estimator for the population mean is computed as

\hat{\bar{Y}}_{PEL} = \frac{N_a}{N}\hat{\bar{Y}}_a + \frac{η N_{ab}}{N}\hat{\bar{Y}}_{ab}^A + \frac{(1 - η) N_{ab}}{N}\hat{\bar{Y}}_{ab}^B + \frac{N_b}{N}\hat{\bar{Y}}_b

where \hat{\bar{Y}}_a = ∑_{k \in s_a}\hat{p}_{ak}y_k, \hat{\bar{Y}}_{ab}^A = ∑_{k \in s_{ab}^A}\hat{p}_{abk}^Ay_k, \hat{\bar{Y}}_{ab}^B = ∑_{k \in s_{ab}^B}\hat{p}_{abk}^By_k and \hat{\bar{Y}}_b = ∑_{k \in s_b}\hat{p}_{bk}y_k, with \hat{p}_{ak}, \hat{p}_{abk}^A, \hat{p}_{abk}^B and \hat{p}_{bk} the weights that result from applying the pseudo empirical likelihood procedure to a given function under a given set of constraints, both of which depend on the case. Here N_a = N_A - N_{ab} and N_b = N_B - N_{ab} are the sizes of domains a and b, N = N_a + N_{ab} + N_b, and η \in (0,1). The expression above assumes that N_A, N_B and N_{ab} are known and that no additional auxiliary variables are used, which is not always the case. The function covers the following scenarios:

No additional auxiliary variable is available
- N_A, N_B and N_{ab} unknown
- N_A and N_B known and N_{ab} unknown
- N_A, N_B and N_{ab} known
At least one additional auxiliary variable is available
- N_A and N_B known and N_{ab} unknown
- N_A, N_B and N_{ab} known
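
As a minimal sketch of how the combination formula for the population mean works when all sizes are known, the following snippet plugs in made-up domain sizes, domain means and η. None of these numbers come from the package or its data sets, and the pseudo empirical likelihood weights themselves are not computed here; the domain means are simply assumed to be given.

```r
# Hypothetical domain sizes (not taken from any real survey).
N_a  <- 600   # units only in frame A
N_ab <- 300   # units in both frames
N_b  <- 100   # units only in frame B
N    <- N_a + N_ab + N_b

# Hypothetical weighted domain means; in PEL these would come from the
# pseudo empirical likelihood weights, which are not computed in this sketch.
ybar_a    <- 20
ybar_ab_A <- 30   # overlap domain, estimated from sample s_A
ybar_ab_B <- 34   # overlap domain, estimated from sample s_B
ybar_b    <- 50

eta <- 0.5        # any value in (0, 1); chosen arbitrarily here

# Combination of the four domain means, term by term as in the formula.
ybar_pel <- (N_a / N) * ybar_a +
  (eta * N_ab / N) * ybar_ab_A +
  ((1 - eta) * N_ab / N) * ybar_ab_B +
  (N_b / N) * ybar_b

# Corresponding total, assuming it is obtained as N times the mean.
total_pel <- N * ybar_pel
```

With these inputs the four terms are 12, 4.5, 5.1 and 5, so `ybar_pel` is 26.6 and `total_pel` is 26600.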

An explicit variance for this estimator is not easy to obtain. Instead, confidence intervals can be computed with the bisection method. This method constructs intervals of the form \{θ|r_{ns}(θ) < χ_1^2(α)\}, where χ_1^2(α) is the 1 - α quantile of a χ^2 distribution with one degree of freedom and r_{ns}(θ) is the so-called pseudo empirical log likelihood ratio statistic, which can be obtained as the difference of two pseudo empirical likelihood functions.
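
The bisection idea can be sketched on a toy problem. The function `r_toy` below is a hypothetical quadratic stand-in for r_{ns}(θ), not the pseudo empirical log likelihood ratio statistic used by PEL, and `bisect_bound` is an illustrative helper rather than part of the package. The sketch only shows how the interval endpoints are located as the points where the statistic crosses the χ^2 cutoff.

```r
# Toy stand-in for the ratio statistic: zero at the point estimate and
# increasing as theta moves away from it.
theta_hat <- 10
se        <- 2
r_toy <- function(theta) ((theta - theta_hat) / se)^2

conf_level <- 0.90
cutoff <- qchisq(conf_level, df = 1)   # 1 - alpha quantile, alpha = 0.10

# Bisection for the point where r(theta) crosses the cutoff, starting from
# 'inside' (r below the cutoff) and 'outside' (r above the cutoff).
bisect_bound <- function(r, inside, outside, cutoff, tol = 1e-8) {
  while (abs(outside - inside) > tol) {
    mid <- (inside + outside) / 2
    if (r(mid) < cutoff) inside <- mid else outside <- mid
  }
  (inside + outside) / 2
}

lower <- bisect_bound(r_toy, theta_hat, theta_hat - 10 * se, cutoff)
upper <- bisect_bound(r_toy, theta_hat, theta_hat + 10 * se, cutoff)
c(lower = lower, upper = upper)
```

For this quadratic toy statistic the endpoints agree with the closed form theta_hat ± se * sqrt(cutoff). The real statistic has no such closed form, which is why a numerical search is needed.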

PEL returns an object of class "EstimatorDF" which is a list with, at least, the following components:

`Call`	the matched call.
`Est`	total and mean estimation for main variable(s).
`VarEst`	variance estimation for main variable(s).

If parameter conf_level is different from NULL, the object also includes the component

`ConfInt`	total and mean estimation and confidence intervals for main variable(s).

In addition, components TotDomEst and MeanDomEst are available when the estimator is based on estimators of the domains. Component Param shows the values of the parameters involved in the calculation of the estimator (if any). By default, only the Est component (or the ConfInt component, if parameter conf_level is different from NULL) is shown. All the components of the object can be accessed by using the function summary.
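
A short sketch of how the returned object might be inspected follows. It assumes the Frames2 package and its example data sets are installed, and it is skipped otherwise; the component name `Est` is taken from the list above.

```r
res <- NULL
if (requireNamespace("Frames2", quietly = TRUE)) {
  library(Frames2)
  data(DatA)
  data(DatB)
  data(PiklA)
  data(PiklB)

  res <- PEL(DatA$Feed, DatB$Feed, PiklA, PiklB, DatA$Domain, DatB$Domain)

  print(res)           # default display: only the Est component
  print(summary(res))  # fuller display of the available components
  print(res$Est)       # the object is a list, so components can be extracted
}
```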

Rao, J. N. K. and Wu, C. (2010) Pseudo Empirical Likelihood Inference for Multiple Frame Surveys. Journal of the American Statistical Association, 105, 1494 - 1503.

Wu, C. (2005) Algorithms and R codes for the pseudo empirical likelihood methods in survey sampling. Survey Methodology, 31(2), 239 - 243.

JackPEL

library(Frames2)
data(DatA)
data(DatB)
data(PiklA)
data(PiklB)

# Calculate the pseudo empirical likelihood estimator for variable Feeding,
# without considering any auxiliary information
PEL(DatA$Feed, DatB$Feed, PiklA, PiklB, DatA$Domain, DatB$Domain)

# Now calculate the pseudo empirical likelihood estimator for variable Clothing
# when the frame sizes and the overlap domain size are known
PEL(DatA$Clo, DatB$Clo, PiklA, PiklB, DatA$Domain, DatB$Domain,
    N_A = 1735, N_B = 1191, N_ab = 601)

# Finally, calculate the pseudo empirical likelihood estimator and a 90% confidence
# interval for the population total of variable Feeding, using Income and Metres2
# as auxiliary variables, with frame sizes and overlap domain size known
PEL(DatA$Feed, DatB$Feed, PiklA, PiklB, DatA$Domain, DatB$Domain,
    N_A = 1735, N_B = 1191, N_ab = 601, xsAFrameA = DatA$Inc, xsBFrameA = DatB$Inc,
    xsAFrameB = DatA$M2, xsBFrameB = DatB$M2, XA = 4300260, XB = 176553,
    conf_level = 0.90)

Estimation:
             [,1]
Total 590425.4869
Mean     247.4958

Estimation:
             [,1]
Total 70429.95642
Mean     30.29245

Estimation and  90 % Confidence Intervals:
                   [,1]
Total       572611.6997
Lower Bound 564723.7029
Upper Bound 582852.6234
Mean           246.2846
Lower Bound    242.8919
Upper Bound    250.6893