Produces estimates for class totals and proportions using multinomial logistic regression from survey data obtained from a dual frame sampling design using a model calibrated dual frame approach with auxiliary information from the whole population. Confidence intervals are also computed, if required.
ysA |
A data frame containing information about one or more factors, each one of dimension n_A, collected from s_A. |
ysB |
A data frame containing information about one or more factors, each one of dimension n_B, collected from s_B. |
pik_A |
A numeric vector of length n_A containing first order inclusion probabilities for units included in s_A. |
pik_B |
A numeric vector of length n_B containing first order inclusion probabilities for units included in s_B. |
domains_A |
A character vector of size n_A indicating the domain each unit from s_A belongs to. Possible values are "a" and "ab". |
domains_B |
A character vector of size n_B indicating the domain each unit from s_B belongs to. Possible values are "b" and "ba". |
xsA |
A numeric vector of length n_A or a numeric matrix or data frame of dimensions n_A x m, with m the number of auxiliary variables, containing auxiliary information in frame A for units included in s_A. |
xsB |
A numeric vector of length n_B or a numeric matrix or data frame of dimensions n_B x m, with m the number of auxiliary variables, containing auxiliary information in frame B for units included in s_B. |
x |
A numeric vector of length N or a numeric matrix or data frame of dimensions N x m, with m the number of auxiliary variables, containing auxiliary information for every unit in the population. |
ind_sam |
A numeric vector of length n = n_A + n_B containing the identifiers (from 1 to N) of the population units that belong to s_A or s_B. |
N_A |
A numeric value indicating the size of frame A. |
N_B |
A numeric value indicating the size of frame B. |
N_ab |
(Optional) A numeric value indicating the size of the overlap domain. |
met |
(Optional) A character vector indicating the distance to be used in the calibration process. Possible values are "linear", "raking" and "logit". Default is "linear". |
conf_level |
(Optional) A numeric value indicating the confidence level for the confidence intervals, if desired. |
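The dimension and value constraints listed above can be checked before calling the estimator. The following is a minimal sketch in base R; `check_dual_frame_inputs` is a hypothetical helper, not part of the package, and the toy data are invented for illustration only.

```r
# Hypothetical helper (not part of the package): checks that the inputs
# satisfy the dimension and value constraints listed in the arguments.
check_dual_frame_inputs <- function(ysA, ysB, pik_A, pik_B,
                                    domains_A, domains_B, x, ind_sam) {
  n_A <- NROW(ysA)   # works for vectors, factors and data frames
  n_B <- NROW(ysB)
  N <- NROW(x)
  pik <- c(pik_A, pik_B)
  c(
    pik_A_length   = length(pik_A) == n_A,
    pik_B_length   = length(pik_B) == n_B,
    pik_in_range   = all(pik > 0 & pik <= 1),
    domains_A_ok   = length(domains_A) == n_A && all(domains_A %in% c("a", "ab")),
    domains_B_ok   = length(domains_B) == n_B && all(domains_B %in% c("b", "ba")),
    ind_sam_length = length(ind_sam) == n_A + n_B,
    ind_sam_range  = all(ind_sam >= 1 & ind_sam <= N)
  )
}

# Toy data, invented for illustration only
ysA <- factor(c("x", "y", "x"))
ysB <- factor(c("y", "y"))
checks <- check_dual_frame_inputs(
  ysA, ysB,
  pik_A = c(0.2, 0.5, 0.1), pik_B = c(0.3, 0.4),
  domains_A = c("a", "ab", "a"), domains_B = c("b", "ba"),
  x = seq_len(10), ind_sam = c(1, 4, 7, 8, 4)
)
print(checks)
```

A unit in the overlap domain can be selected in both samples, so `ind_sam` may contain repeated identifiers, as in the toy data.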
The multinomial logistic calibration estimator of a proportion in a dual frame design, using auxiliary information from the whole population, is given by
\hat{P}_{MLCi}^{DW} = \frac{1}{N} \left(\sum_{k \in s_A \cup s_B} w_k^{\circ} z_{ki}\right), \hspace{0.3cm} i = 1,...,m
with m the number of categories of the response variable, z_i the indicator variable for the i-th category of the response variable, and w^{\circ} the calibration weights, which are computed under a different set of constraints depending on the case. For instance, if N_A, N_B and N_{ab} are known, the calibration constraints are
\sum_{k \in s_a} w_k^{\circ} = N_a, \quad \sum_{k \in s_{ab}} w_k^{\circ} = \eta N_{ab}, \quad \sum_{k \in s_{ba}} w_k^{\circ} = (1 - \eta) N_{ab}, \quad \sum_{k \in s_{b}} w_k^{\circ} = N_{b}
and
\sum_{k \in s_A \cup s_B} w_k^{\circ} p_{ki}^{\circ} = \sum_{k \in U} p_{ki}^{\circ}
with \eta \in (0,1) and
p_{ki}^{\circ} = \frac{\exp(x_k^{'}\beta_i^{\circ})}{\sum_{r=1}^m \exp(x_k^{'}\beta_r^{\circ})},
where \beta_i^{\circ} are the maximum likelihood parameters of the multinomial logistic model fitted with weights
d_k^{\circ} = \left\{\begin{array}{ll} d_k^A & \textrm{if } k \in a\\ \eta d_k^A & \textrm{if } k \in ab\\ (1 - \eta) d_k^B & \textrm{if } k \in ba \\ d_k^B & \textrm{if } k \in b \end{array} \right.
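The quantities in the formulas above can be traced on a small example. The sketch below, in base R, builds the adjusted weights d_k^{\circ}, evaluates the multinomial logistic probabilities p_{ki}^{\circ}, and applies the proportion formula. All numbers are invented; `beta` is assumed rather than fitted by weighted maximum likelihood, and the weights `w` are a simple placeholder (adjusted design weights rescaled to sum to N), not true calibration weights.

```r
# Sketch only: toy numbers throughout.
eta <- 0.5                                 # any value in (0, 1)
domain <- c("a", "ab", "ab", "ba", "b")    # domain of each sampled unit
d_base <- c(10, 10, 10, 20, 20)            # d_k^A for frame A units, d_k^B for frame B units

# Adjusted design weights d_k^o: overlap units are split between the frames
adj <- c(a = 1, ab = eta, ba = 1 - eta, b = 1)[domain]
d_adj <- unname(d_base * adj)

# p_ki^o = exp(x_k' beta_i) / sum_r exp(x_k' beta_r)
softmax_probs <- function(X, beta) {
  lin <- X %*% beta                        # n x m matrix of linear predictors
  lin <- lin - apply(lin, 1, max)          # subtract row maxima for numerical stability
  expo <- exp(lin)
  expo / rowSums(expo)
}
X <- cbind(1, c(0.2, -1.0, 0.5, 1.5, 0.0))          # intercept plus one auxiliary variable
beta <- cbind(c(0, 0), c(0.3, -0.5), c(-0.2, 0.8))  # assumed values, one column per category
p <- softmax_probs(X, beta)

# Proportion formula: (1/N) * sum_k w_k z_ki, with z the category indicators.
# Placeholder weights: d_adj rescaled to sum to N (NOT calibration weights).
N <- 50
y <- factor(c("c1", "c2", "c1", "c3", "c2"), levels = c("c1", "c2", "c3"))
z <- model.matrix(~ y - 1)                 # n x m indicator matrix
w <- d_adj * N / sum(d_adj)
P_hat <- colSums(w * z) / N
print(P_hat)
```

In the actual estimator the weights w_k^{\circ} would instead be obtained by calibration, so that the weighted sample totals of p_{ki}^{\circ} match their population totals and the domain size constraints hold.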
MLCDW returns an object of class "MultEstimatorDF", which is a list with, at least, the following components:
Call |
the matched call. |
Est |
estimated class frequencies and proportions for the main variable(s). |
Molina, D., Rueda, M., Arcos, A. and Ranalli, M. G. (2015) Multinomial logistic estimation in dual frame surveys. Statistics and Operations Research Transactions (SORT). To be printed.
data(DatMA)
data(DatMB)
data(DatPopM)
IndSample <- c(DatMA$Id_Pop, DatMB$Id_Pop)
N_FrameA <- nrow(DatPopM[DatPopM$Domain == "a" | DatPopM$Domain == "ab",])
N_FrameB <- nrow(DatPopM[DatPopM$Domain == "b" | DatPopM$Domain == "ab",])
N_Domainab <- nrow(DatPopM[DatPopM$Domain == "ab",])
# Estimate the proportions of the categories of variable Prog with the MLCDW estimator,
# using Read as the auxiliary variable
MLCDW(DatMA$Prog, DatMB$Prog, DatMA$ProbA, DatMB$ProbB, DatMA$Domain, DatMB$Domain,
      DatMA$Read, DatMB$Read, DatPopM$Read, IndSample, N_FrameA, N_FrameB)
# Now suppose that the overlap domain size is known
MLCDW(DatMA$Prog, DatMB$Prog, DatMA$ProbA, DatMB$ProbB, DatMA$Domain, DatMB$Domain,
      DatMA$Read, DatMB$Read, DatPopM$Read, IndSample, N_FrameA, N_FrameB, N_Domainab)
# Obtain 95% confidence intervals together with the estimates
MLCDW(DatMA$Prog, DatMB$Prog, DatMA$ProbA, DatMB$ProbB, DatMA$Domain, DatMB$Domain,
      DatMA$Read, DatMB$Read, DatPopM$Read, IndSample, N_FrameA, N_FrameB, N_Domainab,
      conf_level = 0.95)