Produces estimates of class totals and proportions, using multinomial logistic regression, from survey data obtained under a dual frame sampling design. It uses a model-calibrated single frame approach with auxiliary information from the whole population. Confidence intervals are also computed, if required.
ysA |
A data frame containing information about one or more factors, each one of dimension n_A, collected from s_A. |
ysB |
A data frame containing information about one or more factors, each one of dimension n_B, collected from s_B. |
pik_A |
A numeric vector of length n_A containing first order inclusion probabilities for units included in s_A. |
pik_B |
A numeric vector of length n_B containing first order inclusion probabilities for units included in s_B. |
pik_ab_B |
A numeric vector of size n_A containing first order inclusion probabilities according to sampling design in frame B for units belonging to overlap domain that have been selected in s_A. |
pik_ba_A |
A numeric vector of size n_B containing first order inclusion probabilities according to sampling design in frame A for units belonging to overlap domain that have been selected in s_B. |
domains_A |
A character vector of size n_A indicating the domain each unit from s_A belongs to. Possible values are "a" and "ab". |
domains_B |
A character vector of size n_B indicating the domain each unit from s_B belongs to. Possible values are "b" and "ba". |
xsA |
A numeric vector of length n_A or a numeric matrix or data frame of dimensions n_A x m, with m the number of auxiliary variables, containing auxiliary information in frame A for units included in s_A. |
xsB |
A numeric vector of length n_B or a numeric matrix or data frame of dimensions n_B x m, with m the number of auxiliary variables, containing auxiliary information in frame B for units included in s_B. |
x |
A numeric vector of length N or a numeric matrix or data frame of dimensions N x m, with m the number of auxiliary variables, containing auxiliary information for every unit in the population. |
ind_sam |
A numeric vector of length n = n_A + n_B containing the identifiers (from 1 to N) of the population units that belong to s_A or s_B. |
N_A |
A numeric value indicating the size of frame A |
N_B |
A numeric value indicating the size of frame B |
N_ab |
(Optional) A numeric value indicating the size of the overlap domain |
met |
(Optional) A character vector indicating the distance to be used in the calibration process. Possible values are "linear", "raking" and "logit". Default is "linear". |
conf_level |
(Optional) A numeric value indicating the confidence level for the confidence intervals, if desired. |
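The length and domain requirements listed above can be checked before calling the estimator. The following is a minimal sketch in base R; `check_dual_frame_inputs` is a hypothetical helper introduced here for illustration (it is not part of the documented function), and the toy values are invented.

```r
# Hypothetical helper (not part of the documented function): checks that the
# sample-level arguments have the lengths and domain labels described above.
check_dual_frame_inputs <- function(pik_A, pik_B, pik_ab_B, pik_ba_A,
                                    domains_A, domains_B, ind_sam) {
  n_A <- length(pik_A)
  n_B <- length(pik_B)
  stopifnot(
    length(pik_ab_B) == n_A,
    length(pik_ba_A) == n_B,
    length(domains_A) == n_A,
    length(domains_B) == n_B,
    all(domains_A %in% c("a", "ab")),
    all(domains_B %in% c("b", "ba")),
    length(ind_sam) == n_A + n_B
  )
  invisible(TRUE)
}

# Toy inputs, invented for illustration only
ok <- check_dual_frame_inputs(
  pik_A = c(0.10, 0.20, 0.25), pik_B = c(0.30, 0.40),
  pik_ab_B = c(0, 0.30, 0.35), pik_ba_A = c(0, 0.15),
  domains_A = c("a", "ab", "ab"), domains_B = c("b", "ba"),
  ind_sam = c(1, 5, 9, 12, 20)
)
```

A check of this kind stops with an error at the first violated condition, which tends to be easier to diagnose than a failure inside the calibration step.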
The multinomial logistic calibration estimator in the single frame approach, using auxiliary information from the whole population, is given for a proportion by
\hat{P}_{MLCi}^{SW} = \frac{1}{N} \left(\sum_{k \in s_A \cup s_B} \tilde{w}_k z_{ki}\right), \hspace{0.3cm} i = 1,...,m
with m the number of categories of the response variable, z_i the indicator variable for the i-th category of the response variable, and \tilde{w} the calibration weights, which are calculated taking into account a different set of constraints depending on the case. For instance, if N_A, N_B and N_{ab} are known, the calibration constraints are
\sum_{k \in s_a}\tilde{w}_k = N_a, \sum_{k \in s_{ab} \cup s_{ba}}\tilde{w}_k = N_{ab}, \sum_{k \in s_b}\tilde{w}_k = N_b
and
\sum_{k \in s_A \cup s_B}\tilde{w}_k \tilde{p}_{ki} = \sum_{k \in U} \tilde{p}_{ki}
with
\tilde{p}_{ki} = \frac{\exp(x_k^{'}\tilde{\beta}_i)}{\sum_{r=1}^m \exp(x_k^{'}\tilde{\beta}_r)},
where \tilde{\beta}_i are the maximum likelihood parameters of the multinomial logistic model fitted with weights
\tilde{d}_k = \left\{\begin{array}{ll} d_k^A & \textrm{if } k \in a\\ (1/d_k^A + 1/d_k^B)^{-1} & \textrm{if } k \in ab \cup ba \\ d_k^B & \textrm{if } k \in b \end{array} \right.
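The quantities in these formulas can be illustrated with a short base R sketch. It assumes that d_k^A and d_k^B are the inverses of the first order inclusion probabilities, so that the overlap weight (1/d_k^A + 1/d_k^B)^{-1} equals 1 / (pi_k^A + pi_k^B). The function names, the coefficient matrix and the toy data are all hypothetical, and the sketch does not perform the calibration step itself: `weighted_proportions` only evaluates the final sum for whatever weights it is given.

```r
# Single frame weights: 1 / pi_k outside the overlap, and
# (1/d_A + 1/d_B)^(-1) = 1 / (pi_A + pi_B) for units in "ab" or "ba".
single_frame_weights <- function(pik_own, pik_other, domain) {
  overlap <- domain %in% c("ab", "ba")
  ifelse(overlap, 1 / (pik_own + pik_other), 1 / pik_own)
}

# Multinomial logistic probabilities p_ki for an n x q design matrix X and a
# q x m coefficient matrix beta (one column per category).
multinomial_probs <- function(X, beta) {
  eta <- X %*% beta
  eta <- eta - apply(eta, 1, max)  # stabilises exp() without changing ratios
  expeta <- exp(eta)
  expeta / rowSums(expeta)
}

# Weighted proportion of each category: (1 / N) * sum_k w_k * z_ki
weighted_proportions <- function(w, y, N) {
  vapply(levels(y), function(l) sum(w[y == l]), numeric(1)) / N
}

# Toy values, invented for illustration only
d_tilde <- single_frame_weights(c(0.1, 0.2), c(0, 0.3), c("a", "ab"))
p_tilde <- multinomial_probs(cbind(1, c(0, 1)), matrix(c(0, 0, 0, 1), nrow = 2))
props <- weighted_proportions(c(10, 2, 8), factor(c("x", "y", "x")), N = 20)
```

Each row of `p_tilde` sums to one, which is the property the last calibration constraint relies on when the predicted probabilities are totalled over the population.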
MLCSW returns an object of class "MultEstimatorDF", which is a list with, at least, the following components:
Call |
the matched call. |
Est |
class frequency and proportion estimates for the main variable(s). |
Molina, D., Rueda, M., Arcos, A. and Ranalli, M. G. (2015) Multinomial logistic estimation in dual frame surveys. Statistics and Operations Research Transactions (SORT). To be printed.
library(Frames2)
data(DatMA)
data(DatMB)
data(DatPopM)
IndSample <- c(DatMA$Id_Pop, DatMB$Id_Pop)
N_FrameA <- nrow(DatPopM[DatPopM$Domain == "a" | DatPopM$Domain == "ab",])
N_FrameB <- nrow(DatPopM[DatPopM$Domain == "b" | DatPopM$Domain == "ab",])
N_Domainab <- nrow(DatPopM[DatPopM$Domain == "ab",])
# Let us calculate the proportions of the categories of variable Prog
# with the MLCSW estimator, using Read as the auxiliary variable
MLCSW(DatMA$Prog, DatMB$Prog, DatMA$ProbA, DatMB$ProbB, DatMA$ProbB, DatMB$ProbA,
DatMA$Domain, DatMB$Domain, DatMA$Read, DatMB$Read, DatPopM$Read, IndSample, N_FrameA,
N_FrameB)
# Now, let us suppose that the overlap domain size is known
MLCSW(DatMA$Prog, DatMB$Prog, DatMA$ProbA, DatMB$ProbB, DatMA$ProbB, DatMB$ProbA,
DatMA$Domain, DatMB$Domain, DatMA$Read, DatMB$Read, DatPopM$Read, IndSample, N_FrameA,
N_FrameB, N_Domainab)
# Let us obtain 95% confidence intervals together with the estimates
MLCSW(DatMA$Prog, DatMB$Prog, DatMA$ProbA, DatMB$ProbB, DatMA$ProbB, DatMB$ProbA,
DatMA$Domain, DatMB$Domain, DatMA$Read, DatMB$Read, DatPopM$Read, IndSample, N_FrameA,
N_FrameB, N_Domainab, conf_level = 0.95)
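The frame sizes passed in the calls above are counts of population units by domain. As a small illustration of how they relate to each other, the sketch below uses an invented vector of domain labels (not the package data) and checks that N = N_A + N_B - N_ab.

```r
# Toy population domain labels, invented for illustration only
Domain <- c("a", "a", "ab", "ab", "ab", "b", "b", "b", "b")
counts <- table(factor(Domain, levels = c("a", "ab", "b")))

N_ab <- counts[["ab"]]         # overlap domain size
N_A <- counts[["a"]] + N_ab    # frame A = domain a plus the overlap
N_B <- counts[["b"]] + N_ab    # frame B = domain b plus the overlap
N <- N_A + N_B - N_ab          # the overlap is counted once
```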