Produces estimates of class totals and proportions from survey data obtained under a dual frame sampling design. Estimation uses multinomial logistic regression within a model-calibrated dual frame approach, which allows a different set of auxiliary variables for each frame. Confidence intervals are also computed if requested.
ysA |
A data frame containing information about one or more factors, each one of dimension n_A, collected from s_A. |
ysB |
A data frame containing information about one or more factors, each one of dimension n_B, collected from s_B. |
pik_A |
A numeric vector of length n_A containing first order inclusion probabilities for units included in s_A. |
pik_B |
A numeric vector of length n_B containing first order inclusion probabilities for units included in s_B. |
domains_A |
A character vector of size n_A indicating the domain each unit from s_A belongs to. Possible values are "a" and "ab". |
domains_B |
A character vector of size n_B indicating the domain each unit from s_B belongs to. Possible values are "b" and "ba". |
xsA |
A numeric vector of length n_A or a numeric matrix or data frame of dimensions n_A x m_A, with m_A the number of auxiliary variables in frame A, containing auxiliary information in frame A for units included in s_A. |
xsB |
A numeric vector of length n_B or a numeric matrix or data frame of dimensions n_B x m_B, with m_B the number of auxiliary variables in frame B, containing auxiliary information in frame B for units included in s_B. |
xA |
A numeric vector of length N_A or a numeric matrix or data frame of dimensions N_A x m_A, with m_A the number of auxiliary variables in frame A, containing auxiliary information for the units in frame A. |
xB |
A numeric vector of length N_B or a numeric matrix or data frame of dimensions N_B x m_B, with m_B the number of auxiliary variables in frame B, containing auxiliary information for the units in frame B. |
ind_samA |
A numeric vector of length n_A containing the identifiers (from 1 to N_A) of the units of frame A that belong to s_A. |
ind_samB |
A numeric vector of length n_B containing the identifiers (from 1 to N_B) of the units of frame B that belong to s_B. |
ind_domA |
A character vector of length N_A indicating the domain each unit from frame A belongs to. Possible values are "a" and "ab". |
ind_domB |
A character vector of length N_B indicating the domain each unit from frame B belongs to. Possible values are "b" and "ba". |
N |
A numeric value indicating the size of the population. |
N_ab |
(Optional) A numeric value indicating the size of the overlap domain. |
met |
(Optional) A character vector indicating the distance function to be used in the calibration process. Possible values are "linear", "raking" and "logit". Default is "linear". |
conf_level |
(Optional) A numeric value indicating the confidence level for the confidence intervals, if desired. |
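The size and label requirements listed above can be checked before calling the estimator. The following is a minimal sketch in base R; `check_frame_inputs` is a hypothetical helper written for this illustration (it is not part of any package), and the toy values are made up.

```r
# Hypothetical helper (not part of any package): checks that the sample-level
# inputs for one frame have compatible sizes and valid domain labels.
check_frame_inputs <- function(ys, pik, domains, allowed) {
  n <- NROW(ys)                      # number of sampled units in this frame
  stopifnot(
    length(pik) == n,                # one inclusion probability per unit
    length(domains) == n,            # one domain label per unit
    all(pik > 0 & pik <= 1),         # inclusion probabilities lie in (0, 1]
    all(domains %in% allowed)        # only the labels documented for the frame
  )
  invisible(n)
}

# Made-up toy inputs for frame A.
ysA       <- data.frame(Prog = factor(c("general", "academic", "vocation")))
pik_A     <- c(0.10, 0.25, 0.05)
domains_A <- c("a", "ab", "a")

n_A <- check_frame_inputs(ysA, pik_A, domains_A, allowed = c("a", "ab"))
```

The same helper can be applied to the frame B inputs with `allowed = c("b", "ba")`.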
The multinomial logistic calibration estimator of a proportion in a dual frame survey, using auxiliary information from each frame, is given by
\hat{P}_{MLCi}^{DF} = \frac{1}{N} \left(\sum_{k \in s_A \cup s_B} w_k^{\circ} z_{ki}\right), \hspace{0.3cm} i = 1,...,m
with m the number of categories of the response variable, z_i the indicator variable for the i-th category of the response variable, and w^{\circ} the calibration weights, which are computed under a set of constraints that depends on the information available. For instance, if N_A, N_B and N_{ab} are known, the calibration constraints are
\sum_{k \in s_a} w_k^{\circ} = N_a, \quad \sum_{k \in s_{ab}} w_k^{\circ} = \eta N_{ab}, \quad \sum_{k \in s_{ba}} w_k^{\circ} = (1 - \eta) N_{ab}, \quad \sum_{k \in s_b} w_k^{\circ} = N_b,
\sum_{k \in s_A} w_k^{\circ} p_{ki}^A = \sum_{k \in U_a} p_{ki}^A + \eta \sum_{k \in U_{ab}} p_{ki}^A
and
\sum_{k \in s_B} w_k^{\circ} p_{ki}^B = \sum_{k \in U_b} p_{ki}^B + (1 - \eta) \sum_{k \in U_{ba}} p_{ki}^B
with \eta \in (0,1) and
p_{ki}^A = \frac{\exp(x_k^{'}\beta_i^A)}{\sum_{r=1}^m \exp(x_k^{'}\beta_r^A)},
where \beta_i^A are the maximum likelihood parameter estimates of the multinomial logistic model fitted with the original design weights d^A. p_{ki}^B is defined analogously.
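The pieces of the estimator can be illustrated with a small base R sketch. It is a deliberately simplified single-frame analogue, not the dual frame procedure described above and not the package's implementation: there is one frame, no \eta, the coefficient matrix is assumed to be already fitted, and all numbers are made up. The helper names `softmax_probs` and `linear_calibrate` are hypothetical. The sketch computes the probabilities p_{ki}, calibrates the design weights with the linear distance so that they reproduce the population size and the population totals of the probabilities, and then forms the weighted proportions.

```r
# Multinomial logistic probabilities p_ki for an n x q design matrix X and a
# q x m coefficient matrix beta (one column per category).
softmax_probs <- function(X, beta) {
  eta <- X %*% beta
  e <- exp(eta - apply(eta, 1, max))   # subtract row maxima for stability
  e / rowSums(e)
}

# Linear-distance calibration: w_k = d_k * (1 + a_k' lambda), with lambda
# chosen so that the weighted sample totals of the columns of A equal `totals`.
linear_calibrate <- function(d, A, totals) {
  lambda <- solve(crossprod(A * d, A), totals - colSums(A * d))
  as.numeric(d * (1 + A %*% lambda))
}

# Made-up population of N = 6 units with one auxiliary variable.
x_pop <- c(0.2, 0.5, 1.0, 1.5, 2.0, 2.6)
N     <- length(x_pop)
beta  <- cbind(c(0, 0), c(0.2, -0.1), c(-0.3, 0.4))  # assumed already fitted
p_pop <- softmax_probs(cbind(1, x_pop), beta)

# Sample of 4 units with design weights d and observed categories.
s     <- c(2, 3, 5, 6)
d     <- rep(1.5, 4)
categ <- c(1, 2, 2, 3)

# Calibrate on the unit count and on the probabilities of categories 2 and 3;
# category 1 is dropped because the probabilities sum to one for every unit.
A      <- cbind(1, p_pop[s, -1])
totals <- c(N, colSums(p_pop[, -1]))
w      <- linear_calibrate(d, A, totals)

# Estimated proportions: calibrated weighted category counts divided by N.
z     <- diag(3)[categ, ]
P_hat <- colSums(w * z) / N
```

Because the unit count is one of the calibration constraints, the estimated proportions in `P_hat` sum to one. In the dual frame setting the single count constraint is replaced by the domain-level constraints given above.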
MLCDF returns an object of class "MultEstimatorDF", which is a list with at least the following components:
Call |
the matched call. |
Est |
estimates of class frequencies and proportions for the main variable(s). |
Molina, D., Rueda, M., Arcos, A. and Ranalli, M. G. (2015). Multinomial logistic estimation in dual frame surveys. Statistics and Operations Research Transactions (SORT). To be printed.
library(Frames2)  # MLCDF and the example data sets are provided by the Frames2 package
data(DatMA)
data(DatMB)
data(DatPopM)
N <- nrow(DatPopM)
levels(DatPopM$Domain) <- c(levels(DatPopM$Domain), "ba")
DatPopMA <- subset(DatPopM, Domain %in% c("a", "ab"))
DatPopMB <- subset(DatPopM, Domain %in% c("b", "ab"))
DatPopMB$Domain[DatPopMB$Domain == "ab"] <- "ba"
# Estimate the proportions of the categories of variable Prog with the MLCDF estimator,
# using Read as the auxiliary variable
MLCDF(DatMA$Prog, DatMB$Prog, DatMA$ProbA, DatMB$ProbB, DatMA$Domain, DatMB$Domain,
DatMA$Read, DatMB$Read, DatPopMA$Read, DatPopMB$Read, DatMA$Id_Frame, DatMB$Id_Frame,
DatPopMA$Domain, DatPopMB$Domain, N)
# Obtain 95% confidence intervals together with the estimates
MLCDF(DatMA$Prog, DatMB$Prog, DatMA$ProbA, DatMB$ProbB, DatMA$Domain, DatMB$Domain,
DatMA$Read, DatMB$Read, DatPopMA$Read, DatPopMB$Read, DatMA$Id_Frame, DatMB$Id_Frame,
DatPopMA$Domain, DatPopMB$Domain, N, conf_level = 0.95)
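The example extends the levels of Domain before relabelling the overlap units of frame B as "ba". The sketch below, on a made-up factor, shows why that step is needed: assigning a label that is not among the levels of a factor produces NA. It assumes only that Domain is stored as a factor, which the use of levels() in the example suggests.

```r
# Made-up domain labels for a small population.
dom <- factor(c("a", "ab", "b", "ab"))

# Register the new label first, then subset frame B and relabel the overlap.
levels(dom) <- c(levels(dom), "ba")
dom_B <- dom[dom %in% c("b", "ab")]
dom_B[dom_B == "ab"] <- "ba"
as.character(dom_B)   # "ba" "b" "ba"

# Without the extra level, the same assignment yields NA (with a warning).
bad <- factor(c("ab", "b"))
bad <- suppressWarnings(replace(bad, bad == "ab", "ba"))
is.na(bad)            # TRUE FALSE
```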