
This function returns the estimated design effects for a set of inclusion probabilities and the variables of interest.

```
DEFF(y, pik)
```

| Argument | Description |
|---|---|
| `y` | Vector, matrix or data frame containing the information collected on the variables of interest for every unit in the selected sample. |
| `pik` | Vector of inclusion probabilities for each unit in the selected sample. |

The design effect is defined as the ratio of the variance of an estimator under a complex design to its variance under a simple random sampling design. When the design is stratified and the allocation is proportional, this measure reduces to

*DEFF_{Kish} = 1 + CV^2(w)*

where w is the set of sampling weights (the inverses of the inclusion probabilities) in the sample, and CV is the classical coefficient of variation. Although this measure is motivated by a stratified sampling design, it is commonly applied to any survey in which the sampling weights are unequal. Spencer's DEFF, on the other hand, is motivated by the idea that a set of weights may be efficient even when they vary, and is defined by:

*DEFF_{Spencer} = (1 - R^2) DEFF_{Kish} + \frac{\hat{a}^2}{\hat{σ}^2_y} (DEFF_{Kish} - 1)*

where

*\hat{σ}^2_y = \frac{∑_s w_k (y_k - \bar{y}_w)^2}{∑_s w_k}*

and *\hat{a}* is the estimate of the intercept in the following model

*y_k = a + b p_k + e_k*

where *p_k = π_k / n* is a standardized inclusion probability. Finally, *R^2* is the coefficient of determination of this model.
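The two measures can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's implementation: the helper names `kish_deff` and `spencer_deff` are hypothetical, Kish's measure is taken in its usual form 1 + CV²(w) with the variance of the weights computed with divisor n, and the model for *y_k* is fitted by ordinary least squares (a weighted fit would be another reasonable choice). The input values are made up.

```r
# Hypothetical helpers for illustration; they are not part of samplesize4surveys.

# Kish's design effect, assuming the form 1 + CV^2(w) with divisor n,
# which is algebraically n * sum(w^2) / sum(w)^2.
kish_deff <- function(pik) {
  w <- 1 / pik                      # sampling weights
  n <- length(w)
  n * sum(w^2) / sum(w)^2
}

# Spencer's design effect for a single variable y.
spencer_deff <- function(y, pik) {
  n  <- length(pik)
  w  <- 1 / pik
  pk <- pik / n                     # standardized inclusion probabilities
  fit   <- lm(y ~ pk)               # assumption: unweighted least squares
  a_hat <- unname(coef(fit)[1])     # estimated intercept
  r2    <- summary(fit)$r.squared
  ybar_w   <- sum(w * y) / sum(w)   # weighted mean of y
  sigma2_y <- sum(w * (y - ybar_w)^2) / sum(w)
  dk <- kish_deff(pik)
  (1 - r2) * dk + (a_hat^2 / sigma2_y) * (dk - 1)
}

pik <- c(0.1, 0.2, 0.25, 0.5, 0.5)  # made-up inclusion probabilities
y   <- c(3, 7, 4, 10, 12)           # made-up variable of interest

kish_deff(pik)                      # 745/529, about 1.408
spencer_deff(y, pik)
```

With equal inclusion probabilities the weights do not vary, so `kish_deff()` returns 1 and the second term of Spencer's formula vanishes.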

Hugo Andres Gutierrez Rojas <hugogutierrez at usantotomas.edu.co>

Gutierrez, H. A. (2009), *Estrategias de muestreo: Diseno de encuestas y estimacion de parametros*. Editorial Universidad Santo Tomas.
Valliant, R., et al. (2013), *Practical Tools for Designing and Weighting Survey Samples*. Springer.

```
#############################
# Example with BigLucy data #
#############################
# Loading samplesize4surveys also loads TeachingSampling, which provides BigLucy
library(samplesize4surveys)
data(BigLucy)
attach(BigLucy)
# The sample size
n <- 400
# Draw a piPS sample using Income as the size measure
res <- S.piPS(n, Income)
# The first column of res holds the indices of the selected units
sam <- res[,1]
# The information about the units in the sample is stored in an object called data
data <- BigLucy[sam,]
attach(data)
names(data)
# pik is the inclusion probability of every single unit in the selected sample
pik <- res[,2]
# The variables of interest are: Income, Employees and Taxes
# This information is stored in a data frame called estima
estima <- data.frame(Income, Employees, Taxes)
E.piPS(estima,pik)
DEFF(estima,pik)
```

```
Loading required package: TeachingSampling
Loading required package: timeDate
The following objects are masked from BigLucy:

    Employees, ID, ISO, Income, Level, SPAM, Segments, Taxes,
    Ubication, Years, Zone

 [1] "ID"        "Ubication" "Level"     "Zone"      "Income"    "Employees"
 [7] "Taxes"     "SPAM"      "ISO"       "Years"     "Segments"
                          N       Income    Employees        Taxes
Estimation     82237.403369 3.663473e+07 5.200414e+06 1.021006e+06
Standard Error  2592.313989 1.380486e-10 1.435342e+05 3.108280e+04
CVE                3.152232 3.768244e-16 2.760052e+00 3.044331e+00
DEFF                    Inf 1.009417e-32 8.385607e-01 5.898903e-02
          DEFF.Kish DEFF.Spencer
N          1.399163 1.102916e+00
Income     1.399163 6.106359e-31
Employees  1.399163 6.219759e-01
Taxes      1.399163 9.014557e-01
Warning message:
In summary.lm(model.pk) :
  essentially perfect fit: summary may be unreliable
```
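The near-zero `DEFF.Spencer` for `Income` and the "essentially perfect fit" warning are what one would expect when a variable of interest is also the size measure of the πPS design, because its values are then exactly proportional to the inclusion probabilities. The following sketch, with made-up numbers rather than BigLucy data and an ordinary least squares fit assumed, illustrates why both terms of Spencer's formula vanish in that case.

```r
# Made-up inclusion probabilities for a sample of four units (not BigLucy).
pik <- c(0.05, 0.10, 0.20, 0.40)
n   <- length(pik)
pk  <- pik / n                    # standardized inclusion probabilities

# If y is the size measure of a piPS design, y is proportional to pik.
y <- 1000 * pik

# summary.lm() warns about an essentially perfect fit here; the warning is
# suppressed only to keep the illustration quiet.
fit   <- suppressWarnings(summary(lm(y ~ pk)))
r2    <- fit$r.squared            # numerically 1, so (1 - R^2) is about 0
a_hat <- fit$coefficients[1, 1]   # intercept estimate, numerically 0

c(R2 = r2, intercept = a_hat)
```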

samplesize4surveys documentation built on July 24, 2018, 9:04 a.m.
