HH: The Hansen-Hurwitz Estimator



Description

Computes the Hansen-Hurwitz estimator of the population total for several variables of interest.

Usage

HH(y, pk)

Arguments

y

Vector, matrix or data frame containing the collected values of the variables of interest for every unit in the selected sample

pk

A vector containing selection probabilities for each unit in the selected sample

Details

The Hansen-Hurwitz estimator is given by

\frac{1}{m}\sum_{i=1}^{m}\frac{y_i}{p_i}

where y_i is the value of the variable of interest for the ith unit in the sample, p_i is its selection probability, and m is the number of draws. The estimator applies only to with-replacement sampling designs.
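
The reason the estimator suits with-replacement designs can be seen from a short expectation argument. The sketch below assumes a population U with total t_y, draw probabilities p_k > 0 that sum to one, and m independent draws; the symbols \hat{t}_{y}, t_y and U are notation introduced here for illustration and do not appear elsewhere on this page.

```latex
% Each draw i selects unit k with probability p_k, so a single draw
% already has the population total as its expectation:
E\left(\frac{y_i}{p_i}\right) = \sum_{k \in U} p_k \, \frac{y_k}{p_k} = \sum_{k \in U} y_k = t_y

% Averaging over the m draws keeps that expectation:
E\left(\hat{t}_{y}\right)
  = \frac{1}{m}\sum_{i=1}^{m} E\left(\frac{y_i}{p_i}\right)
  = t_y
```

The argument relies on every draw using the same probabilities p_k, which is what sampling with replacement provides.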

Value

For each variable of interest, the function returns the estimated population total (Estimation), its estimated standard error (Standard Error) and its estimated coefficient of variation in percent (CVE).
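
To make the three returned quantities concrete, here is a minimal base-R sketch that computes them by hand. It is not the package source: the helper name hh_by_hand is hypothetical, and the variance estimator used (the sample variance of the ratios y_i/p_i divided by m) is an assumption that happens to reproduce the figures shown in the example output for the sample Yves, Ken.

```r
# Hypothetical helper, not part of TeachingSampling: a by-hand version of
# the quantities reported by HH(), under the assumptions stated above.
hh_by_hand <- function(y, pk) {
  y <- as.matrix(y)                 # one column per variable of interest
  m <- nrow(y)                      # number of draws
  stopifnot(length(pk) == m, m > 1, all(pk > 0))
  z <- y / pk                       # pk recycles down columns: row i is divided by pk[i]
  total <- colMeans(z)              # (1/m) * sum(y_i / p_i)
  # Assumed variance estimator: sum((z_i - total)^2) / (m * (m - 1))
  se <- sqrt(colSums(sweep(z, 2, total)^2) / (m * (m - 1)))
  rbind(Estimation = total, "Standard Error" = se, CVE = 100 * se / total)
}

# Sample (Yves, Ken) from Example 1: values and their selection probabilities
y_sample <- cbind(y1 = c(32, 34), y2 = c(1, 1))
pk_sample <- c(0.35, 0.225)
hh_by_hand(y_sample, pk_sample)
```

For this sample the sketch gives about 121.27, 29.84 and 24.61 for y1, the same figures as the example output. With m = 1 the standard error is undefined, which is why the helper requires at least two draws.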

Author(s)

Hugo Andres Gutierrez Rojas hagutierrezro@gmail.com

References

Sarndal, C-E. and Swensson, B. and Wretman, J. (1992), Model Assisted Survey Sampling. Springer.
Gutierrez, H. A. (2009), Estrategias de muestreo: Diseno de encuestas y estimacion de parametros. Editorial Universidad Santo Tomas.

See Also

HT

Examples

############
## Example 1
############
# Vector U contains the labels of a population of size N=5
U <- c("Yves", "Ken", "Erik", "Sharon", "Leslie")
# Vectors y1 and y2 give the values of the variables of interest
y1<-c(32, 34, 46, 89, 35)
y2<-c(1,1,1,0,0)
y3<-cbind(y1,y2)
# The population size is N=5
N <- length(U)
# The sample size is m=2
m <- 2
# pk is the probability of selection of every single unit
pk <- c(0.35, 0.225, 0.175, 0.125, 0.125)
# Selection of a random sample of size m with replacement
sam <- sample(N, m, replace=TRUE, prob=pk)
# The selected sample is
U[sam]
# The values of the variables of interest for the units in the sample
y1[sam]
y2[sam]
y3[sam,]
# The Hansen-Hurwitz estimator
HH(y1[sam],pk[sam])
HH(y2[sam],pk[sam])
HH(y3[sam,],pk[sam])


############
## Example 2
############
# Uses the Lucy data to draw a simple random sample with replacement
data(Lucy)
attach(Lucy)

N <- dim(Lucy)[1]
m <- 400
sam <- sample(N,m,replace=TRUE)
# The vector of selection probabilities of units in the sample
pk <- rep(1/N,m)
# The information about the units in the sample is stored in an object called data
data <- Lucy[sam,]
attach(data)
names(data)
# The variables of interest are: Income, Employees and Taxes
# This information is stored in a data frame called estima
estima <- data.frame(Income, Employees, Taxes)
HH(estima, pk)

################################################################
## Example 3 HH is unbiased for with replacement sampling designs
################################################################

# Vector U contains the label of a population of size N=5
U <- c("Yves", "Ken", "Erik", "Sharon", "Leslie")
# Vector y gives the values of the variable of interest
y<-c(32, 34, 46, 89, 35)
# The population size is N=5
N <- length(U)
# The sample size is m=2
m <- 2
# pk is the probability of selection of every single unit
pk <- c(0.35, 0.225, 0.175, 0.125, 0.125)
# p is the probability of selection of every possible sample
p <- p.WR(N,m,pk)
p
sum(p)
# The sample membership matrix for fixed size with replacement sampling designs
Ind <- nk(N,m)
Ind
# The support with the values of the elements
Qy <- SupportWR(N,m, ID=y)                 
Qy
# The support with the selection probabilities of the elements
Qp <- SupportWR(N,m, ID=pk)
Qp
# The HH estimates for every single sample in the support
HH1 <- HH(Qy[1,], Qp[1,])[1,]
HH2 <- HH(Qy[2,], Qp[2,])[1,]
HH3 <- HH(Qy[3,], Qp[3,])[1,] 
HH4 <- HH(Qy[4,], Qp[4,])[1,] 
HH5 <- HH(Qy[5,], Qp[5,])[1,] 
HH6 <- HH(Qy[6,], Qp[6,])[1,] 
HH7 <- HH(Qy[7,], Qp[7,])[1,]
HH8 <- HH(Qy[8,], Qp[8,])[1,]
HH9 <- HH(Qy[9,], Qp[9,])[1,]
HH10 <- HH(Qy[10,], Qp[10,])[1,]
HH11 <- HH(Qy[11,], Qp[11,])[1,]
HH12 <- HH(Qy[12,], Qp[12,])[1,]
HH13 <- HH(Qy[13,], Qp[13,])[1,]
HH14 <- HH(Qy[14,], Qp[14,])[1,]
HH15 <- HH(Qy[15,], Qp[15,])[1,]
# The HH estimates arranged in a vector
Est <- c(HH1, HH2, HH3, HH4, HH5, HH6, HH7, HH8, HH9, HH10, HH11, HH12, HH13,
HH14, HH15)
Est
# The HH estimator is actually design-unbiased
data.frame(Ind, Est, p)
sum(Est*p)
sum(y)

Example output

[1] "Yves" "Ken" 
[1] 32 34
[1] 1 1
     y1 y2
[1,] 32  1
[2,] 34  1
                       y
Estimation     121.26984
Standard Error  29.84127
CVE             24.60733
                        y
Estimation      3.6507937
Standard Error  0.7936508
CVE            21.7391304
                      y1         y2
Estimation     121.26984  3.6507937
Standard Error  29.84127  0.7936508
CVE             24.60733 21.7391304
The following objects are masked from Lucy:

    Employees, ID, Income, Level, SPAM, Taxes, Ubication, Zone

[1] "ID"        "Ubication" "Level"     "Zone"      "Income"    "Employees"
[7] "Taxes"     "SPAM"     
                     Income    Employees        Taxes
Estimation     1.000504e+06 1.509720e+05 26403.920000
Standard Error 3.029855e+04 3.920522e+03  1870.872910
CVE            3.028329e+00 2.596855e+00     7.085588
 [1] 0.122500 0.157500 0.122500 0.087500 0.087500 0.050625 0.078750 0.056250
 [9] 0.056250 0.030625 0.043750 0.043750 0.015625 0.031250 0.015625
[1] 1
      [,1] [,2] [,3] [,4] [,5]
 [1,]    2    0    0    0    0
 [2,]    1    1    0    0    0
 [3,]    1    0    1    0    0
 [4,]    1    0    0    1    0
 [5,]    1    0    0    0    1
 [6,]    0    2    0    0    0
 [7,]    0    1    1    0    0
 [8,]    0    1    0    1    0
 [9,]    0    1    0    0    1
[10,]    0    0    2    0    0
[11,]    0    0    1    1    0
[12,]    0    0    1    0    1
[13,]    0    0    0    2    0
[14,]    0    0    0    1    1
[15,]    0    0    0    0    2
Warning message:
In if (ID == FALSE) { :
  the condition has length > 1 and only the first element will be used
      [,1] [,2]
 [1,]   32   32
 [2,]   32   34
 [3,]   32   46
 [4,]   32   89
 [5,]   32   35
 [6,]   34   34
 [7,]   34   46
 [8,]   34   89
 [9,]   34   35
[10,]   46   46
[11,]   46   89
[12,]   46   35
[13,]   89   89
[14,]   89   35
[15,]   35   35
Warning message:
In if (ID == FALSE) { :
  the condition has length > 1 and only the first element will be used
       [,1]  [,2]
 [1,] 0.350 0.350
 [2,] 0.350 0.225
 [3,] 0.350 0.175
 [4,] 0.350 0.125
 [5,] 0.350 0.125
 [6,] 0.225 0.225
 [7,] 0.225 0.175
 [8,] 0.225 0.125
 [9,] 0.225 0.125
[10,] 0.175 0.175
[11,] 0.175 0.125
[12,] 0.175 0.125
[13,] 0.125 0.125
[14,] 0.125 0.125
[15,] 0.125 0.125
 [1]  91.42857 121.26984 177.14286 401.71429 185.71429 151.11111 206.98413
 [8] 431.55556 215.55556 262.85714 487.42857 271.42857 712.00000 496.00000
[15] 280.00000
   X1 X2 X3 X4 X5       Est        p
1   2  0  0  0  0  91.42857 0.122500
2   1  1  0  0  0 121.26984 0.157500
3   1  0  1  0  0 177.14286 0.122500
4   1  0  0  1  0 401.71429 0.087500
5   1  0  0  0  1 185.71429 0.087500
6   0  2  0  0  0 151.11111 0.050625
7   0  1  1  0  0 206.98413 0.078750
8   0  1  0  1  0 431.55556 0.056250
9   0  1  0  0  1 215.55556 0.056250
10  0  0  2  0  0 262.85714 0.030625
11  0  0  1  1  0 487.42857 0.043750
12  0  0  1  0  1 271.42857 0.043750
13  0  0  0  2  0 712.00000 0.015625
14  0  0  0  1  1 496.00000 0.031250
15  0  0  0  0  2 280.00000 0.015625
[1] 236
[1] 236
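
The design-unbiasedness check of Example 3 can also be sketched in base R, without the package helpers. This version is an illustration under one stated difference: it enumerates the 25 ordered samples of size 2 rather than the 15 unordered ones, so the probability of each sample is simply the product of its two draw probabilities. The object names draws, est and prob are introduced here for illustration.

```r
# Population values and draw probabilities from Example 3
y  <- c(32, 34, 46, 89, 35)
pk <- c(0.35, 0.225, 0.175, 0.125, 0.125)

# Every ordered sample of m = 2 draws with replacement (5 * 5 = 25 rows)
draws <- expand.grid(first = seq_along(y), second = seq_along(y))

# Hansen-Hurwitz estimate for each ordered sample: mean of y_i / p_i
est <- (y[draws$first] / pk[draws$first] +
        y[draws$second] / pk[draws$second]) / 2

# Independent draws: the sample probability is the product of the two
prob <- pk[draws$first] * pk[draws$second]

sum(prob)        # the sample probabilities add up to 1
sum(est * prob)  # expected value of the estimator
sum(y)           # true population total, 236
```

The last two values agree up to floating-point rounding, which is the same conclusion Example 3 reaches with p.WR, nk and SupportWR.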

TeachingSampling documentation built on April 22, 2020, 1:05 a.m.