# HH: The Hansen-Hurwitz Estimator In TeachingSampling: Selection of Samples and Parameter Estimation in Finite Population

## Description

Computes the Hansen-Hurwitz estimator of the population total for several variables of interest.

## Usage

    HH(y, pk)

## Arguments

y   Vector, matrix or data frame containing the collected information on the variables of interest for every unit in the selected sample.

pk  A vector containing the selection probabilities of the units in the selected sample.

## Details

The Hansen-Hurwitz estimator is given by

$$\frac{1}{m}\sum_{i=1}^{m}\frac{y_i}{p_i}$$

where m is the number of draws in the sample, y_i is the value of the variable of interest for the ith sampled unit, and p_i is its corresponding selection probability. This estimator is restricted to with-replacement sampling designs.
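As a rough illustration, the base R sketch below computes the Hansen-Hurwitz estimate by hand as the mean of the expanded values y_i/p_i, for a sample of two units with values 32 and 34 and selection probabilities 0.35 and 0.225. The helper name `hh_by_hand` is hypothetical and not part of TeachingSampling. The standard error uses the usual with-replacement variance estimator; that this is what `HH` computes internally is an assumption, although the sketch reproduces the figures shown in the example output for this sample.

```r
# Hypothetical helper (not part of TeachingSampling): Hansen-Hurwitz by hand
hh_by_hand <- function(y, pk) {
  m <- length(y)                 # number of draws
  z <- y / pk                    # expanded values y_i / p_i
  total <- mean(z)               # (1/m) * sum(y_i / p_i)
  # Usual with-replacement variance estimator (assumed, see lead-in)
  se <- sqrt(sum((z - total)^2) / (m * (m - 1)))
  c(Estimation = total, "Standard Error" = se, CVE = 100 * se / total)
}

res <- hh_by_hand(y = c(32, 34), pk = c(0.35, 0.225))
print(res)
# Estimation is about 121.26984, Standard Error about 29.84127, CVE about 24.60733
```

The coefficient of variation is expressed as a percentage of the estimate, which matches the scale of the CVE row in the example output.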

## Value

For each variable of interest, the function returns the estimated population total, its estimated standard error and its estimated coefficient of variation.
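The sketch below shows one way to pull individual figures out of such a result. It assumes the result behaves like a matrix with the row names `Estimation`, `Standard Error` and `CVE` and one column per variable, as the example output suggests. To stay self-contained, the matrix is rebuilt here by hand from the values printed in the example output rather than obtained from a call to `HH`.

```r
# Stand-in for an HH() result, copied from the example output for y1 and y2
res <- matrix(
  c(121.26984, 29.84127, 24.60733,
    3.6507937, 0.7936508, 21.7391304),
  nrow = 3,
  dimnames = list(c("Estimation", "Standard Error", "CVE"), c("y1", "y2"))
)

# Estimated totals for all variables (a named vector)
totals <- res["Estimation", ]

# Coefficient of variation (in percent) for a single variable
cve_y2 <- res["CVE", "y2"]

print(totals)
print(cve_y2)
```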

## Author(s)

Hugo Andres Gutierrez Rojas hagutierrezro@gmail.com

## References

Sarndal, C-E. and Swensson, B. and Wretman, J. (1992), Model Assisted Survey Sampling. Springer.
Gutierrez, H. A. (2009), Estrategias de muestreo: Diseno de encuestas y estimacion de parametros. Editorial Universidad Santo Tomas.

## See Also

HT

## Examples

    library(TeachingSampling)

    ############
    ## Example 1
    ############
    # Vector U contains the label of a population of size N=5
    U <- c("Yves", "Ken", "Erik", "Sharon", "Leslie")
    # Vectors y1 and y2 give the values of the variables of interest
    y1 <- c(32, 34, 46, 89, 35)
    y2 <- c(1, 1, 1, 0, 0)
    y3 <- cbind(y1, y2)
    # The population size is N=5
    N <- length(U)
    # The sample size is m=2
    m <- 2
    # pk is the probability of selection of every single unit
    pk <- c(0.35, 0.225, 0.175, 0.125, 0.125)
    # Selection of a random sample with replacement
    sam <- sample(N, m, replace=TRUE, prob=pk)
    # The selected sample is
    U[sam]
    # The values of the variables of interest for the units in the sample
    y1[sam]
    y2[sam]
    y3[sam,]
    # The Hansen-Hurwitz estimator
    HH(y1[sam], pk[sam])
    HH(y2[sam], pk[sam])
    HH(y3[sam,], pk[sam])

    ############
    ## Example 2
    ############
    # Uses the Lucy data to draw a simple random sample with replacement
    data(Lucy)
    attach(Lucy)

    # N is the number of rows (units) in Lucy
    N <- dim(Lucy)[1]
    m <- 400
    sam <- sample(N, m, replace=TRUE)
    # The vector of selection probabilities of units in the sample
    pk <- rep(1/N, m)
    # The information about the units in the sample is stored in an object called data
    data <- Lucy[sam,]
    attach(data)
    names(data)
    # The variables of interest are: Income, Employees and Taxes
    # This information is stored in a data frame called estima
    estima <- data.frame(Income, Employees, Taxes)
    HH(estima, pk)

    ################################################################
    ## Example 3 HH is unbiased for with replacement sampling designs
    ################################################################

    # Vector U contains the label of a population of size N=5
    U <- c("Yves", "Ken", "Erik", "Sharon", "Leslie")
    # Vector y gives the values of the variable of interest
    y <- c(32, 34, 46, 89, 35)
    # The population size is N=5
    N <- length(U)
    # The sample size is m=2
    m <- 2
    # pk is the probability of selection of every single unit
    pk <- c(0.35, 0.225, 0.175, 0.125, 0.125)
    # p is the probability of selection of every possible sample
    p <- p.WR(N, m, pk)
    p
    sum(p)
    # The sample membership matrix for fixed size with replacement sampling designs
    Ind <- nk(N, m)
    Ind
    # The support with the values of the elements
    Qy <- SupportWR(N, m, ID=y)
    Qy
    # The support with the selection probabilities of the elements
    Qp <- SupportWR(N, m, ID=pk)
    Qp
    # The HH estimates for every single sample in the support
    HH1 <- HH(Qy[1,], Qp[1,])[1,]
    HH2 <- HH(Qy[2,], Qp[2,])[1,]
    HH3 <- HH(Qy[3,], Qp[3,])[1,]
    HH4 <- HH(Qy[4,], Qp[4,])[1,]
    HH5 <- HH(Qy[5,], Qp[5,])[1,]
    HH6 <- HH(Qy[6,], Qp[6,])[1,]
    HH7 <- HH(Qy[7,], Qp[7,])[1,]
    HH8 <- HH(Qy[8,], Qp[8,])[1,]
    HH9 <- HH(Qy[9,], Qp[9,])[1,]
    HH10 <- HH(Qy[10,], Qp[10,])[1,]
    HH11 <- HH(Qy[11,], Qp[11,])[1,]
    HH12 <- HH(Qy[12,], Qp[12,])[1,]
    HH13 <- HH(Qy[13,], Qp[13,])[1,]
    HH14 <- HH(Qy[14,], Qp[14,])[1,]
    HH15 <- HH(Qy[15,], Qp[15,])[1,]
    # The HH estimates arranged in a vector
    Est <- c(HH1, HH2, HH3, HH4, HH5, HH6, HH7, HH8, HH9, HH10, HH11, HH12, HH13,
    HH14, HH15)
    Est
    # The HH estimator is actually design-unbiased
    data.frame(Ind, Est, p)
    sum(Est*p)
    sum(y)

### Example output

    [1] "Yves" "Ken"
    [1] 32 34
    [1] 1 1
         y1 y2
    [1,] 32  1
    [2,] 34  1
                           y
    Estimation     121.26984
    Standard Error  29.84127
    CVE             24.60733
                            y
    Estimation      3.6507937
    Standard Error  0.7936508
    CVE            21.7391304
                          y1         y2
    Estimation     121.26984  3.6507937
    Standard Error  29.84127  0.7936508
    CVE             24.60733 21.7391304
    The following objects are masked from Lucy:

        Employees, ID, Income, Level, SPAM, Taxes, Ubication, Zone

    [1] "ID"        "Ubication" "Level"     "Zone"      "Income"    "Employees"
    [7] "Taxes"     "SPAM"
                         Income    Employees        Taxes
    Estimation     1.000504e+06 1.509720e+05 26403.920000
    Standard Error 3.029855e+04 3.920522e+03  1870.872910
    CVE            3.028329e+00 2.596855e+00     7.085588
     [1] 0.122500 0.157500 0.122500 0.087500 0.087500 0.050625 0.078750 0.056250
     [9] 0.056250 0.030625 0.043750 0.043750 0.015625 0.031250 0.015625
    [1] 1
          [,1] [,2] [,3] [,4] [,5]
     [1,]    2    0    0    0    0
     [2,]    1    1    0    0    0
     [3,]    1    0    1    0    0
     [4,]    1    0    0    1    0
     [5,]    1    0    0    0    1
     [6,]    0    2    0    0    0
     [7,]    0    1    1    0    0
     [8,]    0    1    0    1    0
     [9,]    0    1    0    0    1
    [10,]    0    0    2    0    0
    [11,]    0    0    1    1    0
    [12,]    0    0    1    0    1
    [13,]    0    0    0    2    0
    [14,]    0    0    0    1    1
    [15,]    0    0    0    0    2
    Warning message:
    In if (ID == FALSE) { :
      the condition has length > 1 and only the first element will be used
          [,1] [,2]
     [1,]   32   32
     [2,]   32   34
     [3,]   32   46
     [4,]   32   89
     [5,]   32   35
     [6,]   34   34
     [7,]   34   46
     [8,]   34   89
     [9,]   34   35
    [10,]   46   46
    [11,]   46   89
    [12,]   46   35
    [13,]   89   89
    [14,]   89   35
    [15,]   35   35
    Warning message:
    In if (ID == FALSE) { :
      the condition has length > 1 and only the first element will be used
           [,1]  [,2]
     [1,] 0.350 0.350
     [2,] 0.350 0.225
     [3,] 0.350 0.175
     [4,] 0.350 0.125
     [5,] 0.350 0.125
     [6,] 0.225 0.225
     [7,] 0.225 0.175
     [8,] 0.225 0.125
     [9,] 0.225 0.125
    [10,] 0.175 0.175
    [11,] 0.175 0.125
    [12,] 0.175 0.125
    [13,] 0.125 0.125
    [14,] 0.125 0.125
    [15,] 0.125 0.125
     [1]  91.42857 121.26984 177.14286 401.71429 185.71429 151.11111 206.98413
     [8] 431.55556 215.55556 262.85714 487.42857 271.42857 712.00000 496.00000
    [15] 280.00000
       X1 X2 X3 X4 X5       Est        p
    1   2  0  0  0  0  91.42857 0.122500
    2   1  1  0  0  0 121.26984 0.157500
    3   1  0  1  0  0 177.14286 0.122500
    4   1  0  0  1  0 401.71429 0.087500
    5   1  0  0  0  1 185.71429 0.087500
    6   0  2  0  0  0 151.11111 0.050625
    7   0  1  1  0  0 206.98413 0.078750
    8   0  1  0  1  0 431.55556 0.056250
    9   0  1  0  0  1 215.55556 0.056250
    10  0  0  2  0  0 262.85714 0.030625
    11  0  0  1  1  0 487.42857 0.043750
    12  0  0  1  0  1 271.42857 0.043750
    13  0  0  0  2  0 712.00000 0.015625
    14  0  0  0  1  1 496.00000 0.031250
    15  0  0  0  0  2 280.00000 0.015625
    [1] 236
    [1] 236
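
The last two values agree because the expectation of the estimator over the support equals the population total. The base R sketch below is an alternative check of the same property. It does not use `p.WR`, `nk` or `SupportWR`; instead it enumerates all ordered samples of size 2 with `expand.grid`. The variable names `draws`, `prob`, `est` and `expected` are illustrative only.

```r
# Population values and selection probabilities from Example 3
y  <- c(32, 34, 46, 89, 35)
pk <- c(0.35, 0.225, 0.175, 0.125, 0.125)

# All ordered samples of size m = 2 drawn with replacement (25 rows)
draws <- expand.grid(first = seq_along(y), second = seq_along(y))

# Probability of each ordered sample: product of the two draw probabilities
prob <- pk[draws$first] * pk[draws$second]

# Hansen-Hurwitz estimate for each ordered sample: mean of y_i / p_i
est <- (y[draws$first] / pk[draws$first] +
        y[draws$second] / pk[draws$second]) / 2

# Expected value of the estimator over all ordered samples
expected <- sum(est * prob)

print(sum(prob))   # the sample probabilities add up to 1
print(expected)    # equals sum(y) = 236 up to floating-point error
```

Working with ordered samples gives 25 equally simple terms instead of the 15 unordered samples in the support, at the cost of counting each mixed pair twice with half the probability.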


TeachingSampling documentation built on April 22, 2020, 1:05 a.m.