# CochranHorganData: Populations Analyzed in Gunning and Horgan (2004) and Cochran...

In stratification: Univariate Stratification of Survey Populations

## Description

The first population, `Debtors`, is an accounting population of debtors in an Irish firm, detailed in Horgan (2003). The other populations are three of the skewed populations in Cochran (1961):

- `UScities`: the populations, in thousands, of US cities in 1940;
- `UScolleges`: the number of students in four-year US colleges in 1952-1953;
- `USbanks`: the resources, in millions of dollars, of large commercial US banks.

## Usage

```
Debtors
UScities
UScolleges
USbanks
```

## Format

The formats of these data sets are, respectively:

- `Debtors`: num [1:3369] 40 40 40 40 40 40 40 40 40 40 ...
- `UScities`: num [1:1038] 10 10 10 10 10 10 10 10 10 10 ...
- `UScolleges`: num [1:677] 200 201 202 202 207 210 211 213 215 217 ...
- `USbanks`: num [1:357] 70 71 72 72 72 73 73 73 73 73 ...

Jane M. Horgan

## References

Cochran, W.G. (1961). Comparison of methods for determining stratum boundaries. Bulletin of the International Statistical Institute, 32(2), 345-358.

Gunning, P. and Horgan, J.M. (2004). A new algorithm for the construction of stratum boundaries in skewed populations. Survey Methodology, 30(2), 159-166.

Horgan, J.M. (2003). A list sequential sampling scheme with applications in financial auditing. IMA Journal of Management Mathematics, 14, 1-18.

## Examples

```
library(stratification)

### Reproduction of the results in Table 4 and Table 7 part 3 (case L=5) of
### Gunning and Horgan (2004). The differences in the nh come from different
### rounding. The more important differences observed for the cumulative
### root frequency method are due to the use of different numbers of classes.
strata.geo(x=Debtors, n=100, Ls=5, alloc=c(0.5,0,0.5))
strata.cumrootf(x=Debtors, n=100, Ls=5, alloc=c(0.5,0,0.5), nclass=40)
strata.LH(x=Debtors, CV=0.0360, Ls=5, alloc=c(0.5,0,0.5), takeall=1, algo="Sethi")

strata.geo(x=UScities, n=100, Ls=5, alloc=c(0.5,0,0.5))
strata.cumrootf(x=UScities, n=100, Ls=5, alloc=c(0.5,0,0.5), nclass=40)
strata.LH(x=UScities, CV=0.0144, Ls=5, alloc=c(0.5,0,0.5), takeall=1, algo="Sethi")

strata.geo(x=UScolleges, n=100, Ls=5, alloc=c(0.5,0,0.5))
strata.cumrootf(x=UScolleges, n=100, Ls=5, alloc=c(0.5,0,0.5), nclass=40)
strata.LH(x=UScolleges, CV=0.0184, Ls=5, alloc=c(0.5,0,0.5), takeall=1, algo="Sethi")

strata.geo(x=USbanks, n=100, Ls=5, alloc=c(0.5,0,0.5))
strata.cumrootf(x=USbanks, n=100, Ls=5, alloc=c(0.5,0,0.5), nclass=40)
strata.LH(x=USbanks, CV=0.0110, Ls=5, alloc=c(0.5,0,0.5), takeall=1, algo="Sethi")
```
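For orientation, the geometric method of Gunning and Horgan (2004) places the stratum boundaries in geometric progression between the smallest and the largest value of the stratification variable. The following base-R sketch illustrates that rule only; it is not the package's code. The helper name `geo_boundaries` is hypothetical, and the range used for `Debtors` (minimum 40, maximum 28000) is an assumption read off the Format listing and the last boundary in the example output (28001, taken here to be the maximum plus one).

```r
# Hypothetical helper (not part of the stratification package):
# interior boundaries b_h = a * r^h for h = 1, ..., L - 1,
# where r = (b / a)^(1 / L), a is the smallest and b the largest value.
geo_boundaries <- function(a, b, L) {
  stopifnot(a > 0, b > a, L >= 2)
  r <- (b / a)^(1 / L)   # common ratio of the progression
  a * r^(1:(L - 1))      # the L - 1 interior boundaries
}

# Assumed range of Debtors: minimum 40, maximum 28000 (see lead-in).
bh <- geo_boundaries(a = 40, b = 28000, L = 5)
print(round(bh, 2))
```

Under these assumptions the four values, rounded to two decimals, agree with the first four `bh` entries printed for `strata.geo(x=Debtors, ...)` in the example output.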

### Example output

```
Given arguments:
x = Debtors
n = 100, Ls = 5
allocation: q1 = 0.5, q2 = 0, q3 = 0.5
model = none

Strata information:
          |      type rh |       bh     E(Y)      Var(Y)   Nh  nh   fh
stratum 1 | take-some  1 |   148.28    83.60      997.12 1054   3 0.00
stratum 2 | take-some  1 |   549.67   307.76    14063.78 1267  14 0.01
stratum 3 | take-some  1 |  2037.60  1008.41   163709.80  732  27 0.04
stratum 4 | take-some  1 |  7553.33  3702.59  1917482.45  265  33 0.12
stratum 5 | take-some  1 | 28001.00 12313.39 25748882.16   51  23 0.45
Total                                                    3369 100 0.03

Total sample size: 100
Anticipated population mean: 838.6388
Anticipated CV: 0.03585596

Given arguments:
x = Debtors
nclass = 40, n = 100, Ls = 5
allocation: q1 = 0.5, q2 = 0, q3 = 0.5
model = none

Strata information:
          |      type rh |    bh     E(Y)     Var(Y)   Nh  nh   fh
stratum 1 | take-some  1 |   739   247.69    35243.4 2568  45 0.02
stratum 2 | take-some  1 |  1438   998.87    37091.7  352   6 0.02
stratum 3 | take-some  1 |  3535  2233.27   340565.4  284  15 0.05
stratum 4 | take-some  1 |  8428  5277.78  1581440.3  124  15 0.12
stratum 5 | take-some  1 | 28001 13390.66 26097549.6   41  19 0.46
Total                                                3369 100 0.03

Total sample size: 100
Anticipated population mean: 838.6388
Anticipated CV: 0.0357907

Given arguments:
x = Debtors
CV = 0.036, Ls = 5, takenone = 0, takeall = 1
allocation: q1 = 0.5, q2 = 0, q3 = 0.5
model = none
algo = Sethi: maxiter = 500

Strata information:
          |      type rh  initbh |       bh     E(Y)      Var(Y)   Nh nh   fh
stratum 1 | take-some  1    99.0 |   349.33   146.49     7238.03 1856 13 0.01
stratum 2 | take-some  1   198.0 |  1190.18   626.26    46421.46  991 17 0.02
stratum 3 | take-some  1   410.0 |  3483.12  2013.98   377580.90  350 17 0.05
stratum 4 | take-some  1   888.8 | 10334.93  5589.95  2955057.91  146 20 0.14
stratum 5 |  take-all  1 28001.0 | 28001.00 15839.62 24644840.08   26 26 1.00
Total                                                            3369 93 0.03

Total sample size: 93
Anticipated population mean: 838.6388
Anticipated CV: 0.03512165
Note: CV=RRMSE (Relative Root Mean Squared Error) because takenone=0.

Given arguments:
x = UScities
n = 100, Ls = 5
allocation: q1 = 0.5, q2 = 0, q3 = 0.5
model = none

Strata information:
          |      type rh |     bh   E(Y) Var(Y)   Nh  nh   fh
stratum 1 | take-some  1 |  18.17  14.18   7.21  364  18 0.05
stratum 2 | take-some  1 |  33.01  24.55  12.82  418  28 0.07
stratum 3 | take-some  1 |  59.98  43.55  46.77  130  17 0.13
stratum 4 | take-some  1 | 108.98  77.67 152.91   87  20 0.23
stratum 5 | take-some  1 | 199.00 153.05 569.07   39  17 0.44
Total                                           1038 100 0.10

Total sample size: 100
Anticipated population mean: 32.57418
Anticipated CV: 0.01445161

Given arguments:
x = UScities
nclass = 40, n = 100, Ls = 5
allocation: q1 = 0.5, q2 = 0, q3 = 0.5
model = none

Strata information:
          |      type rh |    bh   E(Y) Var(Y)   Nh  nh   fh
stratum 1 | take-some  1 |  19.4  14.54   8.26  393  20 0.05
stratum 2 | take-some  1 |  28.8  24.02   6.30  336  15 0.04
stratum 3 | take-some  1 |  52.3  38.21  44.48  165  19 0.12
stratum 4 | take-some  1 |  94.6  70.90 137.07   94  19 0.20
stratum 5 | take-some  1 | 199.0 141.16 947.29   50  27 0.54
Total                                          1038 100 0.10

Total sample size: 100
Anticipated population mean: 32.57418
Anticipated CV: 0.01488882

Given arguments:
x = UScities
CV = 0.0144, Ls = 5, takenone = 0, takeall = 1
allocation: q1 = 0.5, q2 = 0, q3 = 0.5
model = none
algo = Sethi: maxiter = 500

Strata information:
          |      type rh initbh |     bh   E(Y)  Var(Y)   Nh  nh   fh
stratum 1 | take-some  1     15 |  14.72  11.93    2.07  189   4 0.02
stratum 2 | take-some  1     20 |  21.62  17.80    3.58  270   8 0.03
stratum 3 | take-some  1     26 |  35.59  26.23   10.77  336  17 0.05
stratum 4 | take-some  1     40 |  80.29  51.27  151.69  164  30 0.18
stratum 5 |  take-all  1    199 | 199.00 120.66 1329.19   79  79 1.00
Total                                                   1038 138 0.13

Total sample size: 138
Anticipated population mean: 32.57418
Anticipated CV: 0.01414471
Note: CV=RRMSE (Relative Root Mean Squared Error) because takenone=0.

Given arguments:
x = UScolleges
n = 100, Ls = 5
allocation: q1 = 0.5, q2 = 0, q3 = 0.5
model = none

Strata information:
          |      type rh |      bh    E(Y)     Var(Y)  Nh  nh   fh
stratum 1 | take-some  1 |  434.00  305.13    4501.30  94   3 0.03
stratum 2 | take-some  1 |  941.76  674.31   20621.59 255  15 0.06
stratum 3 | take-some  1 | 2043.61 1315.09  102487.82 198  27 0.14
stratum 4 | take-some  1 | 4434.60 2961.50  387698.22  74  20 0.27
stratum 5 | take-some  1 | 9624.00 6749.70 2138520.32  56  35 0.62
Total                                                 677 100 0.15

Total sample size: 100
Anticipated population mean: 1563
Anticipated CV: 0.018296

Given arguments:
x = UScolleges
nclass = 40, n = 100, Ls = 5
allocation: q1 = 0.5, q2 = 0, q3 = 0.5
model = none

Strata information:
          |      type rh |      bh    E(Y)     Var(Y)  Nh  nh   fh
stratum 1 | take-some  1 |  671.15  448.41   19168.34 224  15 0.07
stratum 2 | take-some  1 | 1377.88  955.24   31216.18 254  21 0.08
stratum 3 | take-some  1 | 2791.33 1951.69  159263.61 103  19 0.18
stratum 4 | take-some  1 | 5618.22 3931.69  824511.35  58  25 0.43
stratum 5 | take-some  1 | 9624.00 7526.66 1235296.65  38  20 0.53
Total                                                 677 100 0.15

Total sample size: 100
Anticipated population mean: 1563
Anticipated CV: 0.01705522

Given arguments:
x = UScolleges
CV = 0.0184, Ls = 5, takenone = 0, takeall = 1
allocation: q1 = 0.5, q2 = 0, q3 = 0.5
model = none
algo = Sethi: maxiter = 500

Strata information:
          |      type rh initbh |      bh    E(Y)     Var(Y)  Nh  nh   fh
stratum 1 | take-some  1  524.4 |  512.32  354.18    9111.04 133   4 0.03
stratum 2 | take-some  1  763.4 |  869.76  673.31   10190.63 180   6 0.03
stratum 3 | take-some  1 1080.6 | 1577.36 1103.98   30823.03 185  10 0.05
stratum 4 | take-some  1 1977.0 | 3675.52 2314.81  293595.46 110  18 0.16
stratum 5 |  take-all  1 9624.0 | 9624.00 6246.14 2835174.47  69  69 1.00
Total                                                        677 107 0.16

Total sample size: 107
Anticipated population mean: 1563
Anticipated CV: 0.01785935
Note: CV=RRMSE (Relative Root Mean Squared Error) because takenone=0.

Given arguments:
x = USbanks
n = 100, Ls = 5
allocation: q1 = 0.5, q2 = 0, q3 = 0.5
model = none

Strata information:
          |      type rh |     bh   E(Y)   Var(Y)  Nh  nh   fh
stratum 1 | take-some  1 | 118.59  93.92   192.23 114  13 0.11
stratum 2 | take-some  1 | 200.92 147.58   429.18 116  20 0.17
stratum 3 | take-some  1 | 340.39 258.03  2027.22  64  25 0.39
stratum 4 | take-some  1 | 576.68 442.90  2929.73  39  18 0.46
stratum 5 |  take-all  1 | 978.00 788.96 16224.79  24  24 1.00
Total                                             357 100 0.28

Total sample size: 100
Anticipated population mean: 225.6246
Anticipated CV: 0.01071521

Given arguments:
x = USbanks
nclass = 40, n = 100, Ls = 5
allocation: q1 = 0.5, q2 = 0, q3 = 0.5
model = none

Strata information:
          |      type rh |     bh   E(Y)   Var(Y)  Nh  nh   fh
stratum 1 | take-some  1 | 115.35  93.05   177.82 110  12 0.11
stratum 2 | take-some  1 | 183.38 142.05   260.83 109  15 0.14
stratum 3 | take-some  1 | 319.43 240.01  1787.25  68  24 0.35
stratum 4 | take-some  1 | 523.50 415.26  3361.67  42  21 0.50
stratum 5 |  take-all  1 | 978.00 752.39 21938.24  28  28 1.00
Total                                             357 100 0.28

Total sample size: 100
Anticipated population mean: 225.6246
Anticipated CV: 0.01040122

Given arguments:
x = USbanks
CV = 0.011, Ls = 5, takenone = 0, takeall = 1
allocation: q1 = 0.5, q2 = 0, q3 = 0.5
model = none
algo = Sethi: maxiter = 500

Strata information:
          |      type rh initbh |     bh   E(Y)   Var(Y)  Nh  nh   fh
stratum 1 | take-some  1  100.0 |  99.54  84.84    81.79  70   4 0.06
stratum 2 | take-some  1  131.4 | 130.83 114.40    87.71  68   4 0.06
stratum 3 | take-some  1  176.0 | 189.70 149.91   252.96  85   9 0.11
stratum 4 | take-some  1  318.0 | 339.31 251.83  2179.66  71  21 0.30
stratum 5 |  take-all  1  978.0 | 978.00 574.73 36236.80  63  63 1.00
Total                                                    357 101 0.28

Total sample size: 101
Anticipated population mean: 225.6246
Anticipated CV: 0.01067923
Note: CV=RRMSE (Relative Root Mean Squared Error) because takenone=0.
```
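The printed sample sizes and anticipated CV can be cross-checked by hand. The base-R sketch below rests on two assumptions about what the package computes: that `alloc=c(0.5,0,0.5)` amounts to allocating the sample in proportion to N_h S_h (commonly called Neyman allocation), and that the anticipated CV is the standard error of the stratified mean, with finite population correction, divided by the mean. The inputs are copied from the first table above (`strata.geo` on `Debtors`) and are already rounded, so agreement can only be approximate.

```r
# Values copied from the first "Strata information" table (rounded as printed).
Nh   <- c(1054, 1267, 732, 265, 51)                              # stratum sizes
VarY <- c(997.12, 14063.78, 163709.80, 1917482.45, 25748882.16)  # stratum variances
n    <- 100                                                      # target sample size
ybar <- 838.6388                                                 # anticipated population mean

Sh <- sqrt(VarY)  # stratum standard deviations

# Assumed rule: allocation proportional to Nh * Sh, rounded to whole units.
nh <- round(n * Nh * Sh / sum(Nh * Sh))

# Assumed formula: variance of the stratified mean with finite population
# correction, sum over strata of (Nh / N)^2 * Sh^2 * (1 / nh - 1 / Nh).
N        <- sum(Nh)
var_mean <- sum((Nh / N)^2 * VarY * (1 / nh - 1 / Nh))
cv       <- sqrt(var_mean) / ybar

print(nh)
print(round(cv, 4))
```

With these inputs `nh` comes out as 3, 14, 27, 33, 23 and `cv` is about 0.0359, in line with the `nh` column and the anticipated CV printed for that call.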

stratification documentation built on May 1, 2019, 9:13 p.m.