strata.bh: Stratification of a Population Given a Set of Boundaries

In stratification: Univariate Stratification of Survey Populations

Description

The function `strata.bh` stratifies a population given a set of boundaries. It calculates the stratum sample sizes and the anticipated coefficient of variation or relative root mean squared error.
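As a rough illustration of what "stratifying given a set of boundaries" means, the sketch below assigns units to strata in base R. It does not call the package. The data values are made up, and the left-closed interval convention (a unit with x equal to a boundary goes to the upper stratum) is an assumption here, not something this page states.

```r
# Toy population and two boundaries (L = 3 strata); values are made up.
x  <- c(5, 12, 20, 33, 47, 60, 85, 120)
bh <- c(20, 60)

# findInterval() returns 0 for x < bh[1], 1 for bh[1] <= x < bh[2], and so on,
# so adding 1 gives stratum labels 1..L with left-closed intervals (assumed).
stratum <- findInterval(x, bh) + 1

# Stratum population sizes Nh; tabulate() keeps empty strata in the count.
Nh <- tabulate(stratum, nbins = length(bh) + 1)
Nh
```

`strata.bh` returns the corresponding quantities as `stratumID` and `Nh`, together with the allocation results.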

Usage

```
strata.bh(x, bh, n = NULL, CV = NULL, Ls = 3, certain = NULL,
          alloc = list(q1 = 0.5, q2 = 0, q3 = 0.5), takenone = 0,
          bias.penalty = 1, takeall = 0, takeall.adjust = TRUE, rh = rep(1, Ls),
          model = c("none", "loglinear", "linear", "random"),
          model.control = list())
```

Arguments

`x` A vector containing the values of the stratification variable X for every unit in the population.

`bh` A vector of the L-1 stratum boundaries (b1, b2, ..., bL-1), where L is the total number of strata (excluding the certainty stratum, if any). Therefore, if `takenone`=0 then L=`Ls`, and if `takenone`=1 then L=`Ls`+1.

`n` A numeric: the target sample size. It has no default value. Either `n` or `CV` must be supplied.

`CV` A numeric: the target coefficient of variation, or the target relative root mean squared error if `takenone`=1. It has no default value. Either `CV` or `n` must be supplied.

`Ls` A numeric: the number of sampled strata (take-none and certainty strata are not counted in `Ls`). The default is 3.

`certain` A vector giving the positions, in the vector `x`, of the units that must be included in the sample (see `stratification-package`). By default `certain` is `NULL`, which means that no units are chosen a priori to be in the sample.

`alloc` A list specifying the allocation scheme. The list must contain 3 numerics for the 3 exponents `q1`, `q2` and `q3` in the general allocation scheme (see `stratification-package`). The default is Neyman allocation (`q1`=`q3`=0.5 and `q2`=0).

`takenone` A numeric: the number of take-none strata (0 or 1). The default is 0, i.e. no take-none stratum is included.

`bias.penalty` A numeric between 0 and 1 giving the penalty for the bias in the anticipated mean squared error (MSE) of the survey estimator (see `stratification-package`). This argument is relevant only if `takenone`=1. The default is 1.

`takeall` A numeric: the number of take-all strata (one of {0, 1, ..., `Ls`-1}). The default is 0, i.e. no take-all stratum is included.

`takeall.adjust` A logical. If `TRUE` (the default), when nh>Nh for a take-some stratum, the `takeall` argument is increased by one and the allocation is carried out again. This is repeated until nh<=Nh for every take-some stratum. If `FALSE`, no adjustment is made. Note: in the other functions of the package stratification, this adjustment is not optional; it is made automatically (see `stratification-package`).

`rh` A vector giving the anticipated response rates in each of the `Ls` sampled strata. A single number can be given if the rates do not vary among strata. The default is 1 in each stratum.

`model` A character string identifying the model used to describe the discrepancy between the stratification variable X and the survey variable Y. It can be `"none"` if one assumes Y=X, `"loglinear"` for the loglinear model with mortality, `"linear"` for the heteroscedastic linear model or `"random"` for the random replacement model (see `stratification-package` for a description of these models). The default is `"none"`.

`model.control` A list of model parameters (see `stratification-package`). The default values of the parameters correspond to the model Y=X.
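The interaction between `alloc`, `takeall` and `takeall.adjust` can be sketched in base R. The block below is a simplified illustration, not the package's code. It assumes that the general allocation scheme gives take-some strata shares proportional to Nh^(2 q1) x (mean of Y)^(2 q2) x (variance of Y)^q3, that a target `n` is given, that `rh`=1, and that each adjustment turns the top take-some stratum into a take-all one. The helper `alloc_prop` and all the numbers are hypothetical.

```r
# Hypothetical helper (not part of the stratification package): allocation
# shares proportional to Nh^(2*q1) * meanh^(2*q2) * varh^q3 (assumed form).
alloc_prop <- function(Nh, meanh, varh, q1, q2, q3) {
  g <- Nh^(2 * q1) * meanh^(2 * q2) * varh^q3
  g / sum(g)
}

# Made-up stratum statistics for three sampled strata.
Nh    <- c(500, 300, 40)
meanh <- c(10, 40, 200)
varh  <- c(9, 100, 8100)

neyman <- alloc_prop(Nh, meanh, varh, q1 = 0.5, q2 = 0, q3 = 0.5)  # ~ Nh * Sh
prop   <- alloc_prop(Nh, meanh, varh, q1 = 0.5, q2 = 0, q3 = 0)    # ~ Nh

# takeall.adjust idea for a target n: while some take-some stratum gets
# nh > Nh, add one take-all stratum at the top and allocate again.
n <- 120
takeall <- 0
repeat {
  L  <- length(Nh)
  ts <- seq_len(L - takeall)       # indices of take-some strata
  ta <- setdiff(seq_len(L), ts)    # indices of take-all strata
  nh <- Nh                         # take-all strata are sampled entirely
  a  <- alloc_prop(Nh[ts], meanh[ts], varh[ts], q1 = 0.5, q2 = 0, q3 = 0.5)
  nh[ts] <- (n - sum(Nh[ta])) * a  # share the remaining sample size
  if (all(nh[ts] <= Nh[ts]) || takeall == L - 1) break
  takeall <- takeall + 1
}
takeall
nh
```

With these made-up numbers the first Neyman allocation asks for more units in stratum 3 than it contains, so one take-all stratum is added and the remaining sample is shared between strata 1 and 2. With `takeall.adjust = FALSE` the loop would simply stop after the first pass.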

Value

`Nh` A vector of length L containing the population sizes Nh, i.e. the number of units in each stratum.

`nh` A vector of length L containing the sample sizes nh, i.e. the number of units to sample in each stratum. See `stratification-package` for information about the rounding used to get these integer values.

`n` The total sample size (`sum(nh)`).

`nhnonint` A vector of length L containing the non-integer values of the sample sizes, obtained directly from applying the allocation rule (see `stratification-package`).

`certain.info` A vector giving statistics for the certainty stratum (see `stratification-package`). It contains `Nc`, the number of units chosen a priori to be in the sample, and `meanc`, the anticipated mean of Y for these units.

`opti.nh` The final value of the criterion to optimize (either the total sample size n if a target `CV` was given or the RRMSE if a target `n` was given), calculated with the integer stratum sample sizes `nh`.

`opti.nhnonint` The final value of the criterion to optimize (either the total sample size n if a target `CV` was given or the RRMSE if a target `n` was given), calculated with the non-integer stratum sample sizes `nhnonint`.

`meanh` A vector of length L containing the anticipated means of Y in each stratum.

`varh` A vector of length L containing the anticipated variances of Y in each stratum.

`mean` A numeric: the anticipated global mean value of Y.

`RMSE` A numeric: the root mean squared error (or standard error if `takenone`=0) of the anticipated global mean of Y. This is defined as the square root of: (`bias.penalty` x bias of the mean)^2 + variance of the mean.

`RRMSE` A numeric: the anticipated relative root mean squared error (or coefficient of variation if `takenone`=0) for the mean of Y, i.e. `RMSE` divided by `mean`.

`relativebias` A numeric: the anticipated relative bias of the estimator, i.e. (`bias.penalty` x bias of the mean) divided by `mean`. If `takenone`=0, this numeric is zero.

`propbiasMSE` A numeric: the proportion of the MSE attributable to the bias of the estimator, i.e. (`bias.penalty` x bias of the mean)^2 divided by the MSE of the `mean`. If `takenone`=0, this numeric is zero.

`stratumID` A factor, having the same length as the input `x`, whose values are either 1, 2, ..., L or `"certain"`. The value `"certain"` is given to units chosen a priori to be in the sample. This factor identifies, for each observation, the stratum to which it has been assigned.

`takeall` The number of take-all strata in the final solution. Note: it is possible that n_h=N_h for non take-all strata, because the condition for the automatic addition of a take-all stratum is n_h>N_h.

`call` The function call (object of class "call").

`date` A character string that contains the system date and time when the function ended.

`args` A list of all the argument values input to the function or set by default.
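The definitions of `RMSE`, `RRMSE`, `relativebias` and `propbiasMSE` given above fit together as in the following base R sketch. The input values and the variable names `mean_y`, `bias_mean` and `var_mean` are made up for illustration; they are not outputs of the package.

```r
# Made-up anticipated moments for a design with a take-none stratum.
mean_y       <- 225   # anticipated global mean of Y
bias_mean    <- -3    # anticipated bias of the estimated mean
var_mean     <- 4     # anticipated variance of the estimated mean
bias.penalty <- 1

pen_bias     <- bias.penalty * bias_mean  # penalized bias
MSE          <- pen_bias^2 + var_mean     # anticipated mean squared error
RMSE         <- sqrt(MSE)
RRMSE        <- RMSE / mean_y
relativebias <- pen_bias / mean_y
propbiasMSE  <- pen_bias^2 / MSE

c(RMSE = RMSE, RRMSE = RRMSE,
  relativebias = relativebias, propbiasMSE = propbiasMSE)
```

Setting `bias_mean` to 0, as is the case when `takenone`=0, makes `relativebias` and `propbiasMSE` zero and reduces `RRMSE` to the coefficient of variation.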

Author(s)

Sophie Baillargeon Sophie.Baillargeon@mat.ulaval.ca and
Louis-Paul Rivest Louis-Paul.Rivest@mat.ulaval.ca

References

Baillargeon, S. and Rivest, L.-P. (2011). The construction of stratified designs in R with the package stratification. Survey Methodology, 37(1), 53-65.

See Also

`print.strata`, `plot.strata`, `strata.cumrootf`, `strata.geo`, `strata.LH`

Examples

```
library(stratification)  # provides strata.geo, strata.bh and the USbanks data

adjust <- strata.geo(x=USbanks, CV=0.01, Ls=4, alloc=c(0.35,0.35,0))
adjust
adjust$nhnonint
noadjust <- strata.bh(x=USbanks, bh=adjust$bh, CV=0.01, Ls=4,
                      alloc=c(0.35,0.35,0), takeall=0, takeall.adjust=FALSE)
noadjust
noadjust$nhnonint
# without the adjustment for a take-all stratum, n is smaller than
# with the adjustment, but the target CV is not reached.
```

Example output

```
Given arguments:
x = USbanks
CV = 0.01, Ls = 4
allocation: q1 = 0.35, q2 = 0.35, q3 = 0
model = none

Strata information:
          |      type rh |     bh   E(Y)   Var(Y)  Nh  nh   fh
stratum 1 | take-some  1 | 135.30 103.26   382.33 156  33 0.21
stratum 2 | take-some  1 | 261.51 178.28  1136.94 109  37 0.34
stratum 3 | take-some  1 | 505.47 371.86  5350.88  63  42 0.67
stratum 4 |  take-all  1 | 978.00 744.10 23105.75  29  29 1.00
Total                                             357 141 0.39

Total sample size: 141
Anticipated population mean: 225.6246
Anticipated CV: 0.009869827
[1] 32.05732 36.55587 41.66591 29.00000
Given arguments:
x = USbanks
CV = 0.01, Ls = 4, takenone = 0, takeall = 0
allocation: q1 = 0.35, q2 = 0.35, q3 = 0
model = none

Strata information:
          |      type rh     bh |   E(Y)   Var(Y)  Nh  nh   fh
stratum 1 | take-some  1 135.30 | 103.26   382.33 156  29 0.19
stratum 2 | take-some  1 261.51 | 178.28  1136.94 109  34 0.31
stratum 3 | take-some  1 505.47 | 371.86  5350.88  63  38 0.60
stratum 4 | take-some  1 978.00 | 744.10 23105.75  29  29 1.00
Total                                             357 130 0.36

Total sample size: 130
Anticipated population mean: 225.6246
Anticipated CV: 0.01079707
Note: CV=RRMSE (Relative Root Mean Squared Error) because takenone=0.
[1] 28.98880 33.05675 37.67766 35.57166
```
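The two `nhnonint` vectors printed above can be approximately recomputed from the stratum statistics in the tables. The base R sketch below is an illustration under assumptions, not the package's implementation. It assumes allocation shares proportional to Nh^(2 q1) x E(Y)^(2 q2) x Var(Y)^q3 over the take-some strata, and the usual stratified variance of the mean with finite population correction. The helper `nh_for_cv` is hypothetical, and the inputs are the rounded values from the printout, so small discrepancies are expected.

```r
# Stratum statistics copied (rounded) from the printed output above.
Nh    <- c(156, 109, 63, 29)
meanh <- c(103.26, 178.28, 371.86, 744.10)
varh  <- c(382.33, 1136.94, 5350.88, 23105.75)
CV <- 0.01; q1 <- 0.35; q2 <- 0.35; q3 <- 0

# Hypothetical helper: non-integer nh for a target CV, with `takeall`
# take-all strata at the top; rh = 1, no take-none or certainty stratum.
nh_for_cv <- function(takeall) {
  L    <- length(Nh)
  ts   <- seq_len(L - takeall)        # take-some strata
  Wh   <- Nh / sum(Nh)                # stratum weights
  ybar <- sum(Wh * meanh)             # anticipated global mean
  g    <- Nh[ts]^(2 * q1) * meanh[ts]^(2 * q2) * varh[ts]^q3
  a    <- g / sum(g)                  # allocation shares (assumed form)
  # Solve sum(Wh^2 * varh * (1/nh - 1/Nh)) = (CV * ybar)^2 with nh = n_ts * a.
  n_ts <- sum(Wh[ts]^2 * varh[ts] / a) /
    ((CV * ybar)^2 + sum(Wh[ts]^2 * varh[ts] / Nh[ts]))
  nh <- Nh                            # take-all strata keep nh = Nh
  nh[ts] <- n_ts * a
  nh
}

noadjust <- nh_for_cv(takeall = 0)    # compare with noadjust$nhnonint
adjust   <- nh_for_cv(takeall = 1)    # compare with adjust$nhnonint
round(noadjust, 2)
round(adjust, 2)
```

Without the adjustment, the allocation asks for more units in stratum 4 than the 29 it contains. The integer sample size there is capped at 29, which is why the anticipated CV of the unadjusted design (0.01079707) misses the 0.01 target.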

stratification documentation built on May 1, 2019, 9:13 p.m.