strata.bh: Stratification of a Population Given a Set of Boundaries

Description

The function strata.bh stratifies a population given a set of boundaries. It calculates the stratum sample sizes and the anticipated coefficient of variation or relative root mean squared error.

Usage

strata.bh(x, bh, n = NULL, CV = NULL, Ls = 3, certain = NULL,
          alloc = list(q1 = 0.5, q2 = 0, q3 = 0.5), takenone = 0, 
          bias.penalty = 1, takeall = 0, takeall.adjust = TRUE, 
          rh = rep(1, Ls), model = c("none", "loglinear", "linear",
          "random"), model.control = list())      

Arguments

x

A vector containing the values of the stratification variable X for every unit in the population.

bh

A vector of the L-1 stratum boundaries (b1, b2, ..., bL-1) where L is the total number of strata (excluding the certainty stratum, if any). Therefore, if takenone=0 then L=Ls, and if takenone=1 then L=Ls+1.
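
As a rough illustration of how a boundary vector partitions the population, the base R sketch below assigns a few hypothetical x values to strata with findInterval. It assumes that a unit whose value equals a boundary falls in the upper stratum; that convention, like the toy data, is an assumption here and should be checked against stratification-package.

```r
# Toy population and boundaries (hypothetical values, not package data)
x  <- c(1, 5, 10, 20, 50)
bh <- c(5, 20)                      # L - 1 = 2 boundaries, so L = 3 strata

# findInterval() counts how many boundaries are <= x; adding 1 gives the
# stratum number under the assumed convention [b(h-1), b(h))
stratum <- findInterval(x, bh) + 1

# Population size of each stratum
Nh <- tabulate(stratum, nbins = length(bh) + 1)
```

With these toy values, the unit equal to the first boundary (5) lands in stratum 2 and the unit equal to the second boundary (20) lands in stratum 3.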

n

A numeric: the target sample size. It has no default value. The argument n or the argument CV must be input.

CV

A numeric: the target coefficient of variation, or the target relative root mean squared error if takenone=1. It has no default value. The argument CV or the argument n must be input.

Ls

A numeric: the number of sampled strata (take-none and certain strata are not counted in Ls). The default is 3.

certain

A vector giving the position, in the vector x, of the units that must be included in the sample (see stratification-package). By default certain is NULL, which means that no units are a priori chosen to be in the sample.

alloc

A list specifying the allocation scheme. The list must contain 3 numerics for the 3 exponents q1, q2 and q3 in the general allocation scheme (see stratification-package). The default is Neyman allocation (q1=q3=0.5 and q2=0).
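
The sketch below shows one way to read the three exponents. It assumes the general rule makes the share of stratum h proportional to Nh^(2*q1) * meanh^(2*q2) * varh^q3. This form is not stated on this page; it is taken here as an assumption, although it does reproduce the ratios among the nhnonint values in the second example output. The helper name alloc_prop is hypothetical.

```r
# Stratum summaries copied from the example output on this page
Nh    <- c(156, 109, 63, 29)
meanh <- c(103.26, 178.28, 371.86, 744.10)
varh  <- c(382.33, 1136.94, 5350.88, 23105.75)

# Hypothetical helper: allocation proportions a_h under the assumed rule
# gamma_h = Nh^(2*q1) * meanh^(2*q2) * varh^q3   (varh^q3 = sd^(2*q3))
alloc_prop <- function(Nh, meanh, varh, q1, q2, q3) {
  gamma <- Nh^(2 * q1) * meanh^(2 * q2) * varh^q3
  gamma / sum(gamma)
}

# Exponents used in the Examples section, and the default (Neyman)
a_power  <- alloc_prop(Nh, meanh, varh, q1 = 0.35, q2 = 0.35, q3 = 0)
a_neyman <- alloc_prop(Nh, meanh, varh, q1 = 0.5,  q2 = 0,    q3 = 0.5)

# Shares relative to stratum 1, for comparison with nhnonint / nhnonint[1]
round(a_power / a_power[1], 4)
```

Under this reading, q1=q3=0.5 and q2=0 gives shares proportional to Nh times the stratum standard deviation, which is the usual Neyman allocation.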

takenone

A numeric: the number of take-none strata (0 or 1). The default is 0, i.e. no take-none stratum is included.

bias.penalty

A numeric between 0 and 1 giving the penalty for the bias in the anticipated mean squared error (MSE) of the survey estimator (see stratification-package). This argument is relevant only if takenone=1. The default is 1.

takeall

A numeric: the number of take-all strata (one of {0, 1, ..., Ls-1}). The default is 0, i.e. no take-all stratum is included.

takeall.adjust

A logical. If TRUE (the default), when nh>Nh for a take-some stratum, the takeall argument is increased by one and the allocation is carried out again. This is repeated until nh<=Nh for every take-some stratum. If FALSE, no adjustment is made. Note: in other functions of the package stratification, this adjustment is not optional; it is made automatically (see stratification-package).
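
The adjustment loop described above can be pictured with the simplified base R sketch below. It is not the package's code: it assumes a target total n (rather than a target CV), response rates of 1, fixed allocation weights gamma, and that the take-all strata are always the highest ones. The function name allocate_with_takeall and the toy numbers are hypothetical.

```r
# Hypothetical helper: share a target n among Ls strata, promoting the top
# take-some stratum to take-all while some take-some stratum has nh > Nh.
allocate_with_takeall <- function(n, Nh, gamma, takeall = 0) {
  Ls <- length(Nh)
  repeat {
    ts <- seq_len(Ls - takeall)          # take-some strata (lowest ones)
    ta <- setdiff(seq_len(Ls), ts)       # take-all strata (highest ones)
    nh <- numeric(Ls)
    nh[ta] <- Nh[ta]                     # take-all strata are fully sampled
    a <- gamma[ts] / sum(gamma[ts])      # allocation proportions
    nh[ts] <- (n - sum(Nh[ta])) * a      # remaining sample goes to take-some
    # Stop when no take-some stratum is over-allocated, or when only one
    # take-some stratum is left (takeall cannot exceed Ls - 1)
    if (all(nh[ts] <= Nh[ts]) || takeall >= Ls - 1) break
    takeall <- takeall + 1
  }
  list(nh = nh, takeall = takeall)
}

# Toy input: the first allocation asks for more units than stratum 3 holds
res <- allocate_with_takeall(n = 40, Nh = c(100, 50, 10), gamma = c(1, 2, 4))
```

In this toy case the first pass allocates about 22.9 units to a stratum of 10, so the stratum becomes take-all and the remaining 30 units are shared again between the first two strata.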

rh

A vector giving the anticipated response rates in each of the Ls sampled strata. A single number can be given if the rates do not vary among strata. The default is 1 in each stratum.

model

A character string identifying the model used to describe the discrepancy between the stratification variable X and the survey variable Y. It can be "none" if one assumes Y=X, "loglinear" for the loglinear model with mortality, "linear" for the heteroscedastic linear model or "random" for the random replacement model (see stratification-package for a description of these models). The default is "none".

model.control

A list of model parameters (see stratification-package). The default values of the parameters correspond to the model Y=X.

Value

Nh

A vector of length L containing the population sizes Nh, i.e. the number of units in each stratum.

nh

A vector of length L containing the sample sizes nh, i.e. the number of units to sample in each stratum. See stratification-package for information about the rounding used to get these integer values.

n

The total sample size (sum(nh)).

nhnonint

A vector of length L containing the non-integer values of the sample sizes, obtained directly from applying the allocation rule (see stratification-package).

certain.info

A vector giving statistics for the certainty stratum (see stratification-package). It contains Nc, the number of units chosen a priori to be in the sample, and meanc, the anticipated mean of Y for these units.

opti.nh

The final value of the criterion to optimize (either the total sample size n if a target CV was given or the RRMSE if a target n was given) calculated with the integer stratum sample sizes nh.

opti.nhnonint

The final value of the criterion to optimize (either the total sample size n if a target CV was given or the RRMSE if a target n was given) calculated with the non-integer stratum sample sizes nhnonint.

meanh

A vector of length L containing the anticipated means of Y in each stratum.

varh

A vector of length L containing the anticipated variances of Y in each stratum.

mean

A numeric: the anticipated global mean value of Y.

RMSE

A numeric: the root mean squared error (or standard error if takenone=0) of the anticipated global mean of Y. This is defined as the square root of: (bias.penalty x bias of the mean)^2 + variance of the mean.

RRMSE

A numeric: the anticipated relative root mean squared error (or coefficient of variation if takenone=0) for the mean of Y, i.e. RMSE divided by mean.

relativebias

A numeric: the anticipated relative bias of the estimator, i.e. (bias.penalty x bias of the mean) divided by mean. If takenone=0, this numeric is zero.

propbiasMSE

A numeric: the proportion of the MSE attributable to the bias of the estimator, i.e. (bias.penalty x bias of the mean)^2 divided by the MSE of the mean. If takenone=0, this numeric is zero.
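
The four quantities RMSE, RRMSE, relativebias and propbiasMSE follow directly from the definitions above. The short sketch below works them out for made-up inputs; the bias, variance and mean values are hypothetical and chosen only to give round numbers.

```r
# Hypothetical inputs (not taken from any data set)
bias.penalty <- 1
bias         <- 3      # bias of the mean caused by the take-none stratum
variance     <- 16     # variance of the mean
mean_y       <- 100    # anticipated global mean of Y

pen_bias <- bias.penalty * bias       # penalized bias
MSE      <- pen_bias^2 + variance     # anticipated mean squared error

RMSE         <- sqrt(MSE)             # 5
RRMSE        <- RMSE / mean_y         # 0.05
relativebias <- pen_bias / mean_y     # 0.03
propbiasMSE  <- pen_bias^2 / MSE      # 0.36
```

Setting bias.penalty to 0 in this sketch removes the bias term entirely, so RMSE reduces to the standard error and propbiasMSE to zero.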

stratumID

A factor, having the same length as the input x, whose values are either 1, 2, ..., L or "certain". The value "certain" is given to units a priori chosen to be in the sample. This factor identifies, for each observation, the stratum to which it has been assigned.

takeall

The number of take-all strata in the final solution. Note: It is possible that nh=Nh for non-take-all strata because the condition for an automatic addition of a take-all stratum is nh>Nh.

call

The function call (object of class "call").

date

A character string that contains the system date and time when the function ended.

args

A list of all the argument values input to the function or set by default.

Author(s)

Sophie Baillargeon Sophie.Baillargeon@mat.ulaval.ca and
Louis-Paul Rivest Louis-Paul.Rivest@mat.ulaval.ca

References

Baillargeon, S. and Rivest, L.-P. (2011). The construction of stratified designs in R with the package stratification. Survey Methodology, 37(1), 53-65.

See Also

print.strata, plot.strata, strata.cumrootf, strata.geo, strata.LH

Examples

library(stratification)  # provides strata.geo, strata.bh and the USbanks data
adjust <- strata.geo(x=USbanks, CV=0.01, Ls=4, alloc=c(0.35,0.35,0))
adjust
adjust$nhnonint
noadjust <- strata.bh(x=USbanks, bh=adjust$bh, CV=0.01, Ls=4,
            alloc=c(0.35,0.35,0), takeall=0, takeall.adjust=FALSE)
noadjust
noadjust$nhnonint
# without the adjustment for a take-all stratum, n is smaller than
# with the adjustment, but the target CV is not reached.

Example output

Given arguments:
x = USbanks
CV = 0.01, Ls = 4
allocation: q1 = 0.35, q2 = 0.35, q3 = 0
model = none

Strata information:
          |      type rh |     bh   E(Y)   Var(Y)  Nh  nh   fh
stratum 1 | take-some  1 | 135.30 103.26   382.33 156  33 0.21
stratum 2 | take-some  1 | 261.51 178.28  1136.94 109  37 0.34
stratum 3 | take-some  1 | 505.47 371.86  5350.88  63  42 0.67
stratum 4 |  take-all  1 | 978.00 744.10 23105.75  29  29 1.00
Total                                             357 141 0.39

Total sample size: 141 
Anticipated population mean: 225.6246 
Anticipated CV: 0.009869827 
[1] 32.05732 36.55587 41.66591 29.00000
Given arguments:
x = USbanks
CV = 0.01, Ls = 4, takenone = 0, takeall = 0
allocation: q1 = 0.35, q2 = 0.35, q3 = 0
model = none

Strata information:
          |      type rh     bh |   E(Y)   Var(Y)  Nh  nh   fh
stratum 1 | take-some  1 135.30 | 103.26   382.33 156  29 0.19
stratum 2 | take-some  1 261.51 | 178.28  1136.94 109  34 0.31
stratum 3 | take-some  1 505.47 | 371.86  5350.88  63  38 0.60
stratum 4 | take-some  1 978.00 | 744.10 23105.75  29  29 1.00
Total                                             357 130 0.36

Total sample size: 130 
Anticipated population mean: 225.6246 
Anticipated CV: 0.01079707 
Note: CV=RRMSE (Relative Root Mean Squared Error) because takenone=0.
[1] 28.98880 33.05675 37.67766 35.57166
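
The anticipated CVs printed above can be checked by hand from the Nh, nh and Var(Y) columns. The sketch below uses the textbook variance of a stratified mean under simple random sampling without replacement in each stratum; it assumes model = "none", response rates of 1 and no take-none stratum, as in this example. The helper name anticipated_cv is hypothetical, and the inputs are the rounded values shown in the tables, so the results only match the printed CVs approximately.

```r
# Values copied from the two "Strata information" tables above
Nh     <- c(156, 109, 63, 29)
varh   <- c(382.33, 1136.94, 5350.88, 23105.75)
mean_y <- 225.6246

# Hypothetical helper: CV of the stratified mean under simple random
# sampling without replacement in each stratum (no take-none stratum)
anticipated_cv <- function(Nh, nh, varh, mean_y) {
  Wh <- Nh / sum(Nh)                                # stratum weights
  var_mean <- sum(Wh^2 * varh * (1 / nh - 1 / Nh))  # term is 0 when nh = Nh
  sqrt(var_mean) / mean_y
}

# Integer sample sizes from the strata.geo and strata.bh outputs
cv_adjust   <- anticipated_cv(Nh, nh = c(33, 37, 42, 29), varh, mean_y)
cv_noadjust <- anticipated_cv(Nh, nh = c(29, 34, 38, 29), varh, mean_y)
```

The two values should come out close to the printed anticipated CVs (about 0.0099 with the adjustment and about 0.0108 without it), which agrees with the comment in the example that the target CV of 0.01 is missed when takeall.adjust=FALSE.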

stratification documentation built on May 1, 2019, 9:13 p.m.