rhg_strata: Response homogeneity groups for a stratified sampling

View source: R/rhg_strata.r

rhg_strataR Documentation

Response homogeneity groups for a stratified sampling

Description

Computes response homogeneity groups and the corresponding response probability for each unit into a group, for a stratified sampling.

Usage

rhg_strata(X,selection)

Arguments

X

sample data frame; it should contain the columns 'ID_unit','Stratum', and 'status'; 'ID_unit' denotes the unit identifier (a number); 'Stratum' denotes the unit stratum; 'status' is a 1/0 variable denoting the response/non-response of a unit in the sample.

selection

vector of variable names in X used to construct the groups.

Details

Into a response homogeneity group, the reponse probability is the same for all units. Data are missing at random within groups, conditionally on the selected sample.

Value

The initial sample data frame and also the following components:

rhgroup

response homogeneity group for each unit, conditionally on its stratum.

prob_response

response probability for each unit; for the units with status=0, this probability is 0.

References

Särndal, C.-E., Swensson, B. and Wretman, J. (1992). Model Assisted Survey Sampling. Springer

See Also

rhg, calib

Examples

############
## Example 1
############
# uses Example 2 from the 'strata' function help file
data=rbind(matrix(rep("nc",165),165,1,byrow=TRUE),matrix(rep("sc",70),70,1,byrow=TRUE))
data=cbind.data.frame(data,c(rep(1,100), rep(2,50), rep(3,15), rep(1,30),rep(2,40)),
1000*runif(235))
names(data)=c("state","region","income")
# draws a sample
s1=strata(data,c("region","state"),size=c(10,5,10,4,6), method="systematic",
pik=data$income)
# extracts the observed data
s1=getdata(data,s1)
# randomly generates the 'status' variable (1-sample respondent, 0-otherwise)
status=ifelse(runif(nrow(s1))<0.3,0,1)
# adds the 'status' variable to the sample data frame s1
s1=cbind.data.frame(s1,status)
# creates classes of income using the median of income
# suppose that the income is available for all units in the sample
classincome=ifelse(s1$income<median(s1$income),1,2)
# adds 'classincome' to s1
s1=cbind.data.frame(s1,classincome)
# computes the response homogeneity groups using the 'classincome' variable   
rhg_strata(s1,selection=c("classincome"))
############
## Example 2
############
# the same data as in Example 1
# but we also add the 'sex' column (1-female, 2-male)
# suppose that the sex is available for all units in the sample
sex=c(rep(1,12),rep(2,8),rep(1,10),rep(2,5))
s1=cbind.data.frame(s1,sex)
# computes the response homogeneity groups using the 'classincome' and 'sex' variables   
rhg_strata(s1,selection=c("classincome","sex"))

sampling documentation built on Nov. 2, 2023, 6:26 p.m.