# strata.data: Stratification of Univariate Survey Population Using the Data In stratifyR: Optimal Stratification of Univariate Populations

## Description

This function takes in the univariate population data (argument `data`) and a fixed sample size (n) to compute the optimum stratum boundaries (OSB) for a given number of strata (L), optimum sample sizes (nh), etc. directly from the data. The main idea used is from Khan et al (2008) whereby the problem of stratification is formulated into a Mathematical Programming Problem (MPP) using the best-fit frequency distribution and its parameters estimated from the data. This MPP is then solved for the OSB using a Dynamic Programming (DP) solution procedure.

## Usage

 `1` ```strata.data(data, h, n, cost = FALSE, ch = NULL) ```

## Arguments

 `data` A vector of values of the survey variable y for which the OSB are determined `h` A numeric: denotes the number of strata to be created. `n` A numeric: denotes a fixed total sample size. `cost` A logical: has default cost=FALSE. If it is a stratum-cost problem, cost=TRUE, with which, one must provide the Ch parameter. `ch` A numeric: denotes a vector of stratum costs. When cost=FALSE, it has a default of NULL.

## Value

`strata.data` returns Optimum Strata Boundaries (OSB), stratum weights (Wh), stratum variances (Vh), Optimum Sample Sizes (nh), stratum population sizes (Nh) and sampling fraction (fh).

## Author(s)

Karuna Reddy <reddy_k@usp.ac.fj>
MGM Khan <khan_mg@usp.ac.fj>

`strata.distr`
 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36``` ```## Not run: data <- rweibull(1000, shape=2, scale = 1.5) hist(data) obj <- strata.data(data, h = 2, n=300) summary(obj) #------------------------------------------------------------- data(anaemia) Iron <- anaemia\$Iron res <- strata.data(Iron, h = 2, n=350) summary(res) #------------------------------------------------------------- data(SHS) #Household Spending data from stratification package weight <- SHS\$WEIGHT hist(weight); length(weight) res <- strata.data(weight, h = 2, n=500) summary(res) #------------------------------------------------------------- data(sugarcane) Production <- sugarcane\$Production hist(Production) res <- strata.data(Production, h = 2, n=1000) summary(res) #------------------------------------------------------------- #The function be dynamically used to visualize the the strata boundaries, #for 2 strata, over the density (or observations) of the "mag" variable #from the quakes data (with purrr and ggplot2 packages loaded). output <- quakes %>% pluck("mag") %>% strata.data(h = 2, n = 300) quakes %>% ggplot(aes(x = mag)) + geom_density(fill = "blue", colour = "black", alpha = 0.3) + geom_vline(xintercept = output\$OSB, linetype = "dotted", color = "red") #------------------------------------------------------------- ## End(Not run) ```