# strata.distr: Stratification of Univariate Survey Population Using the...

In stratifyR: Optimal Stratification of Univariate Populations

## Description

This function takes the hypothetical distribution of the survey variable and its parameter(s), the initial value and range of the population, the fixed sample size (n), and the fixed population size (N). From these it computes the optimum stratum boundaries (OSB) for a given number of strata (L, passed as `h`), the optimum sample sizes (nh), and related quantities. The approach follows Khan et al. (2008): the stratification problem is formulated as a Mathematical Programming Problem (MPP) using the best-fit frequency distribution of the data and its parameter estimates. The MPP is then solved for the optimal solution with a Dynamic Programming (DP) procedure.
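To make the idea of choosing boundaries by optimization more concrete, here is a minimal base R sketch for the two-strata case. It is not the package's implementation: it assumes the objective is the sum of stratum weight times stratum standard deviation (the usual criterion under Neyman allocation), it uses `optimize()` for a one-dimensional search instead of dynamic programming, and the helper names `stratum_term` and `objective` are hypothetical. The weights are not renormalized for truncation to the range, which only rescales the objective and leaves the minimizing boundary unchanged.

```r
# Hypothetical helper: W_h * sigma_h for the stratum (a, b) under a Weibull density.
stratum_term <- function(a, b, shape, scale) {
  f  <- function(x) dweibull(x, shape = shape, scale = scale)
  W  <- integrate(f, a, b)$value                            # stratum weight
  m1 <- integrate(function(x) x * f(x), a, b)$value / W     # stratum mean
  m2 <- integrate(function(x) x^2 * f(x), a, b)$value / W   # stratum second moment
  W * sqrt(max(m2 - m1^2, 0))                               # guard against tiny negatives
}

# Assumed objective for two strata split at `cut` on the range [x0, x0 + d].
objective <- function(cut, x0, d, shape, scale) {
  stratum_term(x0, cut, shape, scale) + stratum_term(cut, x0 + d, shape, scale)
}

x0 <- 1.5   # initial value of the population
d  <- 33    # distance (range) of the population

# One-dimensional search for the boundary; a stand-in for the DP procedure.
opt <- optimize(objective, interval = c(x0 + 1e-3, x0 + d - 1e-3),
                x0 = x0, d = d, shape = 2.15, scale = 13.5)
opt$minimum    # candidate stratum boundary
opt$objective  # value of the assumed objective at that boundary
```

With more than two strata the search becomes multi-dimensional, which is where a recursive procedure such as dynamic programming becomes attractive.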

## Usage

```r
strata.distr(
  h,
  initval,
  dist,
  distr = c("pareto", "triangle", "rtriangle", "weibull", "gamma", "exp", "unif",
    "norm", "lnorm", "cauchy"),
  params = c(shape = 0, scale = 0, rate = 0, gamma = 0, location = 0, mean = 0,
    sd = 0, meanlog = 0, sdlog = 0, min = 0, max = 0, mode = 0),
  n,
  N,
  cost = FALSE,
  ch = NULL
)
```

## Arguments

| Argument | Description |
|---|---|
| `h` | A numeric: denotes the number of strata to be created. |
| `initval` | A numeric: denotes the initial value of the population. |
| `dist` | A numeric: denotes the distance (or range) of the population. |
| `distr` | A character: denotes the name of the distribution that characterizes the population. |
| `params` | A list: contains the values of all parameters of the distribution. |
| `n` | A numeric: denotes the fixed total sample size. |
| `N` | A numeric: denotes the fixed total population size. |
| `cost` | A logical: defaults to `cost=FALSE`. For a stratum-cost problem, set `cost=TRUE` and provide the `ch` argument. |
| `ch` | A numeric: denotes a vector of stratum costs. |

## Value

`strata.distr` returns the Optimum Strata Boundaries (OSB), stratum weights (Wh), stratum costs (Ch), stratum variances (Vh), Optimum Sample Sizes (nh), and stratum population sizes (Nh).
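As a rough illustration of how the returned quantities relate to each other, the sketch below computes stratum sample sizes from stratum weights, variances, and optional costs. It assumes the textbook optimal allocation for a fixed total sample size, with nh proportional to Wh * sqrt(Vh) / sqrt(ch), which reduces to Neyman allocation when costs are equal. Whether `strata.distr` uses exactly this rule is an assumption, the helper `allocate` is hypothetical, and the input numbers are made up for illustration.

```r
# Hypothetical helper, not part of stratifyR: textbook allocation for fixed n.
allocate <- function(n, Wh, Vh, ch = NULL) {
  if (is.null(ch)) ch <- rep(1, length(Wh))  # equal costs give Neyman allocation
  score <- Wh * sqrt(Vh) / sqrt(ch)          # W_h * sigma_h / sqrt(c_h)
  n * score / sum(score)                     # shares sum to n (not rounded)
}

Wh <- c(0.6, 0.4)   # illustrative stratum weights
Vh <- c(4, 25)      # illustrative stratum variances

nh_equal <- allocate(500, Wh, Vh)                # no stratum costs
nh_cost  <- allocate(500, Wh, Vh, ch = c(1, 4))  # second stratum is costlier
nh_equal
nh_cost
```

Raising the cost of a stratum shifts sample away from it, which is why a cost vector is needed whenever `cost=TRUE`.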

## Author(s)

Karuna Reddy <karuna.reddy@usp.ac.fj>
MGM Khan <khan_mg@usp.ac.fj>

## See Also

`strata.data`

## Examples

```r
## Not run:
library(stratifyR)

# Assume the data has an initial value of 1.5, a distance of 33, and follows a
# weibull distribution with estimated parameters shape=2.15 and scale=13.5.
# To compute the OSB, OSS, etc. with fixed sample n=500, we use:
res <- strata.distr(h=2, initval=1.5, dist=33, distr = "weibull",
                    params = c(shape=2.15, scale=13.5), n=500, N=2000, cost=FALSE)
summary(res)
#-------------------------------------------------------------
# Assume the data has an initial value of 1, a distance of 10415, and follows a
# lnorm distribution with estimated parameters meanlog=5.5 and sdlog=1.5.
# To compute the OSB, OSS, etc. with fixed sample n=500, we use:
res <- strata.distr(h=2, initval=1, dist=10415, distr = "lnorm",
                    params = c(meanlog=5.5, sdlog=1.5), n=500, N=12000)
summary(res)
#-------------------------------------------------------------
# Assume the data has an initial value of 0.65, a distance of 68, and follows a
# gamma distribution with estimated parameters shape=3.8 and rate=0.55.
# To compute the OSB, OSS, etc. with fixed sample n=500, we use:
res <- strata.distr(h=2, initval=0.65, dist=68, distr = "gamma",
                    params = c(shape=3.8, rate=0.55), n=500, N=10000)
summary(res)
#-------------------------------------------------------------
# The function can be used dynamically to visualize the strata boundaries,
# for 2 strata, over the density (or observations) of the "mag" variable
# from the quakes data (with the purrr and ggplot2 packages loaded).
library(purrr)    # provides the %>% pipe
library(ggplot2)
res <- strata.distr(h=2, initval=4, dist=2.4, distr = "lnorm",
                    params = c(meanlog=1.52681032, sdlog=0.08503554), n=300, N=1000)
quakes %>%
  ggplot(aes(x = mag)) +
  geom_density(fill = "blue", colour = "black", alpha = 0.3) +
  geom_vline(xintercept = res$OSB, linetype = "dotted", color = "red")
#-------------------------------------------------------------

## End(Not run)
```