In ksauby/ACS: Adaptive Cluster Sampling

View source: R/y_HT.R

y_HT	R Documentation

Calculate the Horvitz-Thompson mean of an adaptive cluster sample.

Description

This function calculates the Horvitz-Thompson mean of an adaptive cluster sample drawn by sampling without replacement.

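\hat{\mu}_{HT} = \frac{1}{N} \sum_{k=1}^{v} \frac{y_k J_k}{\pi_k}

where v is the number of distinct units in the sample, \pi_k is the inclusion probability of the k^{th} unit, and J_k is an indicator variable equal to 0 if the k^{th} unit in the sample does not satisfy the condition and was not selected in the initial sample; otherwise, J_k = 1.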
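
The estimator described above can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's source: the helper name `ht_mean_sketch`, its `initial` argument, and the toy data are hypothetical, and the inclusion probability is assumed to take the form given by Thompson (1990) for an initial simple random sample without replacement.

```r
# Illustrative sketch only; not the package's source code.
# Assumes an SRSWOR initial sample, so pi_k = 1 - choose(N - m_k, n1) / choose(N, n1).
ht_mean_sketch <- function(y, N, n1, m, initial, criterion = 0) {
  # Inclusion probability of each unit, driven by the size m of its network
  pi_k <- 1 - choose(N - m, n1) / choose(N, n1)
  # J_k = 1 if the unit exceeds the criterion or was in the initial sample
  J <- as.numeric(y > criterion | initial)
  sum(y * J / pi_k) / N
}

# Toy data (hypothetical): 4 distinct sampled units from a population of N = 10
y       <- c(0, 4, 6, 1)                 # observed values
m       <- c(1, 2, 2, 1)                 # network sizes
initial <- c(TRUE, TRUE, FALSE, FALSE)   # selected in the initial sample?

est <- ht_mean_sketch(y, N = 10, n1 = 2, m = m, initial = initial, criterion = 2)
print(est)
```

In this toy sample the fourth unit is an edge unit: it was added adaptively and does not exceed the criterion, so its indicator is 0 and it contributes nothing to the estimate.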

Usage

y_HT(
  y,
  N,
  n1,
  pi_i_values = NULL,
  m_vec = NULL,
  sampling = NULL,
  criterion = NULL
)

Arguments

`y`	The variable of interest, y. Must be a numeric vector. The criterion that determines whether adaptive cluster sampling takes place is based on this variable.
`N`	Population size.
`n1`	An integer giving the initial sample size (e.g., a simple random sample).
`pi_i_values`	Vector of inclusion probabilities, supplied when they have been calculated outside this function. Default is `NULL`.
`m_vec`	Vector of values m for the set of units in a sample, of length n1. Each m value within the vector `m_vec` denotes the number of units satisfying the ACS criterion for the network i to which the unit belongs.
`sampling`	A vector (`character` format) describing whether units were included in the initial sample or subsequent ACS sample. Units selected in the initial sample should be given the value "Initial_Sample" in the `sampling` vector.
`criterion`	Numeric threshold value of the variable of interest y (whose name in the data frame `popdata` is supplied via the `yvar` argument of `createACS`) that initiates ACS. Defaults to 0 (i.e., anything greater than 0 initiates adaptive cluster sampling).
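
When `pi_i_values` is left as `NULL`, the inclusion probabilities have to be derived from `N`, `n1`, and `m_vec`. For an initial simple random sample without replacement, Thompson (1990) gives the form sketched below. That this function uses exactly this expression internally is an assumption, not something stated on this page.

```latex
\pi_k = 1 - \binom{N - m_k}{n_1} \bigg/ \binom{N}{n_1}
```

For instance, with N = 10, n_1 = 2, and a unit in a network of size m_k = 2, this gives 1 - 28/45 = 17/45, or about 0.378.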

Value

The Horvitz-Thompson mean.

References

Thompson, S. K. (1990). Adaptive cluster sampling. Journal of the American Statistical Association, 85(412), 1050–1059.

Examples

library(magrittr)
library(plyr)
library(dplyr)
library(ggplot2)

# EXAMPLE 1: Sampling of population from Figure 1, Thompson (1990)

data(Thompson1990Fig1Pop)
data(Thompson1990Figure1Sample)

# plot sample overlaid onto population
ggplot() +
	geom_point(data=Thompson1990Fig1Pop, aes(x,y, size=factor(y_value),
		shape=factor(y_value))) +
	scale_shape_manual(values=c(1, rep(16, length(2:13)))) +
	geom_point(data=Thompson1990Figure1Sample, aes(x,y), shape=0, size=7)

Z <- createACS(
	popdata = Thompson1990Fig1Pop,
	n1 = nrow(Thompson1990Figure1Sample),
	yvar = "y_value",
	initsample = Thompson1990Figure1Sample
)
# CALCULATE y_HT
y_HT(
	N = nrow(Thompson1990Fig1Pop),
	n1 = nrow(Thompson1990Figure1Sample),
	m_vec = Z$m,
	y = Z$y_value,
	sampling = Z$Sampling,
	criterion = 0
)

# EXAMPLE 2: Table 1 from Thompson (1990)
#data(Thompson1990Table1data)
#(Thompson1990Table1 = Thompson1990Table1data %>%
#group_by(sampling_effort) %>%
#summarise(
# 	`y (added through SRSWOR)` = toString(y_value[which(sampling=="SRSWOR")]),
#	`y (added through ACS)` = toString(y_value[which(sampling=="ACS")]),
#	y_bar_1 = mean(y_value[which(sampling=="SRSWOR")]),
#	y_HT = round(y_HT(N=dim(Thompson1990Fig1Pop)[1], n1=2, m_vec=m, y=y_value, sampling, 5), 2),
#	y_bar = round(mean(y_value),2)
#	)
#)

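# EXAMPLE 3: Ch. 24, Exercise #2, p. 307, from Thompson (2002)
# data(cactus_realizations)
# realization = cactus_realizations %>% filter(n.networks==40)
N <- 1000
n1 <- 100
m_vec <- c(2, 3, rep(1, 98))
y <- c(3, 6, rep(0, 98))
sampling <- "SRSWOR"
criterion <- 0
# Name every argument: the signature is y_HT(y, N, n1, pi_i_values, ...),
# so positional matching would bind these values to the wrong parameters.
# Multiplying the estimated mean by N gives the estimated population total.
round(
	y_HT(y = y, N = N, n1 = n1, m_vec = m_vec,
		sampling = sampling, criterion = criterion) * N, 0
)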
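
As a hand check on Example 3, the same population total can be sketched in base R without calling `y_HT`. The sketch assumes the SRSWOR inclusion probability 1 - choose(N - m, n1) / choose(N, n1) from Thompson (1990) and treats every unit as part of the initial sample. Whether `y_HT` computes its probabilities in exactly this way is an assumption, and the `_check` names are hypothetical.

```r
# Hand check of Example 3 (illustrative; the `_check` names are hypothetical)
N_check  <- 1000
n1_check <- 100
m_check  <- c(2, 3, rep(1, 98))
y_check  <- c(3, 6, rep(0, 98))

# Assumed SRSWOR inclusion probabilities; lchoose() avoids huge intermediates
pi_check <- 1 - exp(lchoose(N_check - m_check, n1_check) -
                    lchoose(N_check, n1_check))

# Horvitz-Thompson estimate of the population total (estimated mean times N)
total_check <- sum(y_check / pi_check)
print(round(total_check, 0))
```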

ksauby/ACS documentation built on Aug. 18, 2022, 3:33 a.m.