R/VarSYGHT.R
In TeachingSampling: Selection of Samples and Parameter Estimation in Finite Population

#' @export
#' 
#' @title
#' Two different varaince estimators for the Horvitz-Thompson estimator
#' @description 
#' This function estimates the variance of the Horvitz-Thompson estimator.
#' Two different variance estimators are computed: the original one, due to Horvitz-Thompson
#' and the one due to Sen (1953) and Yates, Grundy (1953). 
#' The two approaches yield unbiased estimator under fixed-size sampling  schemes. 
#' @return 
#' This function returns a data frame of every possible sample in 
#' within a sampling support, with its corresponding variance estimates.
#' @details
#' The function returns two variance estimator for every possible sample
#' within a fixed-size sampling support. 
#' The first estimator is due to Horvitz-Thompson and is given by the following expression:
#' \deqn{\widehat{Var}_1(\hat{t}_{y,\pi}) = \sum_{k \in U}\sum_{l\in U}\frac{\Delta_{kl}}{\pi_{kl}}\frac{y_k}{\pi_k}\frac{y_l}{\pi_l}}
#' The second estimator is due to Sen (1953) and Yates-Grundy (1953). It is given by the following expression:
#' \deqn{\widehat{Var}_2(\hat{t}_{y,\pi}) = -\frac{1}{2}\sum_{k \in U}\sum_{l\in U}\frac{\Delta_{kl}}{\pi_{kl}}(\frac{y_k}{\pi_k} - \frac{y_l}{\pi_l})^2}
#' @author Hugo Andres Gutierrez Rojas <hagutierrezro at gmail.com> 
#' @param y Vector containing the information of the characteristic of interest 
#' for every unit in the population.
#' @param N Population size.
#' @param n Sample size.
#' @param p A vector containing the selection probabilities of a fixed size without replacement sampling design. 
#' The sum of the values of this vector must be one.
#' 
#' @references 
#' Sarndal, C-E. and Swensson, B. and Wretman, J. (1992), \emph{Model Assisted Survey Sampling}. Springer.\cr
#' Gutierrez, H. A. (2009), \emph{Estrategias de muestreo: Diseno de encuestas 
#' y estimacion de parametros}. Editorial Universidad Santo Tomas.
#' 
#' @examples
#' 
#' # Example 1
#' # Without replacement sampling
#' # Vector U contains the label of a population of size N=5
#' U <- c("Yves", "Ken", "Erik", "Sharon", "Leslie")
#' # Vector y1 and y2 are the values of the variables of interest
#' y1<-c(32, 34, 46, 89, 35)
#' y2<-c(1,1,1,0,0)
#' # The population size is N=5
#' N <- length(U)
#' # The sample size is n=2
#' n <- 2
#' # p is the probability of selection of every possible sample
#' p <- c(0.13, 0.2, 0.15, 0.1, 0.15, 0.04, 0.02, 0.06, 0.07, 0.08)
#' 
#' # Calculates the estimated variance for the HT estimator
#' VarSYGHT(y1, N, n, p)
#' VarSYGHT(y2, N, n, p)
#'
#' # Unbiasedness holds in the estimator of the total
#' sum(y1)
#' sum(VarSYGHT(y1, N, n, p)$p * VarSYGHT(y1, N, n, p)$Est.HT)
#' sum(y2)
#' sum(VarSYGHT(y2, N, n, p)$p * VarSYGHT(y2, N, n, p)$Est.HT)
#' 
#' # Unbiasedness also holds in the two variances
#' VarHT(y1, N, n, p)
#' sum(VarSYGHT(y1, N, n, p)$p * VarSYGHT(y1, N, n, p)$Est.Var1)
#' sum(VarSYGHT(y1, N, n, p)$p * VarSYGHT(y1, N, n, p)$Est.Var2)
#' 
#' VarHT(y2, N, n, p)
#' sum(VarSYGHT(y2, N, n, p)$p * VarSYGHT(y2, N, n, p)$Est.Var1)
#' sum(VarSYGHT(y2, N, n, p)$p * VarSYGHT(y2, N, n, p)$Est.Var2)
#' 
#' # Example 2: negative variance estimates
#' 
#' x = c(2.5, 2.0, 1.1, 0.5)
#' N = 4
#' n = 2
#' p = c(0.31, 0.20, 0.14, 0.03, 0.01, 0.31)
#'
#' VarSYGHT(x, N, n, p)
#' 
#' # Unbiasedness holds in the estimator of the total
#' sum(x)
#' sum(VarSYGHT(x, N, n, p)$p * VarSYGHT(x, N, n, p)$Est.HT)
#' 
#' # Unbiasedness also holds in the two variances
#' VarHT(x, N, n, p)
#' sum(VarSYGHT(x, N, n, p)$p * VarSYGHT(x, N, n, p)$Est.Var1)
#' sum(VarSYGHT(x, N, n, p)$p * VarSYGHT(x, N, n, p)$Est.Var2)

VarSYGHT <- function (y, N, n, p) 
{
  Ind <- Ik(N, n)
  pi1 <- as.matrix(Pik(p, Ind))
  pi2 <- Pikl(N, n, p)
  Delta <- Deltakl(N, n, p)
  y <- t(as.matrix(y))
  ykylexp <- t(y/pi1) %*% (y/pi1)
  A <- (Delta/pi2) * (ykylexp)
  Q <- nrow(Ind)
  MatDif <- matrix(NA, nrow = N, ncol = N)
  for(k in 1:N){
    for(l in 1:N){
      MatDif[k, l] <- (y[k]/pi1[k] - y[l]/pi1[l])^2 
    }
  }
  B <- (Delta/pi2) * MatDif
  
  Est.Var1 = Est.Var2 = Est.HT =NULL
  for(i in 1:Q){
    index = which(Ind[i,] != 0)
    Est.HT[i] = HT(y[index], pi1[index])
    Est.Var1[i] = sum(A[index, index])
    Est.Var2[i] = - (1/2) * sum(B[index, index])
  }
  Resultado <- data.frame(I = Ind, p = p, Est.HT = Est.HT, Est.Var1 = Est.Var1, Est.Var2 = Est.Var2)
  return(Resultado)
}

Any scripts or data that you put into this service are public.

TeachingSampling documentation built on April 22, 2020, 1:05 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

TeachingSampling
Selection of Samples and Parameter Estimation in Finite Population

R/VarSYGHT.R
In TeachingSampling: Selection of Samples and Parameter Estimation in Finite Population

Try the TeachingSampling package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

TeachingSampling Selection of Samples and Parameter Estimation in Finite Population

R/VarSYGHT.R In TeachingSampling: Selection of Samples and Parameter Estimation in Finite Population

Try the TeachingSampling package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

TeachingSampling
Selection of Samples and Parameter Estimation in Finite Population

R/VarSYGHT.R
In TeachingSampling: Selection of Samples and Parameter Estimation in Finite Population