reSamGF: Gross flows variance estimation.
In WaceroRuge/GFE: Gross Flows Estimation under Complex Surveys


Gross flows variance estimation using a resampling method (Bootstrap or Jackknife).

reSamGF(sampleBase = NULL, nRepBoot = 500, model = "I", niter = 100, type = "Bootstrap", colWeights = NULL, nonrft = FALSE)

`sampleBase`	An object of class data.frame or data.table containing the sample selected to estimate the gross flows.
`nRepBoot`	The number of replicates for the bootstrap method.
`model`	A character indicating the model that will be used to estimate the gross flows. The available models are: 'I','II','III','IV'.
`niter`	The number of iterations for the η_{i} and p_{ij} model parameters.
`type`	A character indicating the resampling method ("Bootstrap" or "Jackknife").
`colWeights`	The name of the data column containing the sampling weights to be used in the fitting process.
`nonrft`	A logical value indicating whether there is nonresponse at the first time period.

The resampling methods for variance estimation are:

Bootstrap: This technique estimates the sampling distribution of almost any statistic by random resampling. Properties of a statistic (such as its variance) are estimated by measuring them on samples drawn with replacement from the observed sample, which stands in for the population.
Jackknife: The jackknife estimate of a parameter is found by systematically leaving out each observation from the dataset, recomputing the estimate, and then averaging these calculations. Given a sample of size n, the jackknife estimate aggregates the estimates from each of the n sub-samples of size n-1.
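The two resampling schemes described above can be sketched on a much simpler statistic than the gross-flows parameters. The following is a minimal illustration in base R, assuming a simple random sample and the sample mean as the statistic; the data `x`, the replicate count `B`, and all variable names are hypothetical, and this is not the code that reSamGF runs internally.

```r
# Illustrative sketch only: resampling variance of a sample mean,
# not the gross-flows estimator fitted by reSamGF.
set.seed(123)
x <- rnorm(50, mean = 45, sd = 10)  # hypothetical data
n <- length(x)

# Bootstrap: draw B samples of size n with replacement, recompute the
# statistic on each, and take the variance of the replicates.
B <- 500
bootReps <- replicate(B, mean(sample(x, size = n, replace = TRUE)))
varBoot  <- var(bootReps)

# Jackknife: leave out one observation at a time, recompute the statistic,
# and scale the squared deviations of the n replicates by (n - 1) / n.
jackReps <- sapply(seq_len(n), function(i) mean(x[-i]))
varJack  <- (n - 1) / n * sum((jackReps - mean(jackReps))^2)

# Compare both with the textbook variance of the mean, var(x) / n.
c(bootstrap = varBoot, jackknife = varJack, analytic = var(x) / n)
```

For the sample mean, the jackknife variance coincides with var(x) / n, while the bootstrap value fluctuates around it depending on the number of replicates.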

reSamGF returns a list that contains the variance of each parameter of the selected model.

Efron, B. (1979), ‘Computers and the theory of statistics: Thinking the unthinkable’, SIAM Review 21(4), pp. 460-480.
Quenouille, M. H. (1949), ‘Problems in plane sampling’, The Annals of Mathematical Statistics pp. 355-375.
Tukey, J. W. (1958), ‘Bias and confidence in not-quite large samples’, The Annals of Mathematical Statistics 29, p. 614.

library(GFE)  # provides round_preserve_sum, createBase, estGF and reSamGF
library(TeachingSampling)
library(data.table)
# Colombia's electoral candidates in 2014
candidates_t0 <- c("Clara","Enrique","Santos","Martha","Zuluaga","Blanco", "NoVoto")
candidates_t1 <- c("Santos","Zuluaga","Blanco", "NoVoto")

N 	   <- 100000
nCanT0  <- length(candidates_t0)
nCanT1  <- length(candidates_t1)

# Initial probabilities 
eta <- matrix(c(0.10, 0.10, 0.20, 0.17, 0.28, 0.1, 0.05),
			   byrow = TRUE, nrow = nCanT0)
# Transition probabilities
P <- matrix(c(0.10, 0.60, 0.15, 0.15,
			  0.30, 0.10, 0.25,	0.35,
			  0.34, 0.25, 0.16, 0.25,
			  0.25,	0.05, 0.35,	0.35,
			  0.10, 0.25, 0.45,	0.20,
			  0.12, 0.36, 0.22, 0.30,
			  0.10,	0.15, 0.30,	0.45),
	 byrow = TRUE, nrow = nCanT0)

citaMod <- matrix(NA, ncol = nCanT1, nrow = nCanT0)
row.names(citaMod) <- candidates_t0
colnames(citaMod) <- candidates_t1

for(ii in 1:nCanT0){ 
 citaMod[ii,] <- c(rmultinom(1, size = N * eta[ii,], prob = P[ii,]))
}

# Model I
psiI   <- 0.9
rhoRRI <- 0.9
rhoMMI <- 0.5

citaModI <- matrix(nrow = nCanT0 + 1, ncol = nCanT1 + 1)
rownames(citaModI) <- c(candidates_t0, "Non_Resp")
colnames(citaModI) <- c(candidates_t1, "Non_Resp")

citaModI[1:nCanT0, 1:nCanT1] 		 <- P * c(eta) * rhoRRI * psiI  
citaModI[(nCanT0 + 1), (nCanT1 + 1)]  <- rhoMMI * (1-psiI) 
citaModI[1:nCanT0, (nCanT1 + 1)]  	 <-  (1-rhoRRI) * psiI * rowSums(P * c(eta))
citaModI[(nCanT0 + 1), 1:nCanT1 ] 	 <-  (1-rhoMMI) * (1-psiI) * colSums(P * c(eta))
citaModI   <- round_preserve_sum(citaModI * N)
DBcitaModI <- createBase(citaModI)

# Creating auxiliary information
DBcitaModI[,AuxVar := rnorm(nrow(DBcitaModI), mean = 45, sd = 10)]
# Selects a sample with unequal probabilities
res <- S.piPS(n = 1200, as.data.frame(DBcitaModI)[,"AuxVar"])
sam <- res[,1]
pik <- res[,2]
DBcitaModISam <- copy(DBcitaModI[sam,])
DBcitaModISam[,Pik := pik]

# Gross flows estimation
estima <- estGF(sampleBase = DBcitaModISam, niter = 500, model = "II", colWeights = "Pik")
# Gross flows variance estimation
varEstima <- reSamGF(sampleBase = DBcitaModISam, type = "Bootstrap", nRepBoot = 100,
						model = "II", niter = 101,  colWeights = "Pik")
varEstima