R/rss.prop.sampling.R

Defines functions rss.prop.sampling

Documented in rss.prop.sampling

#' @details This function performs balanced or unbalanced ranked set sampling for proportions from a given data set. The length of the sample allocation vector (nsamp) must match the set size (H). The subject ID and auxiliary variable (X) must have the same length.
#' @title Generate ranked set samples for proportions
#' @name rss.prop.sampling
#' @description The rss.prop.sampling function generates ranked set samples for proportions by performing ranked set sampling directly on a given population data set using an auxiliary variable (X) and subject IDs.
#'
#' @param ID A numeric vector of subject IDs from the population. IDs must be unique.
#' @param X A numeric vector of auxiliary variable used for ranking. Must have the same length as ID.
#' @param H The RSS set size
#' @param nsamp A numeric vector specifying the sample allocation for each stratum.
#'
#' @return A data frame with the following columns:
#' \item{ID}{The sampled subjects' IDs.}
#' \item{rank}{The rank information assigned to each sample.}
#' @examples
#' ## Example 1: Balanced RSS with equal sample sizes.
#' data(iris)
#' id=1:nrow(iris)
#' X=ifelse(iris$Sepal.Length<5.8,0,1)
#' rss.prop.data=rss.prop.sampling(ID=id, X=X, H=3,nsamp=c(6,6,6))
#'
#' ## Example 2: Unbalanced RSS with different sample sizes.
#' rss.prop.data=rss.prop.sampling(ID=id, X=X, H=3, nsamp=c(6,10,8))
#'
#' # Check the structure of the RSS data
#' colnames(rss.prop.data) # include "ID", "rank", and "Y"
#' head(rss.prop.data$ID)
#' head(rss.prop.data$rank)
#'
#' @export
rss.prop.sampling <- function(ID, X, H, nsamp)
{
  if (any(duplicated(ID))) stop("ID must be unique.", call. = FALSE)
  if(length(ID)!= length(X)) stop("ID and X must have the same length.", call.=F)

  n=sum(nsamp)
  data=matrix(0,n,2)

  if(H != length(nsamp)) stop("Set size are different with the length of sample allocations", call. = F)

  idx=sample(ID,H*n,replace=F)
  ind=NULL
  for(i in 1:length(idx)){
      ind=c(ind, which(idx[i]==ID))
  }

  X.sel <- matrix(X[ind],ncol=H)
  ID.sel <- matrix(ID[ind],ncol=H)

  for(h in (1:H)){
    if(nsamp[h]!=0){
      for(i in (1:nsamp[h])){
        k=i
        if(h>1){
          k=i+sum(nsamp[1:(h-1)])
        }
        rdata=rank(X.sel[k,],ties.method='first')

        data[k,1]=h
        data[k,2]=ID.sel[k,rdata==h]
      }
    }
  }

  colnames(data) <- c("rank","ID")
  data<-as.data.frame(data)
  return(data)
}

Try the generalRSS package in your browser

Any scripts or data that you put into this service are public.

generalRSS documentation built on April 4, 2025, 12:19 a.m.