knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
The aim of this document is to illustrate how to automatically generate some types of sample probabilities using functions runChecksOnSelectionAndProbs
and applyGenerateProbs
of the RDBEScore
package.
library(RDBEScore)
RDBEScore
currently only provides probability generation for the selection methods "SRSWR", "SRSWOR" and "CENSUS". Also, to automatically generate probabilities, numTotal
and numSamp
must be declared in every sampling table. The functions are also only configured to handle cases where clustering == "N". If that is not the case, the data must be changed before running the functions. Such changes constitute significant assumptions that should be left clearly visible in the data preparation section of any estimation script.
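The prerequisites listed above can be expressed as a small pre-check on a toy table. This is only a sketch in base R: the helper check_sampling_table and the toy data are hypothetical, and it is not the logic used by runChecksOnSelectionAndProbs, whose actual checks may differ.

```r
# Hypothetical pre-check on a toy sampling table. Column names mimic the
# RDBES naming convention (table prefix + field name); this is an
# illustration, not the RDBEScore implementation.
check_sampling_table <- function(tbl, prefix) {
  supported <- c("SRSWR", "SRSWOR", "CENSUS")
  col <- function(suffix) tbl[[paste0(prefix, suffix)]]
  problems <- character(0)
  if (!all(col("selectMeth") %in% supported)) {
    problems <- c(problems, "unsupported selection method")
  }
  if (anyNA(col("numTotal")) || anyNA(col("numSamp"))) {
    problems <- c(problems, "numTotal or numSamp missing")
  }
  if (!all(col("clustering") %in% "N")) {
    problems <- c(problems, "clustering is not 'N'")
  }
  problems
}

# Toy data: the third row uses a placeholder method name ("OTHER") and has
# no numTotal, so two problems are reported.
toy_vs <- data.frame(
  VSselectMeth = c("SRSWR", "SRSWR", "OTHER"),
  VSnumTotal   = c(30, 30, NA),
  VSnumSamp    = c(5, 5, 4),
  VSclustering = c("N", "N", "N"),
  stringsAsFactors = FALSE
)
check_sampling_table(toy_vs, "VS")
```

An empty character vector means the toy table passes all three conditions.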
First we'll load some example data from the RDBES and check that it is valid. It is good practice to check that your RDBESDataObjects are valid after any manipulation you perform. To import your own data, see the vignette Import RDBES data. This vignette uses the example data that ships with RDBEScore.
# load some H1 test data
myH1DataObject <- H1Example
# filter data for DEstratumName == "DE_stratum1_H1" to make the object smaller and easier to handle
myH1DataObject <- filterAndTidyRDBESDataObject(myH1DataObject,
                                               c("DEstratumName"),
                                               c("DE_stratum1_H1"),
                                               killOrphans = TRUE)
The functions that generate probabilities do not yet deal with lower hierarchies A and B, so we rework the data a bit so that it looks like lower hierarchy C.
# Temp fixes to change data to lower hierarchy C - function won't deal with A, or B yet
myH1DataObject[["BV"]] <- dplyr::distinct(myH1DataObject[["BV"]], FMid, .keep_all = TRUE)
temp <- dplyr::left_join(myH1DataObject[["BV"]][,c("BVid","FMid")],
                         myH1DataObject[["FM"]][,c("FMid","SAid")],
                         by = "FMid")
myH1DataObject[["BV"]]$SAid <- temp$SAid
myH1DataObject[["BV"]]$FMid <- NA
myH1DataObject[["SA"]]$SAlowHierarchy <- "C"
myH1DataObject[["BV"]]$BVnumTotal <- 10
myH1DataObject[["BV"]]$BVnumSamp <- 10
# reworking stratification of VS table
myH1DataObject[["VS"]][VSencrVessCode %in% c("VDcode_5","VDcode_8","VDcode_9")]$VSstratumName <- "VS_stratum1"
myH1DataObject[["VS"]][VSencrVessCode %in% c("VDcode_5","VDcode_8","VDcode_9")]$VSnumTotal <- 30
myH1DataObject[["VS"]][VSencrVessCode %in% c("VDcode_6","VDcode_7","VDcode_10")]$VSstratumName <- "VS_stratum2"
myH1DataObject[["VS"]][VSstratumName %in% "VS_stratum1",]$VSnumSamp <- 5
myH1DataObject[["VS"]][VSstratumName %in% "VS_stratum2",]$VSnumSamp <- 4
# reworking FT table
myH1DataObject[["FT"]]$FTselectMeth <- "SRSWOR"
tmp <- myH1DataObject[["FT"]]
tmp$VSencrVessCode <- myH1DataObject[["VS"]]$VSencrVessCode[match(myH1DataObject[["FT"]]$VSid, myH1DataObject[["VS"]]$VSid)]
tmp$FTnumSamp <- as.integer(table(tmp$VSencrVessCode))[match(tmp$VSencrVessCode, names(table(tmp$VSencrVessCode)))]
tmp[tmp$VSencrVessCode %in% c("VDcode_5"),]$FTnumTotal <- 100
tmp[tmp$VSencrVessCode %in% c("VDcode_6"),]$FTnumTotal <- 50
tmp[tmp$VSencrVessCode %in% c("VDcode_7"),]$FTnumTotal <- 25
tmp[tmp$VSencrVessCode %in% c("VDcode_8"),]$FTnumTotal <- 80
tmp[tmp$VSencrVessCode %in% c("VDcode_9"),]$FTnumTotal <- 70
tmp[tmp$VSencrVessCode %in% c("VDcode_10"),]$FTnumTotal <- 60
tmp$VSencrVessCode <- NULL
tmp$FTstratumName <- "U"
tmp$FTstratification <- "N"
myH1DataObject[["FT"]] <- tmp
myH1DataObject[["FO"]]$FOselectMeth <- "SRSWOR"
myH1DataObject$SA$SAselectMeth <- "SRSWOR"
myH1DataObject$SS$SSselectMeth <- "SRSWOR"
myH1DataObject$BV$BVselectMeth <- "SRSWOR"
# confirm validity
validateRDBESDataObject(myH1DataObject)
The final data contains 10 ages in each of 243 hauls sampled from 81 trips done by 9 selected vessels.
Examining the selection methods used in the VS table shows that the 9 vessels were selected with replacement (SRSWR) from two strata: one stratum with a total of 30 vessels (VS_stratum1) and one with a total of 60 vessels (VS_stratum2). It is also noticeable that selection and inclusion probabilities were not declared during upload.
unique(myH1DataObject[["VS"]][,c("VSstratification","VSstratumName","VSselectMeth", "VSnumTotal","VSnumSamp","VSselProb","VSincProb")])
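As a hand check on what to expect for the VS table, the standard textbook formulas for simple random sampling with replacement can be applied to the two strata described above (30 and 60 vessels, with 5 and 4 draws as set in the data preparation). The helper names below are hypothetical, and this is a sketch of the general formulas only; RDBEScore may compute, store or round the values differently.

```r
# Textbook formulas for SRSWR, with N = numTotal and n = numSamp.
# Illustration only, not the RDBEScore implementation.
sel_prob_srswr <- function(N) 1 / N                  # probability of selection on a single draw
inc_prob_srswr <- function(N, n) 1 - (1 - 1 / N)^n   # probability of being drawn at least once

# Stratum sizes taken from the description of the VS table
strata <- data.frame(
  VSstratumName = c("VS_stratum1", "VS_stratum2"),
  VSnumTotal    = c(30, 60),
  VSnumSamp     = c(5, 4)
)
strata$selProb <- sel_prob_srswr(strata$VSnumTotal)
strata$incProb <- inc_prob_srswr(strata$VSnumTotal, strata$VSnumSamp)
strata
```

Under sampling with replacement the inclusion probability is slightly smaller than n/N, because the same vessel can be drawn more than once.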
With regard to trips, these were selected without replacement (SRSWOR). Either 9 or 18 trips were selected from each vessel. The total number of trips registered by individual vessels ranges between 25 and 100. We also see that selection and inclusion probabilities were not declared.
unique(myH1DataObject[["FT"]][,c("VSid","FTstratification","FTstratumName","FTselectMeth", "FTnumTotal","FTnumSamp","FTselProb","FTincProb")])
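For trips selected by SRSWOR, the textbook inclusion probability is simply n/N. The sketch below applies it to the FTnumTotal values assigned in the data preparation; since the text does not say which vessels have 9 sampled trips and which have 18, both cases are shown side by side. The object names are hypothetical, and the numbers are a hand check rather than output from RDBEScore.

```r
# Inclusion probability under SRSWOR: n / N (textbook formula, illustration only)
inc_prob_srswor <- function(N, n) n / N

# FTnumTotal values assigned per vessel in the data preparation step
ft_num_total <- c(100, 50, 25, 80, 70, 60)

ft_check <- data.frame(
  FTnumTotal  = ft_num_total,
  incProb_n9  = inc_prob_srswor(ft_num_total, 9),   # if 9 trips were sampled
  incProb_n18 = inc_prob_srswor(ft_num_total, 18)   # if 18 trips were sampled
)
ft_check
```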
With regard to hauls, the example data indicates that 20 were done in every trip(!), of which 3 were sampled. Not very likely data, but good enough for demonstration purposes. Here too, selection and inclusion probabilities were not declared.
table(myH1DataObject[["FO"]]$FTid)
unique(myH1DataObject[["FO"]][,c("FOid","FOstratification","FOstratumName","FOselectMeth",
                                 "FOnumTotal","FOnumSamp","FOselProb","FOincProb")])
generateProbs
To generate probabilities for one of the tables choose what type of probabilities you want to generate ("selection" or "inclusion") and run generateProbs
.
Note: To check the data for some issues related to selection methods and probabilities, you can run the function runChecksOnSelectionAndProbs
. In general this is not necessary, because running applyGenerateProbs
with its defaults already includes a call to runChecksOnSelectionAndProbs
.
myH1DataObject_uptde <- myH1DataObject
myH1DataObject_uptde[["VS"]] <- generateProbs(myH1DataObject[["VS"]], probType = "inclusion")
# display changes
myH1DataObject_uptde[["VS"]][,c("VSstratification","VSstratumName","VSselectMeth",
                                "VSnumTotal","VSnumSamp","VSselProb","VSincProb")]
myH1DataObject_uptde[["VS"]] <- generateProbs(myH1DataObject_uptde[["VS"]], probType = "selection")
# display changes
myH1DataObject_uptde[["VS"]][,c("VSstratification","VSstratumName","VSselectMeth",
                                "VSnumTotal","VSnumSamp","VSselProb","VSincProb")]
applyGenerateProbs
The function applyGenerateProbs
generates selection or inclusion probabilities for all selection tables of an RDBES data object in one go. Here, we avoid running the checks by setting runInitialProbChecks
to FALSE.
myH1DataObject_uptde <- applyGenerateProbs(x = myH1DataObject,
                                           probType = "inclusion",
                                           overwrite = TRUE,
                                           runInitialProbChecks = FALSE)
validateRDBESDataObject(myH1DataObject_uptde)
# display changes
myH1DataObject_uptde[["VS"]][,c("VSstratification","VSstratumName","VSselectMeth",
                                "VSnumTotal","VSnumSamp","VSselProb","VSincProb")]
unique(myH1DataObject_uptde[["FT"]][,c("VSid","FTstratification","FTstratumName",
                                       "FTselectMeth","FTnumTotal","FTnumSamp","FTselProb","FTincProb")])
unique(myH1DataObject_uptde[["FO"]][,c("FOid","FOstratification","FOstratumName",
                                       "FOselectMeth","FOnumTotal","FOnumSamp","FOselProb","FOincProb")])
unique(myH1DataObject_uptde[["BV"]][,c("BVid","BVfishId","BVselectMeth","BVnumTotal",
                                       "BVnumSamp","BVselProb","BVincProb")])
We could use probType = "selection"
to further complete the data with selection probabilities. However, the selection method in table FT is SRSWOR, so applyGenerateProbs
issues an error (see ?applyGenerateProbs for more details).
myH1DataObject_uptde <- applyGenerateProbs(x = myH1DataObject_uptde,
                                           probType = "selection",
                                           overwrite = TRUE,
                                           runInitialProbChecks = FALSE)
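One general reason why selection probabilities are awkward under SRSWOR is that the probability of being picked changes from draw to draw as units leave the pool, so a single per-unit value depends on the order of selection. Whether this is exactly why applyGenerateProbs stops here is an assumption; ?applyGenerateProbs is the authoritative source. The sketch below uses a hypothetical helper and a hypothetical sample of 3 trips from a vessel with 25 trips.

```r
# Per-draw selection probability when n units are drawn without replacement
# from N: the k-th draw chooses among the N - k + 1 units still available.
# Illustration only, not part of RDBEScore.
draw_probs_srswor <- function(N, n) 1 / (N - seq_len(n) + 1)

# Without replacement: 1/25, then 1/24, then 1/23
draw_probs_srswor(N = 25, n = 3)

# With replacement the value is the same on every draw
rep(1 / 25, 3)
```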
overwrite argument
To be completed.