c-MoosePopR and SightabilityPopR Demonstration
In SightabilityModel: Wildlife Sightability Modeling

knitr::opts_chunk$set(echo = TRUE)
options(width=200)

library(car)
library(ggplot2)
library(kableExtra)
library(plyr)
library(readxl)
library(reshape2)
library(SightabilityModel)

Availability of code

The code for this vignette is available in the GitHub repository noted in the Description file of this package.

Introduction

This document illustrates how to use the MoosePopR() and SightabilityPopR() functions to replace the MoosePop (Reed, 1989) and AerialSurvey (Unsworth, 1999) programs. The results from a 2018 survey of moose conducted according to methods described in RISC (2002) will be used for illustrative purposes.

Survey protocol

The survey area is divided in blocks (which are the primary sampling units, PSUs). Blocks could be defined using a grid-system, or based on ecological factors such as elevation, type of vegetation, etc. The blocks do not have the same area, nor do they have to rectangular. If blocks are of different size and the area of the block is not used in the analysis, no biases are introduced into the estimation procedure, but the survey design could be inefficient, i.e., the uncertainty in the estimates could be larger compared to the standard error obtained from methods that account for different block area (see later in this document).

Blocks are then stratified using a habitat model into categories of relative moose densities. In this case, the blocks are stratified into three strata corresponding to low, medium, and high density depending on past years results, vegetation cover, and other attributes. There is no limit on the number of strata that can be defined, but usually three to five blocks would be sufficient. Blocks within strata should have similar densities and strata should "as different" as possible. If the stratification is incorrect (e.g., a block is classified into the high stratum when in fact is should be in the low stratum), no biases are introduced into the estimates

A simple random sample of blocks is selected from each stratum, i.e., each block is selected with equal probability within each stratum. The number of block to select from each stratum is determined using different allocation methods. The analysis of the completed survey does not depend on how the sample size was determined in each stratum. Some selected blocks may not be surveyed due to bad weather etc. These missed blocks are assumed to be missing completely at random (MCAR) within each stratum, i.e., the missingness is unrelated to the response or any other covariate. If a missed block is not replaced by another block chosen at random from the stratum, the sample size in this stratum will be reduced. The impact of a reduction in sample size in each stratum (with increased uncertainty in the estimates for each stratum compared to a survey with no missing blocks), but no biases are introduced.

When a block is surveyed, moose are observed in groups. When a group is identified, a count of the number of moose is made separated by age (adult, juvenile, calves) and sex (bull or cow).

Not all groups of moose can be observed due to sightability issues. A previous sightability study (e.g., Quayle et al. 2001; RISC, 2002) is used to estimate the probability of detecting a group of moose based on a categorical vegetation cover variable with five classes. NOTE that sightability operates at the group level and not at the moose level. This means that if a group is sighted, then all members of the group are enumerated. But entire groups could be "invisible" due to vegetation cover.

This is a stratified random sampling design where the sampling unit is the block surveyed in each stratum and the observational unit is a group of moose. The observations on groups are pseudo-replicates and must be treated as observations within a cluster (being the block).

This document will illustrate how to estimate the total number and density of moose (in various classes) and the ratio of types of moose (e.g., bulls to cow ratio) without and with correcting for sightability. The estimates are compared to those generated by the MoosePop program (Reed, 1989; no correction for sightability) and the AerialSurvey program (Unsworth, 1999; correction for sightability).

Data input

Data input consists of several datasets placed into an Excel workbook using different worksheets for the different information.

Moose data

This consists of information on group of moose observed in a selected blocks. The data should have a header row with the variable names for each column.

# Get the actual survey data
dir(system.file("extdata", package = "SightabilityModel"))

survey.data <- readxl::read_excel(system.file("extdata", "SampleMooseSurveyData.xlsx", package = "SightabilityModel", mustWork=TRUE),
                                  sheet="BlockData",
                                  skip=1,.name_repair="universal")
# Skip =xx says to skip xx lines at the top of the worksheet. In this case skip=1 implies start reading in line 2 of
# the worksheet.
# .name_repair="universal" changes all variable names to be compatible with R, e.g., no spaces, no special characters, etc

Variable names for the counts are somewhat arbitrary. Notice how R converts the column names from the Excel workbook to to proper R variable names (.name_repair="universal" controls this conversion. These (modified) variable names will be used to identify the quantities to be estimated so simpler names are easier to use than more complicated names.

# Check the variable names in the input data
names(survey.data)

Missing values in the count fields are assume to be zeros and a zero value must be substituted for the missing values.

# convert all NA in moose counts to 0
survey.data$Bulls          [ is.na(survey.data$Bulls)         ] <- 0
survey.data$Lone.Cows      [ is.na(survey.data$Lone.Cows)     ] <- 0
survey.data$Cow.W.1...calf [ is.na(survey.data$Cow.W.1...calf)] <- 0
survey.data$Cow.W.2.calves [ is.na(survey.data$Cow.W.2.calves)] <- 0
survey.data$Lone...calf    [ is.na(survey.data$Lone...calf)   ] <- 0
survey.data$Unk.Age.Sex    [ is.na(survey.data$Unk.Age.Sex)   ] <- 0
#head(survey.data)

We check that the NMoose is computed properly by comparing the recorded total number of moose with a computed total number of moose.

# check the total moose read in vs computed number
survey.data$myNMoose <- survey.data$Bulls +
                       survey.data$Lone.Cows+
                       survey.data$Cow.W.1...calf*2+
                       survey.data$Cow.W.2.calves*3+
                       survey.data$Lone...calf+
                       survey.data$Unk.Age.Sex

ggplot(data=survey.data, aes(x=myNMoose, y=NMoose))+
  ggtitle("Compare my count vs. count on spreadsheet")+
  geom_point(position=position_jitter(h=.1,w=.1))+
  geom_abline(intercept=0, slope=1)

We count the number of observations by sub-unit and stratum to ensure that Stratum has been coded properly. For example ensure that

stratum case is consistent (e.g., h vs. H),
stratum codes are consistent (e.g., H vs. HIGH).

xtabs(~Stratum, data=survey.data, exclude=NULL, na.action=na.pass)

We can also add additional computed variables. For example, the number of cows is found as

survey.data$Cows <- survey.data$Lone.Cows + survey.data$Cow.W.1...calf + survey.data$Cow.W.2.calves

The first few records of the counts of interest are:

head(as.data.frame(survey.data[,c("Block.ID","Stratum","Bulls","Lone.Cows","Cow.W.1...calf",
                    "Cow.W.2.calves","Cows","Lone...calf","Unk.Age.Sex","NMoose")]))

Block area

The area of each surveyed block is also needed.

# Get the area of each block
survey.block.area <- readxl::read_excel(
                        system.file("extdata", "SampleMooseSurveyData.xlsx", package = "SightabilityModel", mustWork=TRUE),
                        sheet="BlockArea",
                        skip=1,.name_repair="universal")
head(survey.block.area)

The name of the variable identifying the block should match the name used in the raw data from the previous section.

names(survey.block.area)
names(survey.data)

Every block that is surveyed should have an area, but the block area file can have areas for blocks not surveyed (e.g., every block in the population of blocks). If a area is available for a block that was not surveyed, it is ignored.

# Check that every surveyed block has an area. The setdiff() should return null.
setdiff(survey.data$Block.ID, survey.block.area$Block.ID)

# It is ok if the block area file has areas for blocks not surveyed
setdiff(survey.block.area$Block.ID, survey.data$Block.ID)

Stratum information

Information about each stratum such as the total area and total number of blocks is required

stratum.data <- readxl::read_excel(
      system.file("extdata", "SampleMooseSurveyData.xlsx", package = "SightabilityModel", mustWork=TRUE),
      sheet="Stratum",
      skip=2,.name_repair="universal")
stratum.data

Ensure that the stratum labels match those used in the raw data

Sightability model.

A logistic regression comparing observed groups of moose to known groups of moose (e.g., based on radio collared moose) was used to estimate the sightability model (e.g., Quayle, 2001).

Extracting from AerialSurvey

The sightability model used was extracted from the AerialSurvey (Unsworth, 1999) program input files and entered on the worksheet in the Excel workbook. The AerialSurvey program uses two files to describe the survey (and computations) and the model for the sightability.

The Moose.mdf is a text-file that describes the model used for sightability corrections. A portion of the file is shown below

;==============================================================================
; MOOSE.MDF - Model description file (MDF) for Aerial Survey - 7-Oct-94
; -comments start with a semicolon (any text after it will be ignored)
; -blank value fields (right hand side) will default to nil, zero, or no!
;==============================================================================
[Model]
Title    = "Moose, Bell JetRanger"
Subtitle = '<Uses only the % veg cover>' ; optional
Species  = Moose
Aircraft = "Bell JetRanger"
Locale   = xxxxxx
Version  = 1.00
Terms    = 2           ; number of terms in the model, max = 10
Date     = 04.11.00

This appears to be the model for Bell JetRanger aircraft for moose. The model for sightability appears to be function only of vegetation cover with two term (intercept and vegetation cover class).

Vegetation cover appears to be divided into 5 classes:

[Transformations]
;Type of transformations can be:
;  Class, LookupT1, LookupT2, LookupT3, LookupT4, Table1, Table2, Power, None
Number = 1  ; number of transformations, max = 5
; for each transformation, give the name/index of the variable,
; and the type of transformation required on the variable
; TransformX = variable, type
Transform1 = VegCover, Class

[Class]  ; Class - e.g., vegetation cover classes into intervals
; for each interval give lower and upper limits and the assigned value
Intervals = 6  ; number of class intervals, max = 9
; ClassX = >lower, <=upper, value - ascending, sequential order
Class1 = -0.1,  20.0, 1.0
Class2 = 20.0,  40.0, 2.0
Class3 = 40.0,  60.0, 3.0
Class4 = 60.0,  80.0, 4.0
Class5 = 80.0,  100.0, 5.0

For example, if vegetation cover is from 0 to 20%, this is class 1; from 20%+ to 40% is class 2; etc.

The file also contains the estimated model coefficients and estimated covariance matrix for the sightability model..

; Map input variables into X's
[X] ; label/index from a list of variables
X01 = 1.0       ; constant
X02 = VegCover  ; class transformation

;------------------------------------------------------------------------------
; Model coefficients
;------------------------------------------------------------------------------
[Beta]   ; beta vector
Beta01 =  4.9604    ; constant
Beta02 = -1.8437    ; vegetation cover

[Sigma]  ; sigma matrix
Sigma01 =  0.7822
Sigma02 = -0.2820,  0.1115

The estimated probability of sighting a moose is $$logit(p)=4.9604 -1.8437(vegetation~class)$$ Notice that vegetation class is treated as continuous variable and used directly with values of 1 to 5 rather than as 5 indicator variables. This implies that the $logit(p)$ increases linearly at the same rate as the vegetation cover class goes from 1 to 2 or from 2 to 3, etc. In some cases, sightability may depend on a categorical variable (e.g., north or south aspect). Then a series of indicator variables will have to be created, e.g AspectNorth takes the value of 1 if the aspect is north and the value of 0 if the aspect is south. If the categorical variable has $K$ classes, you will need $K-1$ indicator variables. Contact me for help on this issue.

Reading coefficients of the model

The information from the AerialSurvey (Unsworth, 1999) program was transferred to the Excel workbook containing the survey information for this study.

The estimated coefficients of the model and the covariance matrix (sigma matrix) of the estimated coefficients are input

# number of beta coefficients
nbeta <- readxl::read_excel(
   system.file("extdata", "SampleMooseSurveyData.xlsx", package = "SightabilityModel", mustWork=TRUE),
   sheet="SightabilityModel",
   range="B2", col_names=FALSE)[1,1,drop=TRUE]
# Here Range="B2" refers to a SINGLE cell (B2) on the spreadsheet

# extract the names of the terms of the model
beta.terms <- unlist(readxl::read_excel(
     system.file("extdata", "SampleMooseSurveyData.xlsx", package = "SightabilityModel", mustWork=TRUE),
     sheet="SightabilityModel",
     range=paste0("B3:",letters[1+nbeta],"3"), 
     col_names=FALSE, col_type="text")[1,,drop=TRUE], use.names=FALSE)
# Here the range is B3: C3 in the case of nbeta=2 etc
cat("Names of the variables used in the sightability model:", beta.terms, "\n")

# extract the beta coefficients
beta <- unlist(readxl::read_excel(
    system.file("extdata", "SampleMooseSurveyData.xlsx", package = "SightabilityModel", mustWork=TRUE),
    sheet="SightabilityModel",
    range=paste0("B4:",letters[1+nbeta],"4"), 
    col_names=FALSE, col_type="numeric")[1,,drop=TRUE], use.names=FALSE)
# Here the range is B4: C4 in the case of nbeta=2 etc
cat("Beta coefficients used in the sightability model:", beta, "\n")

# extract the beta covariance matrix
beta.cov <- matrix(unlist(readxl::read_excel(
    system.file("extdata", "SampleMooseSurveyData.xlsx", package = "SightabilityModel", mustWork=TRUE),
    sheet="SightabilityModel",
    range=paste0("B5:",letters[1+nbeta],4+nbeta), 
    col_names=FALSE, col_type="numeric"), use.names=FALSE), ncol=nbeta, nrow=nbeta)
cat("Beta covariance matrix used in the sightability model:\n")
beta.cov

These values match those in the AerialSurvey (Unsworth, 1999) package.

The estimated sightability and expansion factors at the group level (inverse of sightability) are:

Cover <- data.frame(intercept=1, VegCoverClass=1:5)
Cover$logit.p <- as.matrix(Cover[,1:nbeta ]) %*% beta
Cover$logit.p.se <- diag(sqrt( as.matrix(Cover[,1:nbeta ]) %*% beta.cov %*% t(as.matrix(Cover[,1:nbeta ]))))
Cover$p       <- 1/(1+exp(-Cover$logit.p))
Cover$p.se    <- Cover$logit.p.se * Cover$p * (1-Cover$p)
Cover$expansion <- 1/Cover$p
Cover$expansion.se <- Cover$p.se / Cover$p^2

temp <- Cover[, -1]
temp[,2:7]<- round(temp[,2:7],3)
#temp

kable(temp, row.names=FALSE,
      caption="Estimated sightability of groups and expansion by Vegetation Cover Class",
      col.names=c("Vegetation Cover Class","logit(p)","SE logit(p)","p","SE(p)","Expansion Factor","SE(EF)")) %>%
      column_spec(column=c(1:7),         width="2cm") %>%
      kable_styling("bordered",position = "center", full_width=FALSE)

So a single moose sighted in Vegetation Cover Class 5 is expanded to almost 72 moose (!).

Creation of VegCoverClass variable

We need to compute a VegCoverClass variable in the survey data. The recode() function from the car package is used.

# convert the Veg Cover to Veg.Cover.Class

xtabs(~..Veg.Cover, data=survey.data, exclude=NULL, na.action=na.pass)
survey.data$VegCoverClass <- car::recode(survey.data$..Veg.Cover,
                                " lo:20=1; 20:40=2; 40:60=3; 60:80=4; 80:100=5")
xtabs(~VegCoverClass+..Veg.Cover, data=survey.data, exclude=NULL, na.action=na.pass)

Notice that there are 8 groups of moose where the Vegetation Cover was not recorded. This has implications for estimation as noted later. Generally speaking, you should not simply delete these observations because this will introduce substantial bias in the estimates. Rather, impute a vegetation cover class that gives the highest sightability for these groups to give a lower bound on the number of moose in this block. This will reduce (but not eliminate) the bias caused by simply dropping these observations.

Key variables.

There are several key variables used to match records across the raw data, block area, and stratum data frames. The default variable names are:

Block.ID representing the block identification number. The Block.ID should not be duplicated in different strata. It can be numeric or characters.
Block.Area representing the area of each surveyed block. This should be numeric.
Stratum representing the stratum identification. It can be character or numeric does not have to be a single character.
Stratum.Blocks representing the total number of blocks in each stratum.
Stratum.Area representing the total area of each stratum in the same units as the Block.Area.

The analysis function has arguments that can be changed if the names of these key variables differ from above.

Dealing with missing and zero values

Group information

For each group sighted, the number of animals is recorded and may be divided by age and/or sex groupings. If no animals of a particular sex/age are seen in a group, then it should be recorded as zero rather than as missing.

In some cases, sex/age information is not available for an animal. It should be counted in a separate category for unknown sexed/aged animals. Of course the, a naive application of estimating the number of bulls may be biased downwards is some bulls are recorded in the unknown sexed/aged category; however, estimates of the total number moose will be fine.

No missing values are allowed in any count.

Covariate information

The sightability models use covariates at the group level to adjust for detection less than 100%. No missing values are allowed for the covariates. If the covariates are not available, then you should not simply delete the group, but rather impute a value of the covariate that gives the highest sightability to form a lower bound on the number of moose in the block because this particular block is not fully corrected for sightability.

There are two sources of bias that could occur when you drop a group with missing sightability covariate values:

If you delete the group and the block is now empty (and so vanishes), then your sample size (number of blocks surveyed) and area surveyed are too small and your total observed animals is too small. The bias could be positive or negative depending if the deleted groups are large or small.
If the number of blocks does not change (i.e., deleting this group does make the block empty), then deleting the group makes the total count for this block too small. If you keep the block with sightability set to 1, you partially correct against dropping the block, because rather than having a block correction factor of 2 (e.g., some lack of sightability due to high vegetation cover), you are using a sightability correction factor of 1.

Trying to predict the combined effect of these sources of bias is not easy. The effect of the second item is to partially correct the abundance/density estimate compared to dropping the block.

See the example in the Sightability section below on dealing with missing covariates.

Blocks with no observed groups.

It is possible that a block will be surveyed and NO animals seen in the entire block. In this case, you should insert a "dummy" observation for this block with counts of 0 for the number of moose to ensure that this surveyed block (with no animals seen) is properly accounted for. You can set the sightability variables to any value, because $0 \times \textrm{sightability correction}$ is still 0.

Block information

Each block must have a known area. Missing values are not allowed.

A single block in a stratum

If a stratum has only a single block surveyed, then no design based estimate of variance is possible. This is known as the lonely primarily sampling unit (lonely PSU). Ad hoc adjustments are possible; refer to http://r-survey.r-forge.r-project.org/survey/exmample-lonely.html for some suggestions.

The MoosePopR() function has an argument that can be set using

MoosePopR(...,  survey.lonely.psu="remove")

The the variance for this stratum is not considered when estimating the variance of the grand totals or density. Unfortunately, it is not possible, at this time, to estimate a ratio with some strata having lonely PSU.

The SightabilityPopR() function can deal with lonely PSUs by using within block variability (e.g., among groups) so this will provide a lower bound to the true uncertainty.

The bottom line is avoid designs where you sample only a single block in a stratum!

Stratum information

Each stratum must have the known total area for the MoosePop() function and the known total number of blocks for the SightabilityPopR() function.

Analysis

Analysis not adjusting for sightability

Within a stratum

Density

Suppose a sample of $n_h$ blocks were surveyed in stratum $h$. For each block, the number of moose $m_{hi}$ and the area of the block $a_{hi}$ are obtained for block $i$ in stratum $h$. The estimated density of moose per unit area in stratum $h$ is obtained as: $$\widehat{D}h = \frac{\sum_i m{hi}}{\sum_i a_{hi}}$$ i.e., as the total number of moose in the surveyed blocks divided by the total survey area. This is a standard ratio estimator and the estimated standard error is found as using equation 6.9 of Cochran (1977). The estimator and its standard error can be automatically computed using the svyratio() function in the survey package (Lumley, 2019) of R.

Similar formula are used for estimating the density of other sex/age categories.

Abundance

The estimated total number of moose in stratum $h$ ($\widehat{M}_h$) is found by multiplying the density estimate and its standard error by the stratum area $A_h$

$$\widehat{M}_h = \widehat{D}_h \times A_h$$ $$se(\widehat{M}_h) = se(\widehat{D}_h) \times A_h$$ Similar formula are used for estimating the abundance of other sex/age categories.

Bull to Cow and other ratios

Suppose a sample of $n_h$ blocks were surveyed in stratum $h$. For each block, the number of bulls $b_{hi}$ and the number of cows $c_{hi}$ are obtained for block $i$ in stratum $h$. The estimated bull to cow ratio in stratum $h$ is obtained as: $$\widehat{BC}h = \frac{\sum_i b{hi}}{\sum_i c_{hi}}$$ i.e., as the total number of bulls ($\sum_i b_{hi}$) in the surveyed blocks divided by the total number of cows ($\sum_i c_{hi}$) in the surveyed blocks. This is a standard ratio estimator and the estimated standard error is found as using equation 6.9 of Cochran (1977). The estimator and its standard error are automatically computed using the svyratio() function in the survey package (Lumley, 2019) of R.

If the number of bulls per 100 cows is wanted, simply multiply the estimate and its standard error by $100\times$.

Other ratios such as calves:cows are computed in similar ways.

Across all strata

Density

The overall density for the number of moose is found as a weighted averaged of the stratum specific densities: $$\widehat{D}= \sum_h \frac{A_h}{\sum_h A_h}\widehat{D}_h$$

$$se(\widehat{D}) = \sqrt{\sum_h \left( \frac{A_h}{\sum_h A_h} \right)^2se(\widehat{D}_h)^2}$$

which is equivalent to the estimated total abundance (see below) divided by the total stratum area. This is known as a separate-ratio estimator of the total and is possible because the expansion factor going from density to abundance (the stratum area) is known for each stratum.

Similar formula are used for estimating the density of other sex/age categories.

Abundance

The total abundance of moose is found by summing the estimated abundances across strata:

$$\widehat{M} = \sum_h \widehat{M}_h$$ and the standard error of the overall abundance is found as: $$se(\widehat{M}) = \sqrt{\sum_h se(\widehat{M}_h)^2}$$ This is known as a separate-ratio estimator of the total and is possible because the expansion factor going from density to abundance (the stratum area) is known for each stratum.

Similar formula are used for estimating the abundance of other sex/age categories.

Bull to Cow and other ratios

The estimation of the overall bull to cow ratio is computed using the double-ratio estimate (Cochran, 1977, Section 6.19) as $$\widehat{BC}_{\textit{double~ratio}}=\frac{\widehat{B}}{\widehat{C}}$$ where $\widehat{B}$ and $\widehat{C}$ are the estimated total number of bull and cows individually computed. This is equivalent to the ratio of the overall density of bulls and cows. The above formula can be expanded to give

$$\widehat{BC}{\textit{double~ratio}}=\frac{\sum_h \frac{\sum_i b{hi}}{\sum_i a_{hi}} \times A_h} {\sum_h \frac{\sum_i c_{hi}}{\sum_i a_{hi}} \times A_h}$$

Its standard error is NOT directly available in the survey package but is readily computed by finding the estimates and the covariance of the estimates of number of bull and cows for each stratum and then using the delta method to find the overall ratio and standard error of the overall ratio.

The MoosePopR() function.

A MoosePopR() function was created in R to replicate the results from MoosePop (Reed, 1989). The key arguments to this function are:

survey.data - information on each group measured in the survey. This was read from the Excel workbook earlier.
survey.block.area - information about the area of each block surveyed. This was read from the Excel workbook earlier.
stratum.data - information about each stratum. This was read from the Excel workbook earlier.
density, abundance, numerator, denominator - variables for which the density, abundance, ratio are to be formed as right-sided formula (e.g., density=~Cows to estimate the density of Cows).
conf.level=0.90 indicates the confidence level for confidence intervals to compute with the default being a 90% confidence interval.

Sample Analyses

Results differ slightly from those reported on an Excel spreadsheet because of slight differences in areas of blocks between data provided to me and that used in MoosePop (Reed, 1989).

We are now ready to do the analysis. I've gathered all of the analysis results into a list object for extraction later.

The UC suffix that I have placed on the estimate name indicates it is not corrected for sightability.

I have only computed abundance and density estimates for all moose, but similar code can be used for other age/sex categories. Similarly, I have only considered the bull:cow ratio and other ratio can be computed in similar fashion.

Density of all moose

We compute the (uncorrected for sightability) density of all moose (regardless of sex or age) using the number of moose (NMoose) variable.

result <- NULL
result$"All.Density.UC" <- MoosePopR(
      survey.data       = survey.data         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,density=~NMoose                         # which density
)
result$"All.Density.UC"

The estimate and SE column as the estimated (uncorrected for sightability) density of all moose with a measure of uncertainty for each stratum. The overall density is found as a weighted average (by stratum area) of the density in each stratum.

Similar function calls are used for estimating the density of other sex/age categories by changing the density argument.

Abundance of all moose

We compute the estimated (uncorrected for sightability) abundance of all moose (regardless of sex or age) by expanding the density by the Stratum.Area. The overall abundance is found as the sum of the expanded estimates.

result$"All.Abund.UC" <- MoosePopR(
      survey.data       = survey.data         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,abundance=~NMoose                      # which density
)
result$"All.Abund.UC"

The estimate and SE column as the estimated (uncorrected for sightability) density of all moose with a measure of uncertainty for each stratum. The overall abundance is found as the sum of the estimates in each stratum.

Similar function calls are used for estimating the abundance of other sex/age categories by changing the density argument.

Number of bulls and number of cows

We estimate the (uncorrected for sightability) abundance of Bulls and Cows.

result$"Bulls.Abund.UC" <- MoosePopR(
      survey.data       = survey.data         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,abundance=~Bulls                      # which density
)
result$"Bulls.Abund.UC" 

result$"Cows.Abund.UC" <- MoosePopR(
      survey.data       = survey.data         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,abundance=~Cows                      # which density
)  
result$"Cows.Abund.UC"

These values are computed similarly as the total abundance.

Bull to Cow and other ratios

Finally, we estimate the Bulls:Cows ratio:

result$"Bulls/Cow.UC" <- MoosePopR(
      survey.data       = survey.data         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,numerator=~Bulls, denominator=~Cows    # which density
) 
result$"Bulls/Cow.UC"

Other ratios would be estimated in similar ways by changing the numerator and denominator arguments to the function call.

Final object

names(result)

Adjusting for sightability

The above computations are not adjusted for sightability, i.e., not all moose can be detected due to visibility concerns.

An adjustment for sightability can be done by also performing a study where a known number of groups of moose (e.g., radio collared) are surveyed and information about detection/non detection is recorded at the group level. A logistic regression (see previous sections) is used to estimate the probability of detection of a group given covariates (such as Vegetation Class).

Then the sightability model is used to expand the observed number of groups of moose by the a sightability correction factor (SCF). This is related to the inverse of the probability of detection, but includes a correction to account for the uncertainty in the estimates of the coefficients of the sightability model (see Fieberg, 2012, Equation 1 and later in this document).

For example, consider a model to estimate detectability given in Quayle et al. (2001):

sightability.model <- ~VegCoverClass
sightability.beta  <-  c(4.2138, -1.5847)
sightability.beta.cov <- matrix(c(0.78216336,   -0.282, -0.282, 0.11148921), nrow=2, ncol=2)


sightability.table <- data.frame(VegCoverClass=1:5,
                                 VegPercent=c("00-20","21-40","41-60","61-80","81-100"))
sightability.table$detect.prob <- SightabilityModel::compute.detect.prob(sightability.table, 
                                                      sightability.model, 
                                                      sightability.beta, 
                                                      sightability.beta.cov)

sightability.table$SCF <- SightabilityModel::compute.SCF(sightability.table, 
                                                      sightability.model, 
                                                      sightability.beta, 
                                                      sightability.beta.cov)


kable(sightability.table, row.names=FALSE,
      caption="Estimated sightability correction factor for each vegetation cover class",
      col.names=c("Veg Cover Class","Veg Cover %","Detection probability","Sightability Correction Factor"),
      digits=c(0,0, 3,2)) %>%
      kable_styling("bordered",position = "center", full_width=FALSE, latex_options = "HOLD_position")

Notice that that, for example, that 1/r sightability.table$prob.detect[2]=r 1/sightability.table$prob.detect[2] is only approximately equal to r sightability.table$SCF[2].

It is assumed that all animals in a group (e.g., bulls, cows, calves) are observed if a group is detected.

Let $\theta_{hjk}$ represent the expansion factor for group $k$ in block $j$ of stratum $h$. In the analyses that follow the block area is NOT used in the analysis. In its place, a simple expansion based on the number of blocks surveyed within a stratum are used. This will be less efficient (i.e., have a larger standard error) compared to an analysis that uses the individual block areas, but unless there is a high correlation between the number of animals and the block area, the loss of efficiency is small. If all blocks had the same area, the two approaches are identical.

We examined the correlation for the existing survey data and it is typically small. Cochran (1977) shows that unless the correlation is larger than a specified cutoff that depends on the coefficient of variation of the numerator and denominator variables, then there is no gain in using the block area, i.e., unless $$\rho > \frac{1}{2}\frac{SD(denominator~variable)/mean(denominator~variable)}{SD(numerator~variable)/mean(numerator~variable)} $$ where $\rho$ is the correlation between the numerator and denominator variable (i.e., between abundance of the numerator variable and block area)$, then a ratio estimator using the area of blocks is not more efficient.

# Look at correlations between total number of moose, bulls, and cows and block area
survey.block.data <- plyr::ddply(survey.data, 
                                 c("Block.ID","Stratum"),
                                 plyr::summarize,
                                 Bulls          = sum(Bulls,           na.rm=TRUE),
                                 Lone.cows      = sum(Lone.Cows,       na.rm=TRUE),
                                 Cow.W.1...calf = sum(Cow.W.1...calf,  na.rm=TRUE),
                                 Cow.W.2.calves = sum(Cow.W.2.calves,  na.rm=TRUE),
                                 Lone...calf    = sum(Lone...calf,     na.rm=TRUE),
                                 Unk.Age.Sex    = sum(Unk.Age.Sex,     na.rm=TRUE),
                                 Cows           = sum(Cows,            na.rm=TRUE),
                                 NMoose         = sum(NMoose,          na.rm=TRUE))
# add the area to the block totals
survey.block.data <- merge(survey.block.data, survey.block.area, all.x=TRUE)

# What is correlation between block area and number of moose etc
survey.block.data.melt <- reshape2::melt(survey.block.data,
                        id.vars=c("Stratum","Block.ID","Block.Area"),
                        measure.vars=c("Bulls","Lone.cows","Cow.W.1...calf","Cow.W.2.calves",
                                       "Lone...calf","Unk.Age.Sex","Cows","NMoose"),
                        variable.name="Classification",
                        value.name="N.Animals")

# find correlation between number of animals and area
res <- plyr::ddply(survey.block.data.melt, c("Stratum","Classification"), plyr::summarize,
                   corr=cor(N.Animals, Block.Area),
                   cv.N.Animals=sd(N.Animals)/mean(N.Animals),
                   cv.Area     =sd(Block.Area)/mean(Block.Area),
                   cut.off      = cv.Area/2/cv.N.Animals)

temp <- res[,c(1,2,3,6)]
temp[,3:4] <- round(temp[,3:4],2)
kable(temp, row.names=FALSE,
      caption="Estimated correlation between animal counts and block area",
      col.names=c("Stratum","Type of Animal","Corr","Cutoff")) %>%
      column_spec(column=c(1:2),         width="3cm") %>%
      column_spec(column=c(3:4),         width="1cm") %>%
      kable_styling("bordered",position = "center", full_width=FALSE) %>%
      footnote(threeparttable=TRUE,
               general=paste0("Cutoff is 0.5 ratio of sd/mean of area and abundance of each type of animal",
                              " and unless the correlation exceeds the cutoff, there is no advantage",
                              " in using the block area in the analysis"))

The correlations are not high. While about half of the response variables have correlations of abundance with block area that exceed the cutoff, the variation in the block areas is not that large, so we don't think that also adjusting for block area is necessary. Of course, this also does not account for sightability issues.

A plot of the two variables for each types of animals shows no strong relationship between the number of observed animals and the block area:

ggplot(data=survey.block.data.melt, aes(x=Block.Area, y=N.Animals, color=Stratum))+
  geom_point()+
  facet_wrap(~Classification, ncol=3, scales="free")+
  theme(legend.position=c(1,0), legend.justification=c(1,0))

Consequently, it not expected that ignoring block area in the analysis will have a material effect.

Within a stratum

Abundance

Suppose a sample of $n_h$ blocks were surveyed in stratum $h$ out of $N_h$ total number of blocks. For each block, the number of moose $m_{hjk}$ are obtained for group $k$ in block $j$ in stratum $h$. The estimated correction factor (for visibility of the group) is $\widehat{\theta}{hjk}$ which is computed give the vector of covariates (x{hjk}) as (see page 4 of Fieberg, 2012)

$$\widehat{\theta}{hjk}= 1+\exp{(-x{hjk}^t \widehat{\beta}-x_{hjk}^t \Sigma x_{hjk} / 2)}$$

Notice that the sightability correction factor is not simply $1/\textit{probability of detection}$ or $1+\exp{(-x_{hjk}^t \widehat{\beta}$ because it includes a correction factor to account for the non-linear transformation from the logit to the probability scale.

The estimated abundance of moose in stratum $h$ is obtained as: $$\widehat{M}h = N_h \times \frac{\sum{jk} m_{hjk}\widehat{\theta}_{hjk}}{n_h}$$ i.e., as the mean number of (sightability corrected) moose per surveyed block times the total number of blocks. The standard error of this estimate is complicated because of the need to account for the estimated sightability but the theory is covered in Steinhorst and Samuel (1989), Fieberg (2012) and Wong (1996) and implemented in the AerialSurvey (Unsworth, 1999) and SightabilityModel packages of R. The latter package corrects the variance computations used by Steinhorst and Samuel (1989) and Fieberg (2012).

Similar computations are used for estimating the abundance of other sex/age categories.

Density

The estimated density of moose in stratum $h$ ($\widehat{D}_h$) is found by dividing the abundance estimate and its standard error by the stratum area $A_h$

$$\widehat{D}_h = \frac{\widehat{M{_h}}}{A_h} $$ $$se(\widehat{M}_h) = \frac{se(\widehat{M}_h)}{A_h}$$

Similar computations are used for estimating the density of other sex/age categories.

Bull to Cow and other ratios

The ratio of bulls to cows within a stratum is computed as the estimated abundances in each stratum: $$\widehat{BC}_h = \frac{\widehat{Bulls}_h}{\widehat{Cows}_h}$$ i.e., as the estimated total number of bulls in the stratum divided by the estimated total number of cows in the stratum. This ratio estimator is not available in the SightabilityModel package (Fieberg, 2012), but a new function was written and added to the package to add this functionality. The standard error is difficult to compute because of the need to account for the correlation between the number of bulls and moose in individual groups and the the uncertainty of the correction factors but is given in Wong (1996).

Note that if the sightability is equal for all groups, then this common sightability correction factor would cancel everywhere and you would get the same results as the ratio estimator based on the observed groups only. In most cases, the impact of sightability will be small on the ratio estimator and its standard error.

If the number of bulls per 100 cows is wanted, simply multiply the estimate and its standard error by $100\times$.

Similar computations occur for other ratios such as calves:cows.

Across all strata

Abundance

The total abundance of moose is found by summing the estimated abundances across strata:

$$\widehat{M} = \sum_h \widehat{M}_h$$ The standard error is difficult to compute because the same sightability model is used across strata. This has been implemented in the SightabilityModel package (Fieberg, 2012) and the AerialSurvey (Unsworth, 1999) program.

Similar computations are used for estimating the abundance of other sex/age categories.

Density

The overall density of moose and its standard error is found as estimated total abundance and its standard error divided by the total area over all strata.. $$\widehat{D}= \frac{ \widehat{M}}{\sum_h A_h}$$ $$se(\widehat{D}) = \frac{se(\widehat{M})}{{\sum_h A_h}}$$ Because the total area of all strata is known quantity, this is straightforward.

Similar computations are used for estimating the density of other sex/age categories.

Bull to Cow and other ratios

The estimation of the overall bull to cow ratio is computed as the ratios of their respective estimated abundances $$\widehat{BC}=\frac{\widehat{Bulls}}{\widehat{Cows}}$$ This ratio estimator was not available in the SightabilityModel package (Fieberg, 2012), but its functionality has been added as part of this contract. Standard errors are computed as described in Wong (1996).

Similar computations can be used for other ratios such as calves:cows.

The SightabilityPopR() function

I have created a function with the same calling arguments as the MoosePopR() function to estimate abundance, density, and ratios using the SightabilityModel package. Notice that additional functionality was added to the SightabilityModel package (Fieberg, 2012) to estimate ratios.

A SightabilityPopR() function has a similar calling sequence as the MoosePopR function with the same key input variables.

survey.data - information on each group measured in the survey. This was read from the Excel workbook earlier.
survey.block.area - information about the area of each block surveyed. This was read from the Excel workbook earlier.
stratum.data - information about each stratum. This was read from the Excel workbook earlier.
density, abundance, numerator, denominator - variables for which the density, abundance, ratio are to be formed as right-sided formula (e.g., density=~Cows to estimate the density of Cows).
conf.level=0.90 indicates the confidence level for confidence intervals to compute with the default being a 90% confidence interval.

Additionally, the sightability formula, the beta coefficients and the estimated covariance matrix must also be specified using the arguments:

sight.formula - the formula for the sightability model. This should use the variables in the survey.data frame
sight.beta - the beta coefficients for the sightability model
sight.beta.cov - the covariance matrix for the beta coefficients
sight.logCI - should confidence intervals for abundance be computed using a logarithmic transformation
sight.var.method - what method (either "Wong" or "SS") should be used to estimate the variances . using the methods of Wong (1996) or Steinhorst and Samuels (2012). The default (and preferred) method is "Wong" because she corrects the variance computations used by Steinhorst and Samuel (1989) and Fieberg (2012).

These additional parameters are passed to the appropriate functions in the SightabilityModel package (Fieberg, 2012). Refer to the examples for details.

Sample Analyses

Dealing with missing values in the sightability covariates

Because we are correcting for sightability, we need the VegCoverClass for all groups. However there are some groups of moose with missing data for this covariate:

select <- is.na(survey.data$VegCoverClass)
survey.data[select,]

If you simply delete these observations, then estimates can be severely biased because these groups would be excluded from the expansions when corrected for sightability. About the best that can be done is to assign an imputed vegetation class to these observations with the highest sightability to reduce the bias.

# change the missing values to cover class with highest sightability
survey.data$VegCoverClass[select] <- 1
survey.data[select,]

We are now ready to do the analysis. I've gathered all of the analysis results into a list object for extraction later.

The C suffix that I have placed on the estimate name indicates it is corrected for sightability.

Abundance of all moose

We compute the (corrected for sightability) abundance of all moose (regardless of sex or age) using the number of moose (NMoose) variable.

result <- NULL
result$"All.Abund.C" <- SightabilityPopR(
      survey.data        = survey.data         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,abundance=~NMoose                      # quantifies to estimate
      ,sight.formula     = observed ~ VegCoverClass # sightability mode;
      ,sight.beta        = beta                     # the beta coefficients for the sightability model
      ,sight.beta.cov    = beta.cov                 # the covariance matrix for the beta coefficients
      ,sight.var.method  = "Wong"                   # method  used to estimate the variances 
)
temp <- result$"All.Abund.C"
temp[,-c(1:3,8)] <- round(temp[,-c(1:3,8)]+.00001,0)  # force to be numeric
temp

The estimate and SE column as the estimated (corrected for sightability) density of all moose with a measure of uncertainty for each stratum. The overall abundance is found as the sum of the estimates in each stratum. The right most columns are information about the sources of uncertainty (due to sampling, sightability, or the model for sightability).

Similar function calls are used for estimating the abundance of other sex/age categories by changing the abundance argument.

We can show the impact of simply deleting those values missing the VegCoverClass covariate by setting the observed values of moose to .001 for these groups. This will retain the correct number of blocks surveyed. If you simply deleted those groups of moose with missing VegCoverClass values and a block then had no observations, the block would "disappear" from the analysis which is not correct.

# mimic impact of deleting observations with missing VegCoverClass
survey.data2 <- survey.data
survey.data2$NMoose[select] <- .001
wrong.way <- SightabilityPopR( # show impact of deleting observations but keeping # blocks the same
      survey.data        = survey.data2         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,abundance=~NMoose                      # quantifies to estimate
      ,sight.formula     = observed ~ VegCoverClass # sightability mode;
      ,sight.beta        = beta                     # the beta coefficients for the sightability model
      ,sight.beta.cov    = beta.cov                 # the covariance matrix for the beta coefficients
      ,sight.var.method  = "Wong"                   # method  used to estimate the variances 
)
wrong.way

Notice the negative bias in the estimated abundance computed the incorrect way (r round(wrong.way[wrong.way$Stratum==".OVERALL","estimate"])) vs. the value when the VegCoverClass is set to the class with highest sightability (r round(result$"All.Abund.C"[result$"All.Abund.C"$Stratum==".OVERALL","estimate"])) computed above.

In the real world, if a group has missing values for covariates using in the sightability model, do not change the count data, but change the missing covariates to values that result in the highest possible sightability for this group. In this way, the number of blocks surveyed is computed properly and you have partially the bias in dropping these groups by including them with an imputed high sightability value.

Density of all moose

We compute the estimated (corrected for sightability) density of all moose (regardless of sex or age).

result$"All.Density.C" <- SightabilityPopR(
      survey.data       = survey.data         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,density=~NMoose                      # quantifies to estimate
      ,sight.formula     = observed ~ VegCoverClass # sightability mode;
      ,sight.beta        = beta                     # the beta coefficients for the sightability model
      ,sight.beta.cov    = beta.cov                 # the covariance matrix for the beta coefficients
      ,sight.var.method  = "Wong"                   # method  used to estimate the variances 
)
temp<- result$"All.Density.C"
temp[,-(1:3)] <- round(temp[,-(1:3)]+.00001,2)  # force to be numeric
temp

The estimate and SE column as the estimated (corrected for sightability) density of all moose with a measure of uncertainty for each stratum. The overall density is found as a estimated abundance/total area over all strata. The variance components are the same for the total abundance because these values are used to derive the density.

Similar function calls are used for estimating the density of other sex/age categories by changing the density argument.

Number of bulls and number of cows

We estimate the (corrected for sightability) abundance of Bulls and cows.

result$"Bulls.Abund.C" <- result$"All.Density.C" <- SightabilityPopR(
      survey.data       = survey.data         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,density=~Bulls                      # quantifies to estimate
      ,sight.formula     = observed ~ VegCoverClass # sightability mode;
      ,sight.beta        = beta                     # the beta coefficients for the sightability model
      ,sight.beta.cov    = beta.cov                 # the covariance matrix for the beta coefficients
      ,sight.var.method  = "Wong"                   # method  used to estimate the variances 
)
result$"Bulls.Abund.C"

result$"Cows.Abund.C" <- SightabilityPopR(
      survey.data       = survey.data         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,density=~Cows                      # quantifies to estimate
      ,sight.formula     = observed ~ VegCoverClass # sightability mode;
      ,sight.beta        = beta                     # the beta coefficients for the sightability model
      ,sight.beta.cov    = beta.cov                 # the covariance matrix for the beta coefficients
      ,sight.var.method  = "Wong"                   # method  used to estimate the variances 
)
result$"Cows.Abund.C"

These values are computed similarly as the total abundance.

Bull to Cow and other ratios

Finally we estimates the Bulls/Cow ratio using the corrected for sightability values.

result$"Bulls/Cow.C" <- SightabilityPopR(
      survey.data       = survey.data         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,numerator=~Bulls, denominator=~Cows                      # quantifies to estimate
      ,sight.formula     = observed ~ VegCoverClass # sightability mode;
      ,sight.beta        = beta                     # the beta coefficients for the sightability model
      ,sight.beta.cov    = beta.cov                 # the covariance matrix for the beta coefficients
      ,sight.var.method  = "Wong"                   # method  used to estimate the variances 
)

temp<- result$"Bulls/Cow.C"
temp[,-(1:3)] <- round(temp[,-(1:3)]+.00001,3)  # force to be numeric
temp

The ratio is computed within each stratum and overall The sources of variation computations are set to missing as they are not sensible in this case.

Notice that this ratio estimator is not that different from that computed from values uncorrected for sightability. This is a consequence of the approximately equal sightability correction factor for all groups.

Other ratios can be computed by changing the numerator and denominator arguments in the function call.

Final object

The result is a list object with one element each of the analyses done above.

names(result)

Domain estimation

Domain estimation refer to estimating density, abundance, or ratios for subsets of the survey area. For example, in British Columbia, a survey may take place over a combined Buckley Valley and Lake District (sub units) survey area, but estimates of abundance are also wanted for Buckley Valley and Lake District individually. In this case, the sub-units are domains.

There are four cases to consider depending if information about the primary sampling units (PSU) (i.e., blocks) and stratum areas are known or unknown at the domain level $\times$ the domain separation operates at the PSU (block) or group level.

Domain at PSU (block) level; domain population information known

This is the simplest case. Here the primary sampling units (PSU) (or blocks) are classified as belonging to the domain or not. For example, the study area could be divided into two or more management units and each PSU (block) belongs to one and only one management units in its entirety. The total area of the management unit and the total number of potential PSUs is known for each management unit.

Because PSUs (blocks) were randomly selected within each stratum with equal probability, they are also randomly selected from each domain (management unit).

Consequently, just subset the survey data (observed animals) to those PSUs (blocks) belonging to the domain of interest and change the stratum information to refer to the total area and total PSUs (block) for that domain and estimation for density, abundance, and ratios occurs exactly as before.

Example

For example, the survey data used in the examples was classified in Domain $A$ and $B$, and a separate worksheet in the workbook has information about stratum information for Domain $A$. We subset the survey data, read in the stratum information for Domain $A$ and then use the same methods as shown before.

The PSUs (blocks) in the example are split between two domains (A and B) as shown below:

# PSU's are split between the domains
xtabs(~Domain+Block.ID, data=survey.data, exclude=NULL, na.action=na.pass)

The number of groups of moose in each block is shown.

We subset the survey data:

# Subset the survey data
survey.data.A <- survey.data[ survey.data$Domain == "A",]

The block information (block area) is unchanged and does not need to be subset.

We need information at the stratum level for domain A which is available on the workbook:

# Domain information for the population for each stratum
stratum.data.A <- readxl::read_excel(
      system.file("extdata", "SampleMooseSurveyData.xlsx", package = "SightabilityModel", mustWork=TRUE),
      sheet="Stratum-DomainA",
      skip=2,.name_repair="universal")

kable(stratum.data.A, row.names=FALSE,
      caption="Stratum totals for Domain A",
      col.names=c("Stratum","Stratum Area","Stratum # blocks")) %>%
      column_spec(column=c(1:3),         width="2cm") %>%
      kable_styling("bordered",position = "center", full_width=FALSE)

We can now estimate abundance, density, or ratios for Domain A in the same way as before with and without corrections for sightability. We show only the estimated abundance for all moose after correcting for sightability. Note the use of survey.data.A and stratum.data.A*

All.Abund.A.corrected <- SightabilityPopR(
      survey.data       = survey.data.A         # raw data for domain A
      ,survey.block.area = survey.block.area    # area of each block
      ,stratum.data      = stratum.data.A       # stratum information for domain A
      ,abundance=~NMoose                        # quantifies to estimate
      ,sight.formula     = observed ~ VegCoverClass # sightability mode;
      ,sight.beta        = beta                     # the beta coefficients for the sightability model
      ,sight.beta.cov    = beta.cov                 # the covariance matrix for the beta coefficients
      ,sight.var.method  = "Wong"                   # method  used to estimate the variances 
)
temp <- All.Abund.A.corrected
temp[,-c(1:3,8)] <- round(temp[,-c(1:3,8)]+.00001,0)  # force to be numeric
temp[,1:10]

Domain at PSU (block) level; domain population information unknown.

In some cases, only the surveyed PSUs can be classified by domain and the stratum total area and stratum total blocks is unknown for the domain. Cochran (1977, Sections 2.13 and 5A.14) discuss this case.

Abundance and ratios can be estimated by setting all survey data to 0 for blocks/groups not part of the domain, and then using the estimating functions on the complete data after zeroing. This may seem strange, but consider the following table

                    PSU
Domain     A     A     A     A     B     B     B     B    B    B
Y          2     3     3     4     5     3     2     3    2    4

The total of $Y$ in Domain $A$ is $2+3+3+4=12$. If we consider the modified data

                     PSU
Domain     A     A     A     A     B     B     B     B    B    B
Y'         2     3     2     3     0     0     0     0    0    0

and find the mean of $\overline{Y'}=(2+3+3+4+0+0+0+0+0+0)/10=1.2$ and multiply the mean of Y' by the population number of PSU (10) we get the domain total of $Total(y)=\overline{Y'} \times 10=1.2 \times 10=12$.

Example

We set the survey data to 0 if the domain is not A (notice that we do NOT delete observations).

# Set number of moose to zero if not part of domain A
survey.data.A.z <- survey.data
survey.data.A.z$NMoose[ survey.data$Domain != "A"] <- 0

We now use the modified survey data (survey.data.A.Z but the original group and stratum data:

All.Abund.A.corrected.z <- SightabilityPopR(
      survey.data        = survey.data.A.z    # raw data with non-domain counts set to zero
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,abundance=~NMoose                      # quantifies to estimate
      ,sight.formula     = observed ~ VegCoverClass # sightability mode;
      ,sight.beta        = beta                     # the beta coefficients for the sightability model
      ,sight.beta.cov    = beta.cov                 # the covariance matrix for the beta coefficients
      ,sight.var.method  = "Wong"                   # method  used to estimate the variances 
)
temp <- All.Abund.A.corrected.z
temp[,-c(1:3,8)] <- round(temp[,-c(1:3,8)]+.00001,0)  # force to be numeric
temp[,1:10]

The estimates will be similar (but not exactly the same) between this analysis and the previous section.

Estimates of ratios can be found in the same way because these are based on abundances

However, estimates of density cannot be found using this method. Refer to Cochran (1977, Section 5A.14, Estimating Domain Means) for more details.

Domain at group level; domain population information known

The analysis in this case is similar to the comparable case when the domain occurs at the PSU (block) level. Now groups are classified as belonging to the domain or not. For example, interest may lie in the number of moose in each vegetation class.

All blocks are used, but now in each block, you will subset the groups that belong to the domain. The hard part is the area in this domain of EACH PSU (block) must also be known. Presumably, this is a GIS exercise in the case of vegetation cover, assuming that entire survey area has been mapped.

Similarly, the total area in the stratum for this domain must be known. This is presumably a GIS exercise.

The analysis proceeds by subsetting the survey data to groups that fall within this domain, but you need to ensure that groups without this domain are still included (with 0 counts for the response variable). You also need to use the modified block areas and modified stratum area in this domain. Please contact me if you are attempting to do this as the programming R is tricky, to say the least.

Domain at group level; domain population information unknown.

In some cases, only the surveyed groups can be classified by domain and the block area, stratum total area and stratum total blocks is unknown for the domain.

We use the same trick as above by setting the survey data to 0 if the domain is not A (notice that we do NOT delete observations).

Example

Suppose we want moose abundance by vegetation cover class. The estimate of total abundance is

All.Abund <- SightabilityPopR(
      survey.data        = survey.data        # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,abundance=~NMoose                      # quantifies to estimate
      ,sight.formula     = observed ~ VegCoverClass # sightability mode;
      ,sight.beta        = beta                     # the beta coefficients for the sightability model
      ,sight.beta.cov    = beta.cov                 # the covariance matrix for the beta coefficients
      ,sight.var.method  = "Wong"                   # method  used to estimate the variances 
)
temp <- All.Abund
temp[,-c(1:3,8)] <- round(temp[,-c(1:3,8)]+.00001,0)  # force to be numeric
temp[,1:10]

We will compute the total abundance for each vegetation cover class

All.Abund.vc <- plyr::llply(unique(survey.data$VegCoverClass), function(VegCover){
   # Set number of moose to zero if not part of domain A
   survey.data.z <- survey.data
   survey.data.z$NMoose[ survey.data$VegCoverClass != VegCover] <- 0  # set to zero
   #browser()
   All.Abund.vc <- SightabilityPopR(
      survey.data        = survey.data.z      # raw data that has been zeroed out
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,abundance=~NMoose                      # quantifies to estimate
      ,sight.formula     = observed ~ VegCoverClass # sightability mode;
      ,sight.beta        = beta                     # the beta coefficients for the sightability model
      ,sight.beta.cov    = beta.cov                 # the covariance matrix for the beta coefficients
      ,sight.var.method  = "Wong"                   # method  used to estimate the variances 
   )
   list(VegCoverClass=VegCover, est=All.Abund.vc)
})

# extract the total abundance

All.Abund.vc.df <- plyr::ldply(All.Abund.vc, function(x){
   #browser()
   res <- data.frame(VegCoverClass=as.character(x$VegCoverClass), stringsAsFactors=FALSE)
   res$estimate  <- x$est[x$est$Stratum==".OVERALL", c("estimate")]
   res$SE        <- x$est[x$est$Stratum==".OVERALL", c("SE"      )]
   res

})
All.Abund.vc.df <- plyr::rbind.fill( All.Abund.vc.df,
                                     data.frame(VegCoverClass=".OVERALL", estimate=sum(All.Abund.vc.df$estimate, stringAsFactors=FALSE)))
All.Abund.vc.df

The estimates sum to the estimate firstly computed, but the estimates from the individual vegetation cover classes are not independent and so the overall standard error cannot be found from the separate estimates by vegetation cover class.

Estimates of ratios can be found in the same way because they are based on abundances.

However, estimates of density cannot be found using this method. Refer to Cochran (1977, Section 5A.14, Estimating Domain Means) for more details.

References

Cochran, W.G. (1977). Sampling techniques. 3rd Edition. Wiley. New York.

Fieberg, J. (2012). Estimating Population Abundance Using Sightability Models: R SightabilityModel Package. Journal of Statistical Software, 51(9), 1-20. URL https://doi.org/10.18637/jss.v051.i09.

Gasaway, W.C., S.D. Dubois, D.J. Reed and S.J. Harbo. (1986). Estimating moose population parameters from aerial surveys. Biol. Pap. Univ. Alaska, No. 22. Institute of Arctic Biology, U. of Alaska, Fairbanks. 108 pp.

Lumley, T. (2019) "survey: analysis of complex survey samples". R package version 3.35-1.

Quayle, J. F., A. G. MacHutchon, and D. N. Jury. (2001). Modeling moose sightability in south-central British Columbia. Alces 37:43–54.

R Core Team (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.

Resource Inventory Standards Committee (RISC). (2002). Aerial-based inventory methods for selected ungulates bison, mountain goat, mountain sheep, moose, elk, deer, and caribou. Version 2. Standards for Components of British Columbia’s Biodiversity No 32. British Columbia Ministry of Sustainable Resource Management, Victoria, B.C.

Reed, D.J. (1989). MOOSEPOP program documentation and instructions. Alaska Department of Fish and Game. Unpublished report.

Steinhorst, K. R. and Samuel, M. D. (1989). Sightability Adjustment Methods for Aerial Surveys of Wildlife Populations. Biometrics 45:415--425.

Unsworth JW, A LF, O GE, Leptich DJ, Zager P (1999). Aerial Survey: User’s Manual. Idaho Department of Fish and Game, electronic edition edition.

Wong, C. (1996). Population size estimation using the modified Horvitz-Thompson estimator with estimated sighting probabilities. Dissertation, Colorado State University, Fort Collins, USA.

Any scripts or data that you put into this service are public.

SightabilityModel documentation built on Aug. 20, 2023, 1:08 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

SightabilityModel Wildlife Sightability Modeling

c-MoosePopR and SightabilityPopR Demonstration In SightabilityModel: Wildlife Sightability Modeling

Availability of code

Introduction

Survey protocol

Data input

Moose data

Block area

Stratum information

Sightability model.

Extracting from AerialSurvey

Reading coefficients of the model

Creation of VegCoverClass variable

Key variables.

Dealing with missing and zero values

Group information

Covariate information

Blocks with no observed groups.

Block information

A single block in a stratum

Stratum information

Analysis

Analysis not adjusting for sightability

Within a stratum

Density

Abundance

Bull to Cow and other ratios

Across all strata

Density

Abundance

Bull to Cow and other ratios

The MoosePopR() function.

Sample Analyses

Density of all moose

Abundance of all moose

Number of bulls and number of cows

Bull to Cow and other ratios

Final object

Adjusting for sightability

Within a stratum

Abundance

Density

Bull to Cow and other ratios

Across all strata

Abundance

Density

Bull to Cow and other ratios

The SightabilityPopR() function

Sample Analyses

Dealing with missing values in the sightability covariates

Abundance of all moose

Density of all moose

Number of bulls and number of cows

Bull to Cow and other ratios

Final object

Domain estimation

Domain at PSU (block) level; domain population information known

Example

Domain at PSU (block) level; domain population information unknown.

Example

Domain at group level; domain population information known

Domain at group level; domain population information unknown.

Example

References

Try the SightabilityModel package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

SightabilityModel
Wildlife Sightability Modeling

c-MoosePopR and SightabilityPopR Demonstration
In SightabilityModel: Wildlife Sightability Modeling