acmeR Vignette

#library(devtools)
#load_all(pkg = '/Users/Jake/Documents/Summer_2015/acme/acmeR')
#library(knitr)
#opts_knit$set(root.dir = '/Users/Jake/Documents/Summer_2015',collapse = TRUE, comment = "#>")
library(acmeR)

Introduction

The purpose of acmeR is to provide an implementation of the Avian and Chiropteran Mortality Estimator (ACME) described in Wolpert [-@wolpert2015]. The problem of finding the true count of avian and chiropteran deaths at wind turbine sites has been of interest for the last few decades, and several methods exist for estimating the true mortality based on the counted carcasses. The R package carcass in particular provides a wide variety of approaches. However, none of the existing methods or packages uses a Weibull distribution to capture the scavenger removal probability or a decreasing detection probability, both shown in Wolpert [-@wolpert2015] to best capture the behavior of the data. This package currently contains three functions: acme.summary(), which reads in the data and provides a complete summary of the components of ACME; acme.post(), which finds the posterior probability of the mortality given the observed counts; and acme.table(), which, given a list of observed counts and the same parameters as acme.post(), creates a table of posterior summaries.

Data used with acmeR

The input data file that acmeR requires is a chronological event list from an Integrated Detection Trial (Warren-Hicks et al. 2012, Chapter 2) studying the rates of removal (by scavengers) and detection (in periodic searches by Field Technicians, or FTs) of carcasses that have been systematically placed (by Project Field Managers, or PFMs). The input data should be in the form of a comma-separated values (CSV) file with (at least) the following six fields:

1) Date, in US format mm/dd/yyyy or ISO 8601 format yyyy-mm-dd
2) Time, in am/pm US format HH:MM:SS AM or 24-hr ISO format HH:MM:SS
3) ID, arbitrary distinct alpha strings unique to each carcass
4) Species, arbitrary distinct alpha strings (e.g. AOU, ABMP, IBP)
5) Event, "Place", "Check", or "Search" (only 1st letter counts)
6) Found, TRUE or FALSE (only 1st letter counts)

Such a file can be generated from a spreadsheet using the "Save as" function with "Save as type" set to "CSV (Comma delimited) (*.csv)". A sample file "altamont.csv" is included with the acmeR distribution.
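A file with the six fields can also be written directly from R. The following is a minimal sketch using base R only; the records, the ID "X001", and the file name "trial_example.csv" are made up for illustration and are not taken from the Altamont data.

```r
# Hypothetical records: one carcass placed, then checked and searched the next day.
trial <- data.frame(
  Date    = c("2011-01-07", "2011-01-08", "2011-01-08"),  # ISO 8601 dates
  Time    = c("08:00:00", "12:00:00", "16:00:00"),        # 24-hr times
  ID      = c("X001", "X001", "X001"),                    # made-up carcass ID
  Species = c("MISC", "MISC", "MISC"),
  Event   = c("Place", "Check", "Search"),
  Found   = c(TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Write to a temporary location so the working directory is left untouched.
csv_path <- file.path(tempdir(), "trial_example.csv")
write.csv(trial, csv_path, row.names = FALSE)

# Read the file back and confirm that the six required field names are present.
check <- read.csv(csv_path, stringsAsFactors = FALSE)
required <- c("Date", "Time", "ID", "Species", "Event", "Found")
all(required %in% names(check))
```

Writing with row.names = FALSE keeps the header row limited to the field names themselves.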

The "Time" field is in the local time zone and is optional. If missing, all Placement events, FT Searches, and PFM Checks are taken to have occurred at 08:00:00, 12:00:00, and 16:00:00, respectively.

The "Species" field can be an arbitrary string, distinct for each species; if AOU bird codes or ABMP mammal codes are used, the software will interpret these for more readable output. If missing, "MISC" is used.

The "Event" field specifies which of three different kinds of events this entry represents: "Place" (carcass placement by a PFM), "Check" (by a PFM, who is aware of where and when carcasses were placed), and "Search" (by a Field Technician, who is not). Only the first letter of "Event" (P, C, or S) is significant.

The "Found" field indicates whether or not the carcass was discovered in this event. Possible values are TRUE and FALSE; only the initial letter (T, F) is significant.

These six fields must have the indicated names, with the indicated capitalization. They may be in any order, and any number of other fields may appear in the file (they will be ignored).
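The "only the first letter counts" rule for "Event" and "Found" can be illustrated with a short sketch. The helper first_letter() below is hypothetical and is not part of acmeR; converting to upper case is an added assumption, since the text above does not say whether the match is case-sensitive.

```r
# Hypothetical helper (not an acmeR function): keep only the first letter,
# trimmed of surrounding whitespace and converted to upper case (an assumption).
first_letter <- function(x) toupper(substr(trimws(as.character(x)), 1, 1))

events <- c("Place", "check", "Search", "S")
found  <- c("TRUE", "T", "false", "F")

event_code <- first_letter(events)        # "P" "C" "S" "S"
found_flag <- first_letter(found) == "T"  # TRUE TRUE FALSE FALSE
```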

To illustrate, the first few lines of the sample "altamont.csv" file are:

"Date","Time","ID","Species","Event","Found"

"1/7/2011","08:00:00 PM","T091","UNBA","Place",TRUE

"1/8/2011","12:00:00 PM","T091","UNBA","Check",TRUE

"1/8/2011","16:00:00 PM","T091","UNBA","Search",FALSE

Using acmeR

acme.summary

This function provides all the information that most users will require. acme.summary() has only one required parameter, fname, which supplies the data described in the previous section. fname can be either a character string giving the path to a CSV file, or a data frame. If fname is not specified, the function stops with an error and writes an example dataset named 'altamont.csv' to the working directory.

acme.summary()
## Error: Missing csv file 'fname'.  Example 'altamont.csv' created in
##      working directory.  Try acme.summary('altamont.csv').

Running acme.summary() with no file name specified also allows a user to examine an example dataset to get a better understanding of the data required. (Note that the same data is attached with the library and can be accessed as a data frame by data(altamont), but some users might find the CSV format more convenient.)

The following parameters are optional:

The output from acme.summary() is two-fold: the details of the model are printed to the console, and the model estimates are returned as a list. For most users, the information printed to the console will be sufficient. However, those intent on finding posterior mortality probabilities through acme.post() or acme.table() will find use for the returned values.

For example, suppose we are using the Altamont dataset. To use the csv that we generated above (which should be in the user's working directory), we run the following line:

acme.val <- acme.summary(fname = 'altamont.csv',spec=c("BHCO", "HOWR"))

acme.summary() can also be run with a data frame rather than a CSV file. The following line of code will run the function on a data frame version of the CSV file we created earlier, which (again) can be accessed by data(altamont). (Note that CSV files given by a character string and data frames given by R objects are currently the only two classes accepted for fname.)

acme.val <- acme.summary(fname=altamont,spec=c("BHCO","HOWR"))
acme.val

As shown, the output to the console is quite extensive. The top gives a summary of the data, with our subset species listed (here we see Altamont includes no House Wren carcasses) as well as useful information such as search and check intervals. The middle paragraphs describe the model and MLE parameters, while the bottom of the console output displays the ACME inverse-inflation factor R*. Its value, rounded to four decimal places, is available as round(acme.val$Rstar, 4), so the ACME estimate for mortality given the number of carcasses C would be C / R*.
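As a worked instance of the C / R* calculation, the sketch below uses a made-up value of R*. It is not the value produced by the Altamont fit; in practice acme.val$Rstar would be used instead.

```r
# Hypothetical inverse-inflation factor; stands in for acme.val$Rstar.
Rstar <- 0.25

# Observed carcass count.
C <- 5

# ACME point estimate of mortality: observed count divided by R*.
M_hat <- C / Rstar
M_hat
```

Because R* is below one, the estimate always exceeds the observed count, which reflects carcasses that were removed or missed before being counted.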

The other output is the list that the function returns. Its components are: params, a 5-element numeric vector containing the MLE parameters calculated to run the model; Rstar, a numeric containing the same R* that is printed to the console; T, a numeric containing the first component of R* as calculated in equation 7b in Wolpert [-@wolpert2015]; and I, a numeric containing the mean interval time between searches. These values will be useful when running acme.post() or acme.table().

acme.post

acme.post() allows the user to plot the posterior distribution of true mortality M given observed counts C. The model specifies the mortality as being drawn from a Poisson distribution with mean $mI$, where $I$ is the interval length and $m$ is a parameter with a Gamma($\xi$, $\lambda$) prior distribution. Appendix A2 of Wolpert [-@wolpert2015] gives the full model description and calculates the posterior of M given C to be a scaled Gauss hypergeometric function.
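Written out, the hierarchy described in the paragraph above is as follows. This only restates the two distributional assumptions from the text; reading $\xi$ as the shape and $\lambda$ as the rate of the Gamma prior is an assumption here, and the full posterior derivation remains in Appendix A2 of the paper.

```latex
\begin{align*}
m &\sim \mathrm{Gamma}(\xi, \lambda), \\
M \mid m &\sim \mathrm{Poisson}(m I).
\end{align*}
```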

acme.post() has 10 parameters, with defaults relating to the example from the paper (and thus this Vignette). The parameters are as follows:

# Carcass count of 5, with output from acme.summary() as parameters
acme.post(C = 5, Rstar = acme.val$Rstar, T = acme.val$T, I = acme.val$I,
          xi = 1/2, lam = 0)
# Same call without the plot, keeping the returned summary
res <- acme.post(C = 5, Rstar = acme.val$Rstar, T = acme.val$T, I = acme.val$I,
                 xi = 1/2, lam = 0, plotit = FALSE)

Here we see up to 38 posterior probability values for the mortality count M given the observed count C. In this case the observed fatality count is 5, the ACME estimator ($\hat{M}$) is given by round(res[1,"M_hat"],2), and the posterior mean ($\overline{M}$) by round(res[1,"Post_Mean"],2). This plot makes explicit that while the model may estimate close to 20 mortalities, there is a wide range of true mortalities with non-negligible posterior probabilities. Even if the count is 5, there is a small, nonzero chance that the true number of mortalities in this search is fewer than 5 (i.e. that bleed-through occurred).

The 90% HPD credible interval runs from M=8 to M=34 (as shown by blue circles), and this range actually covers 90.57% of the density (because the distribution is discrete, exact coverage probabilities cannot generally be obtained, and HPD credible intervals will cover at least the specified amount). The red squares show the 50% HPD credible interval, from 13 to 24 in this case.
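The over-coverage of a discrete HPD interval can be illustrated with a small sketch. The function hpd_discrete() is hypothetical and is not how acme.post() is implemented, and a Poisson distribution with mean 20 is used purely as a stand-in for the ACME posterior.

```r
# Hypothetical helper (not an acmeR function): smallest set of support points,
# taken in order of decreasing probability, whose total mass reaches gam.
hpd_discrete <- function(pmf, support, gam = 0.9) {
  ord <- order(pmf, decreasing = TRUE)       # most probable points first
  k <- which(cumsum(pmf[ord]) >= gam)[1]     # number of points needed
  keep <- support[ord[seq_len(k)]]
  list(lower = min(keep), upper = max(keep),
       n = k, coverage = sum(pmf[ord[seq_len(k)]]))
}

# Stand-in distribution: Poisson with mean 20, truncated to 0:60 and renormalised.
support <- 0:60
pmf <- dpois(support, lambda = 20)
pmf <- pmf / sum(pmf)

res90 <- hpd_discrete(pmf, support, gam = 0.9)
res50 <- hpd_discrete(pmf, support, gam = 0.5)
res90$coverage  # at least 0.9, typically slightly above it
```

Points are added one at a time, so the accumulated mass steps past the target rather than landing on it exactly, which is why the reported coverage exceeds the nominal level.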

#Carcass count at 3, different hpd credible intervals values
acme.post(C=3,Rstar=acme.val$Rstar, T=acme.val$T, I = acme.val$I, 
        xi = 1/2, lam = 0, gam = c(.9,.95))

This graph depicts a different observed count (here C = 3) and a different set of levels for the HPD credible intervals (here 0.9 and 0.95 rather than the default 0.5 and 0.9).

acme.table

If the user is interested in summary statistics for multiple observed counts, but does not need plots for each one, acme.table() is valuable. This function serves as a wrapper for acme.post() with plotit=FALSE, concatenating the results for different values of C (acme.post() can only be run with a single value for C). The function outputs a matrix with a row for each observed count, along with the ACME estimate, posterior mean, and specified highest posterior density credible intervals. With the exception of the counts (and the lack of a plotit parameter), the parameters of acme.table() are identical to those of acme.post().
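The wrapper pattern described above, one call per count with the rows stacked into a matrix, can be sketched in base R. Both functions below are hypothetical stand-ins: one_count() only computes the C / R* estimate and is not acme.post(), and the R* value of 0.25 is made up.

```r
# Hypothetical single-count summary; stands in for acme.post(..., plotit = FALSE).
one_count <- function(C, Rstar) {
  c(C = C, M_hat = C / Rstar)
}

# Wrapper: apply the single-count function to each count and stack the rows.
count_table <- function(C, Rstar) {
  do.call(rbind, lapply(C, one_count, Rstar = Rstar))
}

tab <- count_table(C = 0:5, Rstar = 0.25)  # 0.25 is a made-up R* value
tab
```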

acme.table(C=0:5, Rstar=acme.val$Rstar, T=acme.val$T, I = acme.val$I, 
        xi = 1/2, lam = 0, gam=c(.5,.9))

References



acmeR documentation built on May 2, 2019, 9:24 a.m.