dfuncEstim: dfuncEstim - Estimate a distance-based detection function
In tmcd82070/Rdistance: Density and Abundance from Distance-Sampling Surveys

dfuncEstim

R Documentation

Description

Fits a detection function using maximum likelihood.

Usage

dfuncEstim(data, ...)

Arguments

data

An RdistDf data frame. RdistDf data frames contain one row per transect and a list-based column. The list-based column contains a data frame with detection information. The detection information data frame on each row contains (at least) distances and group sizes of all targets detected on the transect. Function RdistDf creates RdistDf data frames from separate transect and detection data frames. Function is.RdistDf checks whether a data frame is an RdistDf.

...

Arguments passed on to dE.single, dE.multi

formula: A standard formula object (for example, dist ~ 1 or dist ~ covar1 + covar2). The left-hand side (before ~) is the name of the vector containing off-transect or radial detection distances. The right-hand side contains the names of covariate vectors to fit in the detection function, and potentially group sizes. Covariates can be either detection level or transect level and can appear in data or exist in the global working environment. Regular R scoping rules apply.
likelihood: String specifying the likelihood to fit. Built-in likelihoods at present are "halfnorm", "hazrate", and "negexp".
w.lo: Lower or left-truncation limit of the distances in distance data. This is the minimum possible off-transect distance. Default is 0. If w.lo is greater than 0, it must be assigned measurement units using units(w.lo) <- "<units>" or w.lo <- units::set_units(w.lo, "<units>"). See examples in the help for set_units.
w.hi: Upper or right-truncation limit of the distances in dist. This is the maximum off-transect distance that could be observed. If unspecified (i.e., NULL), right-truncation is set to the maximum of the observed distances. If w.hi is specified, it must have associated measurement units. Assign measurement units using units(w.hi) <- "<units>" or w.hi <- units::set_units(w.hi, "<units>"). See examples in the help for set_units.
expansions: A scalar specifying the number of terms in series to compute. Depending on the series, this could be 0 through 5. The default of 0 equates to no expansion terms of any type. No expansion terms are allowed (i.e., expansions is forced to 0) if covariates are present in the detection function (i.e., right-hand side of formula includes something other than 1).
series: If expansions > 0, this string specifies the type of expansion to use. Valid values at present are 'simple', 'hermite', and 'cosine'.
x.scl: The x coordinate (a distance) at which the detection function will be scaled. x.scl can be a distance or the string "max". When x.scl is specified (i.e., not 0 or "max"), it must have measurement units assigned using either library(units); units(x.scl) <- "<units>" or x.scl <- units::set_units(x.scl, "<units>"). See units::valid_udunits() for valid symbolic units.
g.x.scl: Height of the distance function at coordinate x.scl. The distance function will be scaled so that g(x.scl) = g.x.scl. If g.x.scl is not a data frame, it must be a numeric value (vector of length 1) between 0 and 1.
warn: A logical scalar specifying whether to issue an R warning if the estimation did not converge or if one or more parameter estimates are at their boundaries. For estimation, warn should generally be left at its default value of TRUE. When computing bootstrap confidence intervals, setting warn = FALSE turns off annoying warnings when an iteration does not converge. Regardless of warn, after completion all messages about convergence and boundary conditions are printed by print.dfunc, print.abund, and plot.dfunc.
outputUnits: A string specifying the symbolic measurement units for results. Valid units are listed in units::valid_udunits(). The strings for common distance symbolic units are: "m" - meters, "ft" - feet, "cm" - centimeters, "mm" - millimeters, "mi" - miles, "nmile" - nautical miles ("nm" is nanometers), "in" - inches, "yd" - yards, "km" - kilometers, "fathom" - fathoms, "chains" - chains, and "furlong" - furlongs. If outputUnits is unspecified (NULL), output units will be the same as those on distances in data.
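
As an illustration of how the arguments above might be combined, the following minimal sketch fits a hazard-rate detection function with a right-truncation limit that carries measurement units. It reuses the sparrow example data from the Examples section. The truncation distance of 150 m is an arbitrary value chosen for illustration, not a recommendation.

```r
library(Rdistance)

data(sparrowDetectionData)
data(sparrowSiteData)
sparrowDf <- RdistDf(sparrowSiteData, sparrowDetectionData)

# Right-truncation limit; it must carry measurement units.
# The value 150 is an assumption made for this sketch.
w.hi <- units::set_units(150, "m")

fit <- dfuncEstim(sparrowDf,
                  formula = dist ~ 1,       # no covariates
                  likelihood = "hazrate",   # one of the built-in likelihoods
                  w.hi = w.hi,
                  outputUnits = "m")        # report results in meters
fit
```

Because w.lo is left at its default of 0, it does not need units here.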

Details

Optimization and estimation controls can be modified using options(). See RdistanceControls.

Value

An object of class 'dfunc'. Objects of class 'dfunc' are lists containing the following components:

`par`	The vector of estimated parameter values. Length of this vector for built-in likelihoods is one (for the function's parameter) plus the number of expansion terms plus one if the likelihood is 'hazrate' (which has two parameters).
`varcovar`	The variance-covariance matrix for coefficients of the distance function, estimated by the inverse of the fit's Hessian evaluated at the estimates. Rdistance estimates the Hessian as the second derivative of the log likelihood surface at the final estimates, where second derivatives are estimated by numeric differentiation (see `secondDeriv`). There is no guarantee that this matrix is positive-definite, and it should be viewed with caution. Error estimates derived from bootstrapping are generally more reliable; i.e., re-compute coefficient confidence intervals using the bootstrap values in component `$B` of an abundance object.
`loglik`	The maximized value of the log likelihood.
`convergence`	The convergence code. This code is returned by `optim` or `nlminb`. Values other than 0 indicate suspect convergence.
`likelihood`	The name of the likelihood. This is the value of the argument `likelihood`.
`w.lo`	Left-truncation value used during the fit.
`w.hi`	Right-truncation value used during the fit.
`mf`	A model frame of detections within the strip or circle used in the fit. Column 'dist' contains the observed distances. Column 'offset(...)' contains group sizes associated with the values of 'dist'. Group sizes are only used in `abundEstim`. This model frame contains only non-missing distances between `w.lo` and `w.hi`.
`model.frame`	A `model.frame` object containing observed distances (the 'response'), covariates specified in the formula, and group sizes if they were specified. If specified, the name of the group size column is "offset(-variable-)", not "groupsize(-variable-)", because internally it is easier to treat group sizes as an offset in the model. This component is a proper `model.frame` and contains both 'terms' and 'contrasts' attributes.
`siteID.cols`	A vector containing the transect ID column names in `detectionData` and `siteData`. Transect IDs can be a composite of two or more columns and hence this component can have length greater than 1.
`expansions`	The number of expansion terms used during estimation.
`series`	The type of expansion used during estimation.
`call`	The original call of this function.
`call.x.scl`	The input or user requested distance at which the distance function is scaled.
`call.g.x.scl`	The input value specifying the height of the distance function at a distance of `call.x.scl`.
`call.observer`	The value of input parameter `observer`. The input `observer` parameter is only applicable when `g.x.scl` is a data frame.
`fit`	The fitted object returned by `optim`. See documentation for `optim`.
`factor.names`	The names of any factors in `formula`.
`pointSurvey`	The input value of `pointSurvey`. This is TRUE if distances are radial from a point and FALSE if distances are perpendicular off-transect.
`formula`	The formula specified for the detection function.
`control`	A list containing values of the 'control' parameters set by `RdistanceControls`.
`outputUnits`	The measurement units used for output. All distance measurements are converted to these units internally.
`x.scl`	The actual distance at which the distance function is scaled to some value. That is, this is the actual x at which g(x) = `g.x.scl`. Note that `call.x.scl` = `x.scl` unless `call.x.scl` == "max", in which case `x.scl` is the distance at which g() is maximized.
`g.x.scl`	The actual height of the distance function at a distance of `x.scl`. Note that `g.x.scl` = `call.g.x.scl` unless `call.g.x.scl` is a multiple observer data frame, in which case `g.x.scl` is the actual height of the distance function at `x.scl` computed from the multiple observer data frame.
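
Because a 'dfunc' object is a list, its components can be extracted with `$`. The sketch below assumes the component names listed above (`par`, `loglik`, `convergence`, `varcovar`) are present in the installed version of Rdistance; check names(fit) if in doubt.

```r
library(Rdistance)

data(sparrowDetectionData)
data(sparrowSiteData)
sparrowDf <- RdistDf(sparrowSiteData, sparrowDetectionData)
fit <- dfuncEstim(sparrowDf, formula = dist ~ 1)

names(fit)        # list all components actually stored in the object

fit$par           # estimated parameter value(s)
fit$loglik        # maximized log likelihood
fit$convergence   # 0 indicates no convergence problems were reported

# Approximate standard errors from the Hessian-based matrix.
# These can be NaN if the matrix is not positive-definite.
se <- sqrt(diag(fit$varcovar))
se
```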

Group Sizes

To specify non-unity group sizes, use groupsize() on the RHS of formula. When group sizes are not all 1, they must appear in a column of the 'detections' list-column of data. For example, d ~ habitat + groupsize(number) specifies distances in column d, one covariate named habitat, and that column number contains the number of individuals associated with each detection. If group sizes are not specified, all group sizes are assumed to be 1.
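
A minimal sketch of a formula with group sizes follows. It assumes the sparrow detection data contain a group size column named groupsize, and it includes no other covariates.

```r
library(Rdistance)

data(sparrowDetectionData)
data(sparrowSiteData)
sparrowDf <- RdistDf(sparrowSiteData, sparrowDetectionData)

# 'dist' holds the distances; the column 'groupsize' (assumed to exist
# in the detection data) holds the number of individuals per detection
fitGs <- dfuncEstim(sparrowDf,
                    formula = dist ~ 1 + groupsize(groupsize))
fitGs
```

Per the description of the `mf` component above, group sizes are only used in abundEstim, so they are stored with the fit for later abundance estimation.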

Contrasts

Factor contrasts in Rdistance are specified the same way as in lm or glm. By default, Rdistance uses contrasts in getOption("contrasts"). To change contrasts, use a statement like options(contrasts = c(unordered = "contr.SAS", ordered = "contr.poly")). Or, to set contrasts for a specific factor in the input data frame, use contrasts(df$A) <- "contr.sum" or similar. See contrasts or the contrasts.arg of model.matrix.
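
The effect of a contrast setting can be inspected with base R before fitting. The sketch below uses a small hypothetical data frame df with a three-level factor A. It does not call Rdistance and only shows how contrasts() changes the design matrix columns.

```r
# Hypothetical data frame with one three-level factor
df <- data.frame(A = factor(c("a", "b", "c", "a", "b", "c")))

# Design matrix under the current default contrasts
mmDefault <- model.matrix(~ A, data = df)

# Switch this factor to sum-to-zero contrasts
contrasts(df$A) <- "contr.sum"
mmSum <- model.matrix(~ A, data = df)

colnames(mmSum)   # "(Intercept)" "A1" "A2"
mmSum[3, ]        # the row for level "c" is coded -1 in both A1 and A2
```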

Measurement Units

As of Rdistance version 3.0.0, measurement units are required on all physical distances. Requiring units ensures that internal calculations and results (e.g., ESW and abundance) are correct and that output units are clear. Measurement units are required on off-transect distances, radial distances, truncation distances (w.lo, unless it is zero; and w.hi, unless it is NULL), scale locations (x.scl, unless it is zero), line-transect lengths, and study area size. All units are 1-dimensional except those on study area, which are 2-dimensional.

Physical measurement units can vary. For example, off-transect distances can be meters ("m"), w.hi can be inches ("in"), and w.lo can be kilometers ("km"). Internally, all distances are converted to the units specified by outputUnits (or the units of input distances if outputUnits is NULL), and all output is reported in units of outputUnits. Valid conversions must exist between units or an error is thrown. For example, meters cannot be converted into hectares.

Measurement units can be assigned using units()<- after attaching the units package or with x <- units::set_units(x, "<units>"). See units::valid_udunits() for a list of valid symbolic units.

If measurements are truly unit-less, or measurement units are unknown, set options(Rdist_requireUnits = FALSE). This suppresses all unit checks and conversions. Users are on their own to make sure inputs are scaled correctly and that output units are known.
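
The following sketch shows both ways of assigning units and what a valid and an invalid conversion look like. It uses only the units package. The distance values are hypothetical.

```r
library(units)

# Hypothetical distances, assumed to have been measured in meters
d <- c(10, 25, 40)
units(d) <- "m"

# A truncation limit recorded in feet
w.hi <- set_units(100, "ft")

# Feet to meters is a valid conversion because both are lengths
w.hi.m <- set_units(w.hi, "m")
w.hi.m

# A length cannot be converted to an area, so this call fails
bad <- try(set_units(d, "hectare"), silent = TRUE)
inherits(bad, "try-error")
```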

References

Buckland, S.T., D.R. Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. (2001) Introduction to distance sampling: estimating abundance of biological populations. Oxford University Press, Oxford, UK.

Examples

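# Sparrow line transect example
library(Rdistance)
data(sparrowDetectionData)
data(sparrowSiteData)

# Combine transect and detection data into an RdistDf
sparrowDf <- RdistDf(sparrowSiteData, sparrowDetectionData)

# Fit a detection function with no covariates
dfunc <- dfuncEstim(sparrowDf,
                    formula = dist ~ 1
                  )
summary(dfunc)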

  
data(sparrowDfuncObserver) # pre-estimated object
## Not run:                  
# Command to produce 'sparrowDfuncObserver'
sparrowDfuncObserver <- sparrowDf |> 
         dfuncEstim( 
           formula = dist ~ observer
         )

## End(Not run)     
sparrowDfuncObserver
summary(sparrowDfuncObserver)
plot(sparrowDfuncObserver)

tmcd82070/Rdistance documentation built on April 13, 2025, 1:38 p.m.