dfuncEstim  R Documentation 
Fit a specific detection function to off-transect or off-point (radial) distances using maximum likelihood. Distance functions are fitted to individual distance observations, not histogram bin heights, despite plot methods that draw histogram bars.
dfuncEstim(
formula,
detectionData,
siteData,
likelihood = "halfnorm",
pointSurvey = FALSE,
w.lo = units::set_units(0, "m"),
w.hi = NULL,
expansions = 0,
series = "cosine",
x.scl = units::set_units(0, "m"),
g.x.scl = 1,
observer = "both",
warn = TRUE,
transectID = NULL,
pointID = "point",
outputUnits = NULL,
control = RdistanceControls()
)
formula 
A standard formula object (e.g., Group Sizes: Nonunity group sizes are specified using 
detectionData 
A data frame containing detection distances (either perpendicular for line-transect or radial for point-transect designs), with one row per detected object or group. This data frame must contain at least the following information:
See example data set 
siteData 
A data.frame containing site (transect or point) IDs and any
site-level covariates to include in the detection function.
Every unique surveyed site (transect or point) is represented on
one row of this data set, whether or not targets were sighted
at the site. See arguments See Data frame requirements for situations in which

likelihood 
String specifying the likelihood to fit. Built-in likelihoods at present are "uniform", "halfnorm", "hazrate", "negexp", and "Gamma". See the vignette for a way to use user-defined likelihoods. 
pointSurvey 
A logical scalar specifying whether input data come from point-transect surveys (TRUE), or line-transect surveys (FALSE). 
w.lo 
Lower or left-truncation limit of the distances in distance data.
This is the minimum possible off-transect distance. Default is 0. If

w.hi 
Upper or right-truncation limit of the distances
in 
expansions 
A scalar specifying the number of terms
in 
series 
If 
x.scl 
The x coordinate (a distance) at which to scale the
sightability function to 
g.x.scl 
Height of the distance function at coordinate x.
The distance function
will be scaled so that g( 
observer 
A numeric scalar or text string specifying whether observer 1
or observer 2 or both were full-time observers.
This parameter dictates which set of observations form the denominator
of a double-observer system.
If, for example, observer 2 was a data recorder and part-time observer,
or if observer 2 was the pilot, set 
warn 
A logical scalar specifying whether to issue
an R warning if the estimation did not converge or if one
or more parameter estimates are at their boundaries.
For estimation, 
transectID 
A character vector naming the transect ID column(s) in

pointID 
When point-transects are used, this is the
ID of points on a transect. If single points are surveyed,
meaning surveyed points were not grouped into transects, each 'transect' consists
of one point. In this case, set 
outputUnits 
A string giving the symbolic measurement
units that results should be reported in. Any
distance measurement unit in 
control 
A list containing optimization control parameters such
as the maximum number of iterations, tolerance, the optimizer to use,
etc. See the

An object of class 'dfunc'. Objects of class 'dfunc' are lists containing the following components:
parameters 
The vector of estimated parameter values. For built-in likelihoods, the length of this vector is one (for the function's parameter) plus the number of expansion terms, plus one more if the likelihood is either 'hazrate' or 'uniform' (hazrate and uniform have two parameters). 
varcovar 
The variance-covariance matrix for coefficients of the distance function, estimated by the inverse of the Hessian of the fit evaluated at the estimates. There is no guarantee this matrix is positive-definite, and it should be viewed with caution. Error estimates derived from bootstrapping are generally more reliable. 
loglik 
The maximized value of the log likelihood (more specifically, the minimized value of the negative log likelihood). 
convergence 
The convergence code. This code
is returned by 
like.form 
The name of the likelihood. This is
the value of the argument 
w.lo 
Left-truncation value used during the fit. 
w.hi 
Right-truncation value used during the fit. 
detections 
A data frame of detections within the strip
or circle used in the fit. Column 'dist' contains the
observed distances.
Column 'groupSize' contains group sizes associated with
the values of 'dist'. Group
sizes are only used in 
covars 
Either NULL if no covariates are included in the
detection function, or a 
model.frame 
A 
siteID.cols 
A vector containing the transect ID column names in 
expansions 
The number of expansion terms used during estimation. 
series 
The type of expansion used during estimation. 
call 
The original call of this function. 
call.x.scl 
The input or user-requested distance at which the distance function is scaled. 
call.g.x.scl 
The 
call.observer 
The value of input parameter 
fit 
The fitted object returned by 
factor.names 
The names of any factors in 
pointSurvey 
The input value of 
formula 
The formula specified for the detection function. 
control 
A list containing values of the 'control' parameters
set by 
outputUnits 
The measurement units used for output. All distance measurements are converted to these units internally. 
x.scl 
The actual distance at which
the distance function is scaled to some value.
That is, this is the actual x at
which g(x) = 
g.x.scl 
The actual height of the distance function
at a distance of 
Rdistance accommodates two kinds of transects: continuous and point. On continuous transects, detections can occur at any point along the route, and these are line-transects. On point transects, detections can only occur at a series of stops (points), and these are point-transects. Transects are the basic sampling unit in both cases. Columns named in transectID are sufficient to specify unique line-transects. The combination of transectID and pointID specifies unique sampling locations along point-transects. See Input data frames below for more detail.
To save space and to easily specify sites without detections, all site IDs, regardless of whether a detection occurred there, and site-level covariates are stored in the siteData data frame. Detection distances and group sizes are measured at the detection level and are stored in the detectionData data frame.
The following explains conditions under which various combinations of the input data frames are required.
Detection data and site data both required:
Both detectionData and siteData are required if site-level covariates are specified on the right-hand side of formula. Detection-level covariates are not currently allowed. Both detectionData and siteData data frames are required to estimate abundance later in abundEstim.
Detection data only required:
Only detectionData is required when covariates are not included in the distance function (i.e., the right-hand side of formula is "~1" or "~groupsize(groupSize)"). Note that dfuncEstim does not need to know transect IDs (or group sizes) in order to estimate a distance function; however, group sizes and transect IDs are stored and used to estimate abundance in function abundEstim. Both the detectionData and siteData data frames are required in abundEstim.
Neither detection data nor site data required:
Neither detectionData nor siteData is required if all variables specified in formula are within the scope of dfuncEstim (e.g., in the global working environment) and abundance estimates are not required. Regular R scoping rules apply when the call to dfuncEstim is embedded in a function. This case will produce distance functions only. Abundance cannot later be estimated because transects and transect lengths cannot be specified outside of a data frame. If abundance will be estimated, use either case 1 or 2.
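The first two cases might look like the following sketch. It assumes Rdistance version 3.x is installed and that its example data sets sparrowDetectionData and sparrowSiteData share a site ID column and that sparrowSiteData has a site-level column named observer; those data set and column names are assumptions not stated on this page. The calls are guarded so nothing runs when the package (or a compatible version) is unavailable.

```r
# Sketch only: runs when a 3.x version of Rdistance is installed (assumption)
haveRdistance <- requireNamespace("Rdistance", quietly = TRUE) &&
  utils::packageVersion("Rdistance") >= "3.0.0" &&
  utils::packageVersion("Rdistance") < "4.0.0"

if (haveRdistance) {
  library(Rdistance)
  data(sparrowDetectionData)
  data(sparrowSiteData)

  # No covariates: detectionData alone is enough to fit the distance function
  fitNoCovars <- dfuncEstim(formula = dist ~ 1,
                            detectionData = sparrowDetectionData)

  # Site-level covariate (assumed column 'observer'): both data frames are needed
  fitCovars <- dfuncEstim(formula = dist ~ observer,
                          detectionData = sparrowDetectionData,
                          siteData = sparrowSiteData)
  print(class(fitCovars))
}
```

If abundance will be estimated afterward, supplying siteData in both calls avoids having to refit.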
The input data frames, detectionData and siteData, must be mergeable on unique sites. For line-transects, site IDs specify transects or routes and are unique values of the transectID column in siteData. In this case, the following merge must work: merge(detectionData, siteData, by = transectID). For point-transects, site IDs specify individual points and are unique values of the combination paste(transectID, pointID). In this case, the following merge must work: merge(detectionData, siteData, by = c(transectID, pointID)).
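The merge requirement can be checked before fitting. The sketch below uses base R only and toy data; the column names transectID, dist, and length are illustrative assumptions, not requirements taken from this page.

```r
# Toy data; column names are illustrative assumptions
detectionData <- data.frame(
  transectID = c("A", "A", "B"),
  dist       = c(12, 40, 7)
)
siteData <- data.frame(
  transectID = c("A", "B", "C"),   # transect C was surveyed but had no detections
  length     = c(500, 500, 500)
)

# The line-transect merge: every detection should find exactly one site row
merged <- merge(detectionData, siteData, by = "transectID")
stopifnot(nrow(merged) == nrow(detectionData))

# Sites without detections appear only in siteData
noDetections <- setdiff(siteData$transectID, detectionData$transectID)
print(noDetections)
```

For point-transects the same check would use by = c(transectID, pointID) with the relevant column names substituted.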
By default, transects are unique combinations of the common variables in the detectionData and siteData data frames if both data frames are specified (i.e., unique values of intersect(names(detectionData), names(siteData))). If siteData is not specified and transectID is not given, transects are assumed to be identified in a column named siteID in detectionData. Either way (i.e., either transectID = "siteID" or specified as something else), the column(s) containing transect IDs must be correct here if abundance is to be estimated later. Routine abundEstim requires transect IDs for bootstrapping because it resamples unique values of the composite transect ID column(s). abundEstim uses the value of transectID specified here; hence, users cannot change transect IDs between calls to dfuncEstim and abundEstim, and all transectID columns must be present in both data frames even though they may not be used until later.
An error occurs if both detectionData and siteData are specified but no common columns exist. Duplicate transectID values are not allowed in siteData but are allowed in detectionData because multiple detections can occur on a single transect or at a single site. If the same site is surveyed in multiple years, specify another level of transect ID; for example, transectID = c("year","transectID").
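The uniqueness rule for siteData can be tested directly. This base R sketch uses toy data with hypothetical column names (year, transectID, length) to show why a second ID level is needed when sites are revisited.

```r
# Toy site data for two survey years; all column names are illustrative
siteData <- data.frame(
  year       = c(2021, 2021, 2022, 2022),
  transectID = c("A", "B", "A", "B"),
  length     = c(500, 500, 500, 500)
)

# transectID alone repeats across years, which siteData does not allow ...
dupSingle <- anyDuplicated(siteData["transectID"]) > 0

# ... but the composite key (year, transectID) identifies each row uniquely
idCols <- c("year", "transectID")
dupComposite <- anyDuplicated(siteData[idCols]) > 0

print(c(single = dupSingle, composite = dupComposite))
```

Here idCols plays the role of the value that would be passed as transectID.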
As of Rdistance version 3.0.0, measurement units are required on all distances. This includes off-transect distances, radial distances, truncation distances (w.lo and w.hi), transect lengths, and study area size. In dfuncEstim, units are required on the following: detectionData$dist; w.lo (unless it is zero); w.hi (unless it is NULL); and x.scl. In abundEstim, units are required on siteData$length and area. All units are 1-dimensional except those on area, which are 2-dimensional. Requiring units ensures that internal calculations and results (e.g., ESW and abundance) are correct and that output units are clear.
Input distances can have variable units. For example, input distances can be specified in "m", w.hi in "in", and w.lo in "km". Internally, all distances are converted to the units specified by outputUnits (or the units of input distances if outputUnits is NULL), and all output is reported in units of outputUnits.
Measurement units can be assigned using units()<- after attaching the units package or with x <- units::set_units(x, "<units>"). See units::valid_udunits() for a list of valid symbolic units.
If measurements are truly unitless, or measurement units are unknown, set RdistanceControls(requireUnits = FALSE). This suppresses all unit checks and conversions. Users are on their own to make sure inputs are scaled correctly and that output units are known.
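Assigning and converting units might look like the sketch below. It assumes the units package is installed; the variable names and values are illustrative, and the explicit conversion only mimics what Rdistance is described as doing internally.

```r
# Assumes the 'units' package is installed
library(units)

# Two ways to attach units to plain numbers
dist <- set_units(c(12, 40, 7), "m")   # detection distances in meters
x <- c(1, 2)
units(x) <- "m"                        # replacement-function form

# A truncation distance supplied in a different unit
w.hi <- set_units(0.1, "km")

# Convert to a common unit, analogous to what outputUnits requests
w.hi.m <- set_units(w.hi, "m")
print(w.hi.m)

# Compare on bare numbers once both are expressed in meters
inStrip <- drop_units(dist) <= drop_units(w.hi.m)
```

Keeping units attached until the final comparison makes a mismatch (for example, meters against kilometers) visible rather than silent.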
Buckland, S.T., D.R. Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. (2001) Introduction to distance sampling: estimating abundance of biological populations. Oxford University Press, Oxford, UK.
abundEstim, autoDistSamp.
Likelihood-specific help files (e.g., halfnorm.like).
See package vignettes for additional options.
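# Load example sparrow data (line transect survey type)
library(Rdistance)
data(sparrowDetectionData)

# Fit the default half-normal distance function without covariates
dfunc <- dfuncEstim(formula = dist ~ 1
                  , detectionData = sparrowDetectionData)
dfunc        # print a summary of the fit
plot(dfunc)  # histogram of distances with the fitted function overlaid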