RdistDf | R Documentation |
Makes an Rdistance data frame from separate transect and detection data frames. Rdistance data frames are nested data frames with one row per transect. Detection information for each transect appears in a list-based column that itself contains a data frame. See Rdistance Data Frames.
Rdistance data frames can be constructed using calls to dplyr::nest_by and dplyr::right_join, with subsequent attribute assignment (see Examples). This routine is a convenient wrapper for those calls.
RdistDf(
  transectDf,
  detectionDf,
  by = NULL,
  pointSurvey = FALSE,
  observer = "single",
  .detectionCol = "detections",
  .effortCol = NULL
)
transectDf |
A data frame with one row per transect and
columns containing information about the entire transect.
At a minimum, this data frame must contain the transect's ID so
it can be merged with |
detectionDf |
A data frame containing detection information associated with each transect. At a minimum, each row of this data frame must contain the following:
Optional columns in 'detectionDf':
|
by |
A character vector of variables to use in the join. The right-hand
side of this join identifies unique transects (unique
rows) in both |
pointSurvey |
If TRUE, observations were made from discrete points (i.e., during a point-transect survey) and distances are radial from observation point to target. If FALSE, observations were made along a continuous transect (i.e., during a line-transect survey) and distances are from target to nearest point on the transect (i.e., perpendicular to transect). |
observer |
Type of observer system. Legal values are "single" for single observer systems, "1given2" for a double observer system wherein observations made by observer 1 are tested against observations made by observer 2, "2given1" for a double observer system wherein observations made by observer 2 are tested against observations made by observer 1, and "both" for a double observer system wherein observations made by both observers are tested against the other and combined. |
.detectionCol |
Name of the list column that will contain detection data frames. Default name is "detections". Detection distances (LHS of 'dfuncEstim' formula) and group sizes are normally columns in the nested detection data frames embedded in '.detectionCol'. |
.effortCol |
For continuous line transects,
this specifies the name of a column in |
For valid bootstrap estimates of confidence intervals (computed in abundEstim), each row of the nested data frame must represent one transect (more generally, one sampling unit), and none should be duplicated. The combination of transect columns in by (i.e., the LHS of the merge, or "a" and "b" of by = c("a" = "d", "b" = "c") for example) should specify unique transects and unique rows of transectDf. Warning: If by does not specify unique rows of transectDf, dplyr::left_join, which is called internally, will perform a many-to-many merge without warning, and this normally duplicates both transects and detections.
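Because a non-unique key duplicates rows silently, it can be worth checking the transect data frame before calling RdistDf. The following is a minimal base R sketch, not part of Rdistance. The data frame transects, its values, and the keyCols vector are hypothetical and stand in for transectDf and the transect-side columns of by.

```r
# Hypothetical transect data frame; site "A2" was accidentally entered twice
transects <- data.frame(
  siteID = c("A1", "A2", "A2", "B1"),
  length = c(500, 500, 500, 250)
)

# Transect-side key columns that would be passed in `by` (assumed for this sketch)
keyCols <- "siteID"
keys <- transects[, keyCols, drop = FALSE]

# anyDuplicated() returns 0 when every key combination is unique,
# otherwise the row index of the first duplicated key
firstDup <- anyDuplicated(keys)

# Collect the offending key values so they can be fixed before merging
dupIDs <- unique(transects$siteID[duplicated(keys)])

if (firstDup > 0) {
  message("Duplicated transect keys: ", paste(dupIDs, collapse = ", "))
}
```

If firstDup is zero, each row of the transect data frame is a distinct transect and the merge should not duplicate transects or detections.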
A nested tibble (a generalization of base data frames) with one row per transect, and detection information in a list column. Technically, the return is a grouped tibble from the tibble package with one row per group, and a list column containing detection information. Survey type, observer system, and name of the effort column are recorded as attributes (transType, obsType, and effortColumn, respectively). The return prints nicely using methods in package tibble. If returned objects print strangely, attach library tibble. A summary method tailored to distance sampling is available (i.e., summary(return)).
RdistDf data frames contain the following information:
Transect Information: Each row of the data frame contains transect id and effort. Effort is transect length for line-transects, and number of points for point-transects. Optionally, transect-level covariates (such as habitat or observer id) appear on each row.
Detection Information: Observation distances (either perpendicular off-transect or radial off-point) appear in a data frame stored in a list column. If detected groups occasionally included more than one target, a group size column must be present in the list-column data frame. Optionally, detection-level covariates (such as sex or size) can appear in the data frame of the list column.
Distance Type: The type of observation distances, either perpendicular off-transect (for line-transect studies) or radial off-point (for point-transect studies), must appear as an attribute of RdistDf's.
Observer Type: The type of observation system used, either single observer or one of three types of multiple observer systems, must appear as an attribute of RdistDf's.
Line-transects are continuous paths with targets detectable at any point. Point transects consist of one or more discrete points along a path from which observers search for targets. The length of a line-transect is its physical length (e.g., km or miles). The 'length' of a point transect is the number of points along the transect. Single points are considered transects of length one. The length of line-transects must have a physical measurement unit (e.g., 'm' or 'ft'). The length of point-transects must be a unit-less integer (i.e., number of points).
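The two kinds of transect length described above can be sketched as follows. This is an illustration only: the data frames lineTransects and pointTransects, their column names, and their values are hypothetical, and the only package call used is units::set_units.

```r
# Hypothetical line-transect table: length carries a physical unit (meters)
lineTransects <- data.frame(siteID = c("L1", "L2", "L3"))
lineTransects$length <- units::set_units(c(500, 750, 1000), "m")

# Hypothetical point-transect table: 'length' is a unit-less integer count of points
pointTransects <- data.frame(
  siteID  = c("P1", "P2", "P3"),
  nPoints = c(4L, 6L, 1L)   # a single point is a transect of length one
)

# Total effort for each survey type
totalLineLength <- sum(lineTransects$length)   # keeps its units (meters)
totalPoints <- sum(pointTransects$nPoints)     # plain integer
```

In a call to RdistDf, the column holding these values would presumably be the one named in .effortCol.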
As of Rdistance version 3.0.0, measurement units are required on all physical distances. Requiring units ensures that internal calculations and results (e.g., ESW and abundance) are correct and that output units are clear. Measurement units are required on off-transect distances, radial distances, truncation distances (w.lo, unless it is zero; and w.hi, unless it is NULL), scale locations (x.scl, unless it is zero), line-transect lengths, and study area size. All units are 1-dimensional except those on study area, which are 2-dimensional.
Physical measurement units can vary. For example, off-transect distances can be meters ("m"), w.hi can be inches ("in"), and w.lo can be kilometers ("km"). Internally, all distances are converted to the units specified by outputUnits (or the units of input distances if outputUnits is NULL), and all output is reported in units of outputUnits. Valid conversions must exist between units or an error is thrown. For example, meters cannot be converted into hectares.
Measurement units can be assigned using units()<- after attaching the units package or with x <- units::set_units(x, "<units>"). See units::valid_udunits() for a list of valid symbolic units.
If measurements are truly unit-less, or measurement units are unknown, set options(Rdist_requireUnits = FALSE). This suppresses all unit checks and conversions. Users are on their own to make sure inputs are scaled correctly and that output units are known.
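The assignment and conversion rules above can be tried directly with the units package. The sketch below uses made-up values and hypothetical variable names (dist, distKm, area, areaHa, bad). It shows only how units::set_units behaves and is not a description of Rdistance's internal conversion code.

```r
# Off-transect distances recorded in meters (made-up values)
dist <- units::set_units(c(12.5, 40, 3.2), "m")

# Re-expressing in kilometers rescales the values
distKm <- units::set_units(dist, "km")

# Study area carries 2-dimensional units; square kilometers convert to hectares
area <- units::set_units(4105, "km^2")
areaHa <- units::set_units(area, "ha")

# No valid conversion exists from a length to an area, so this call errors
bad <- try(units::set_units(dist, "ha"), silent = TRUE)
inherits(bad, "try-error")
```

The last call illustrates the meters-to-hectares case mentioned above: the conversion fails instead of silently returning a rescaled number.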
is.RdistDf: check validity of RdistDf data frames; dfuncEstim: estimate distance function.
data(sparrowSiteData)
data(sparrowDetectionData)
sparrowDf <- RdistDf( sparrowSiteData, sparrowDetectionData )
is.RdistDf(sparrowDf, verbose = TRUE)
summary(sparrowDf)
summary(sparrowDf
  , formula = dist ~ groupsize(groupsize)
  , w.hi = units::set_units(100, "m"))
# Equivalent to above:
sparrowDf <- sparrowDetectionData |>
  dplyr::nest_by( siteID
                , .key = "detections") |>
  dplyr::right_join(sparrowSiteData, by = "siteID")
attr(sparrowDf, "detectionColumn") <- "detections"
attr(sparrowDf, "effortColumn") <- "length"
attr(sparrowDf, "obsType") <- "single"
attr(sparrowDf, "transType") <- "line"
is.RdistDf(sparrowDf, verbose = TRUE)
summary(sparrowDf, formula = dist ~ groupsize(groupsize))
# Condensed view: 1 row per transect (make sure tibble is attached)
sparrowDf
# Expansion methods:
# (1) Use Rdistance::unnest (includes zero transects)
df1 <- unnest(sparrowDf)
any( df1$siteID == "B2" ) # TRUE
# (2) Use tidyr::unnest(); but, no zero transects
df2 <- tidyr::unnest(sparrowDf, cols = "detections")
any( df2$siteID == "B2" ) # FALSE
# (3) Use dplyr::reframe for specific transects (e.g., for transect "B3")
sparrowDf |>
  dplyr::filter(siteID == "B3") |>
  dplyr::reframe(detections)
# Count detections per transect (can't use dplyr::if_else)
df3 <- sparrowDf |>
  dplyr::reframe(nDetections = ifelse(is.null(detections), 0, nrow(detections)))
sum(df3$nDetections) # Number of detections
sum(df3$nDetections == 0) # Number of zero transects
# Point transects
data(thrasherDetectionData)
data(thrasherSiteData)
thrasherDf <- RdistDf( thrasherSiteData
                     , thrasherDetectionData
                     , pointSurvey = TRUE
                     , by = "siteID"
                     , .detectionCol = "detections")
summary(thrasherDf, formula = dist ~ groupsize(groupsize))