predict.dfunc - Predict distance functions

predict.dfunc R Documentation

Description

Predict either likelihood parameters, distance functions, site-specific density, or site-specific abundance from estimated distance function objects.

Usage

## S3 method for class 'dfunc'
predict(
  object,
  newdata = NULL,
  type = c("parameters"),
  distances = NULL,
  propUnitSurveyed = 1,
  area = NULL,
  ...
)

Arguments

object

An Rdistance model frame or fitted distance function, normally produced by a call to dfuncEstim.

newdata

A data frame containing new values of the covariates at which to evaluate the distance functions. If newdata is NULL, distance functions are evaluated at the observed covariate values, which results in one prediction per distance or transect (see argument type). If newdata is not NULL and the model does not contain covariates, this routine returns one prediction for each row in newdata, but columns and values in newdata are ignored.

type

The type of predictions desired.

  • If type == "parameters": Returned value is a matrix of predicted (canonical) parameters of the likelihood function. If newdata is NULL, the returned matrix contains one row of parameter values for every detection distance in object$mf (distances in object$mf are non-missing and lie between object$w.lo and object$w.hi). If newdata is not NULL, the returned matrix has one row of parameter values for every row in newdata. Argument distances is ignored when type == "parameters". Canonical parameters (non-expansion terms) are returned on the response (inverse-link) scale, whereas the raw canonical parameters in object$par are stored on the link scale. Expansion term parameters use the identity link, so their values in the output equal their values in object$par.

  • If type == "likelihood": Returned value is a matrix of unscaled likelihood values for all observed distances in object$mf, i.e., raw distance functions evaluated at the observed distances. Parameters newdata and distances are ignored when type is "likelihood". The negative log likelihood of the full data set is -sum(log(predict(object,type="likelihood") / effectiveDistance(object))).

  • If type == "dfuncs" or "dfunc": Returned value is a matrix whose columns contain scaled distance functions. The distance functions in each column are evaluated at distances in argument distances, not at the observed distances in object$mf. The number of distance functions returned (i.e., number of columns) depends on newdata as follows:

    • If newdata is NULL, one distance function will be returned for every detection in object$mf that has valid covariate values.

    • If newdata is not NULL, one distance function will be returned for each observation (row) in newdata.

  • If type == "density" or "abundance": Returned object is a tibble containing predicted density and abundance on the area surveyed by each transect.

If object is a smoothed distance function, it does not have parameters, and this routine will only return scaled distance functions, densities, or abundances. That is, type = "parameters" does not make sense when object is smoothed; the smoothed distance function estimate is returned whenever type is not "density" or "abundance".
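
The negative log likelihood expression given above for type = "likelihood" can be sketched in base R without a fitted object. This is only an illustration under stated assumptions: the half-normal form of g, the value of sigma, the strip limits, and the distances d are all hypothetical, and integrate() stands in for effectiveDistance(object).

```r
# Hypothetical values; none of these come from a fitted Rdistance object.
sigma <- 40                      # assumed half-normal scale parameter
w.lo  <- 0                       # assumed left edge of the strip
w.hi  <- 100                     # assumed right edge of the strip
d     <- c(5, 12, 30, 47, 80)    # made-up detection distances

# Unscaled half-normal distance function (assumed form).
g <- function(x) exp(-x^2 / (2 * sigma^2))

# Analogue of predict(object, type = "likelihood"): raw g at observed distances.
unscaled <- g(d)

# Analogue of effectiveDistance(object): area under g over the strip.
esw <- integrate(g, lower = w.lo, upper = w.hi)$value

# Negative log likelihood of the made-up data set.
nll <- -sum(log(unscaled / esw))
```

Dividing by the integral of g turns the unscaled distance function into a proper density over the strip, which is why the effective distance appears in the denominator.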

distances

A vector or 1-column matrix of distances at which to evaluate distance functions, when distance functions are requested. distances must have measurement units. Any distances outside the observation strip (object$w.lo to object$w.hi) are discarded. If distances is NULL, a sequence of getOption("Rdistance_intEvalPts") (default 101) evenly spaced distances between object$w.lo and object$w.hi (inclusive) is used.
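
As a rough sketch of the behavior described above, the default evaluation grid and the discarding of out-of-strip distances could look like the following in base R. Plain numbers stand in for unit-bearing values, w.lo, w.hi, and userDistances are hypothetical, and treating the strip endpoints as inside the strip is an assumption.

```r
# Hypothetical strip limits; real values would carry measurement units.
w.lo <- 0
w.hi <- 100

# Documented default for getOption("Rdistance_intEvalPts").
nEvalPts <- 101

# Default grid: evenly spaced distances from w.lo to w.hi, inclusive.
defaultDistances <- seq(w.lo, w.hi, length.out = nEvalPts)

# User-supplied distances: values outside the strip are dropped
# (endpoints are assumed to count as inside the strip).
userDistances <- c(-5, 0, 25, 100, 130)
kept <- userDistances[userDistances >= w.lo & userDistances <= w.hi]
```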

propUnitSurveyed

A scalar or vector of real numbers between 0 and 1. The proportion of the default sampling unit that was surveyed. If both sides of line transects were observed, propUnitSurveyed = 1. If only a single side of line transects was observed, set propUnitSurveyed = 0.5. For point transects, this should be set to the proportion of each circle that was observed. Length must be either 1 or the total number of transects in object$data.

area

A scalar containing the total area of inference. Usually, this is study area size. If area is NULL (the default), area will be set to 1 square unit of the output units and density estimates will be produced. If area is not NULL, it must have measurement units assigned by the units package. The units on area must be convertible to squared output units. Units on area must be two-dimensional. For example, if output units are "foo", units on area must be convertible to "foo^2" by the units package. Units of "km^2", "cm^2", "ha", "m^2", "acre", "mi^2", and several others are acceptable.
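
The unit requirement on area can be checked with the units package before calling predict. The sketch below assumes output units of meters, so area must be convertible to "m^2"; the value of 250 hectares is made up for illustration.

```r
# Hypothetical study area of 250 hectares, with units assigned.
area <- units::set_units(250, "ha")

# Conversion to squared output units succeeds for a two-dimensional unit.
areaM2 <- units::set_units(area, "m^2")

# A one-dimensional unit is not convertible to "m^2" and raises an error.
bad <- try(units::set_units(units::set_units(1, "km"), "m^2"), silent = TRUE)
```

If the conversion in the second step fails for a candidate area, its units are not two-dimensional and it would not be a valid area argument.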

...

Included for compatibility with generic predict methods.

Value

A matrix containing predictions:

  • If type is "parameters", the returned matrix contains likelihood parameters. The extent of the first dimension (rows) in the returned matrix is equal to either the number of detection distances in the observed strip or the number of rows in newdata. The returned matrix's second dimension (columns) is the number of parameters in the likelihood plus the number of expansion terms. See the help for each likelihood to interpret returned parameter values. All parameters are returned on the inverse-link scale; i.e., exponential for canonical parameters and identity for expansion terms.

  • If type is "dfuncs" or "dfunc", columns of the returned matrix contain detection functions (i.e., g(x)). The extent of the first dimension (number of rows) is either the number of distances specified in distances or options()$Rdistance_intEvalPts if distances is not specified. The extent of the second dimension (number of columns) is:

    • the number of detections with non-missing distances if newdata is NULL.

    • the number of rows in newdata if newdata is specified.

    All distance functions in columns of the return are scaled to object$g.x.scale at object$x.scl. The returned matrix has the following additional attributes:

    • attr(return, "distances") is the vector of distances used to predict the function in return. This is either the input distances object or, when distances is NULL, the computed sequence of distances.

    • attr(return, "x0") is the vector of distances at which each distance function in return was scaled, i.e., the vector of x.scl.

    • attr(return, "g.x.scl") is the height of g(x) (the distance function) at x0.

  • If type is "density" or "abundance", the return is a tibble containing density and abundance estimates by transect. All transects in the input data (i.e., object$data) are included, even those with missing lengths. Columns in the tibble are:

    • transect ID: the grouping factor of the original RdistDf object.

    • individualsSeen: sum of non-missing group sizes on that transect.

    • avgPdetect: average probability of detection over groups sighted on that transect.

    • effort: size of the area surveyed by that transect.

    • density: density of individuals in the area surveyed by the transect.

    • abundance: abundance of individuals in the area surveyed by the transect.

See Also

halfnorm.like, negexp.like, hazrate.like

Examples


data("sparrowDf")

# For dimension checks:
nd <- getOption("Rdistance_intEvalPts")

# No covariates
dfuncObs <- sparrowDf |> dfuncEstim(formula = dist ~ 1
                     , w.hi = units::as_units(100, "m"))
                     
n  <- nrow(dfuncObs$mf)
p <- predict(dfuncObs) # parameters
all(dim(p) == c(n, 1)) 

# values in newdata ignored because no covariates
p <- predict(dfuncObs, newdata = data.frame(x = 1:5))
all(dim(p) == c(5, 1)) 

# Distance functions in columns, one per observation
p <- predict(dfuncObs, type = "dfunc") 
all(dim(p) == c(nd, n))

d <- units::set_units(c(0, 20, 40), "ft")
p <- predict(dfuncObs, distances = d, type = "dfunc") 
all(dim(p) == c(3, n))

p <- predict(dfuncObs
   , newdata = data.frame(x = 1:5)
   , distances = d
   , type = "dfunc") 
all(dim(p) == c(3, 5))

# Covariates
data(sparrowDfuncObserver) # pre-estimated object
## Not run: 
# Command to generate 'sparrowDfuncObserver'
sparrowDfuncObserver <- sparrowDf |> 
            dfuncEstim(formula = dist ~ observer
                     , likelihood = "hazrate")

## End(Not run)

predict(sparrowDfuncObserver)  # n X 2

Observers <- data.frame(observer = levels(sparrowDf$observer))
predict(sparrowDfuncObserver, newdata = Observers) # 5 X 2

predict(sparrowDfuncObserver, type = "dfunc") # nd X n
predict(sparrowDfuncObserver, newdata = Observers, type = "dfunc") # nd X 5
d <- units::set_units(c(0, 150, 400), "ft")
predict(sparrowDfuncObserver
  , newdata = Observers
  , distances = d
  , type = "dfunc") # 3 X 5

# Density and abundance by transect
predict(sparrowDfuncObserver
  , type = "density")
  

Rdistance documentation built on April 12, 2025, 1:12 a.m.