mcds.wrap: Fit the distance sampling model provided by the Distance...

Description Usage Arguments Details Value Author References Examples

View source: R/mcds.wrap.R

Description

This function fits the distance sampling model provided by the Distance program using its MCDS engine.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
mcds.wrap(dataset, path, pathMCDS, SMP_LABEL = "SMP_LABEL",
  SMP_EFFORT = "SMP_EFFORT", DISTANCE = "DISTANCE", SIZE = "SIZE",
  STR_LABEL = "STR_LABEL", STR_AREA = "STR_AREA", Type = c("Line",
  "Point", "Cue"), units = list(Distance = "Perp", Length_units =
  "Kilometer", Distance_units = "Meter", Area_units = "Square kilometer"),
  breaks = c(0, 50, 100, 200, 300), covariates = NULL, factor = NULL,
  lsub = NULL, stratum = NULL, split = TRUE, rare = NULL,
  period = NULL, detection = c("All", "Stratum"),
  monotone = c("Strict", "None", "Weak"), estimator = NULL,
  multiplier = NULL, empty = NULL, verbose = FALSE)

Arguments

dataset

A data.frame with observations.

path

The path where to store the input and output files of the MCDS engine.

pathMCDS

The path where the MCDS engine is located.

SMP_LABEL

Name of the column to use for the transect/watch label.

SMP_EFFORT

Length in of the transect or the transect/watch unit.

DISTANCE

Distance of the observation.

SIZE

Number of individuals in the observation.

STR_LABEL

Name of the column to use as a stratum label.

STR_AREA

Name of the column to use for the stratum area.

Type

Name of the type of transects ("Line", "Point", or "Cue"). Default value is "Line".

units

List of the units used for the analysis. Contains the Distance engine to use ("Perp" or "Radial"). depending on the type of transects, the Length units, the Distance units, and the Area_units. For the possible units of distance and area see Distance 7.0 documentation.

breaks

A vector giving the distance intervals in meters to be used in the analysis.

covariates

A vector giving the name of covariates.

factor

A vector giving the name of factors to be used in the analysis.

lsub

A named list giving the subsets to be used in the analysis. The names of the list are the names of columns used to subset the dataset. Each element of the list is a vector giving the values to keep in the analysis for a given column. When a vector is NULL, all values are kept for this column. Default value lsub = NULL. See examples for further details.

stratum

When stratum = "STR_LABEL", the model will be stratified with the label. When stratum is the name of a column in the data, the model will be post-stratified according to this column (see Thomas et al. 2010). Default value is NULL.

split

When split = TRUE, separate models are ran according to the subsets determined by lsub. Default value FALSE.

rare

This argument is used when a species has few observations to estimate a detection function. The probability of detection is estimated from a group of similar species and a multiplier. A named list has to be given, with the column name where species are stored as the name of the list and the name of the species as an element (ex: list(Species = "HERG")). Default value NULL. See details.

period

A vector of characters of length 2 containing the extreme dates for which the analysis should be restricted. Dates have to be in the "yyyy-mm-dd" format.

detection

Currently, set to detection = "All". Can also be detection = "Stratum". Default value is "All".

monotone

Currently, set to monotone = "Strict". Can also be monotone = "Strict", "None", or "Weak". Default value is "Strict".

estimator

When set to NULL, the following key functions and expansion terms will be used: UN-CO, UN-PO, HN-CO, HN-HE, HA-CO and HA-PO. If the user wants to choose the key functions and the expansion terms used, a list has to be given with each element a vector of length 2 with the first element the key function and the second element the expansion term (ex: list(c("HN","CO"),c("HA","PO")). When rare != NULL, only one set is used (UN-CO) for the final specific model.

multiplier

Value by which the estimates of density or abundance are multiplied. The first value is the multiplier, the second the SE and the third the degree of freedom associated with the multiplier (useful when using the probability of detection as a multiplier in a two-step analyses with the second step). Default value when Type = "Line" and multiplier = NULL is set to multiplier = c(2,0,0) meaning only one-half of the transect is surveyed and the value is known with certainty with an infinite degree of freedom. When rare != NULL, the multiplier will be modified to account for the probability of detection. When Type = "Point" and multiplier = NULL, the multiplier is set to multiplier = c(2,0,0). Values provided by the user will override these default settings.

empty

Determine how empty transects are to be selected in the analysis. When empty = NULL, all empty transects are included for every element of lsub. For example, when models are splitted according to species, empty transects and transects where a species was not detected need to be considered in the analysis for that species. When lsub contains geographic or temporal elements, empty transects need to be restricted to the subsets given. In this case, a vector of character has to be given with the names in lsub for which the empty transects are to be restricted. When split = TRUE and empty != NULL, empty transects will be splitted according to the names in empty. In any case, it is assumed that the dataset contains at least a line for every transect executed, either with or without an observation. See examples for further details.

verbose

When set to TRUE, prints the input file given to the MCDS engine. Default value FALSE.

Details

Uses the MCDS engine from program Distance. The function produces an input file and submits it to the MCDS engine through the system function. The results are then extracted from the output files and returned as a list object.

Value

An object of class "distanceFit", when split = FALSE. When split = TRUE and lsub != NULL, a named list with the different models of class "distanceFit". Each object of class "distanceFit" is a named list with components model_fitting, parameter_estimates, chi_square_test, density_estimate, detection and path. Elements of the list are accessible through the $ operator. For each component except detection and path, a list of length 2 is given with component "Global" and "Stratum". Depending on the analysis chosen, one of these component will be empty (NULL).

Author

Francois Rousseu, Christian Roy, Francois Bolduc

References

Thomas, L., S.T. Buckland, E.A. Rexstad, J. L. Laake, S. Strindberg, S. L. Hedley, J. R.B. Bishop, T. A. Marques, and K. P. Burnham. 2010. Distance software: design and analysis of distance sampling surveys for estimating population size. Journal of Applied Ecology 47: 5-14. DOI: 10.1111/j.1365-2664.2009.01737.x

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
####################################################################
### Simple models without stratification based on line transect data
### Import and filter data
data(alcidae)
alcids <- mcds.filter(alcidae,
                     transect.id = "WatchID",
                     distance.field = "Distance",
                     distance.labels = c("A", "B", "C", "D"),
                     distance.midpoints = c(25, 75, 150, 250),
                     effort.field = "WatchLenKm", lat.field = "LatStart",
                     long.field = "LongStart",
                     sp.field = "Alpha",
                     date.field = "Date") 

### Run analysis with the MCDS engine. Here, the WatchID is used as the sample.
dist.out1 <- mcds.wrap(alcids,
                      SMP_EFFORT="WatchLenKm",
                      DISTANCE="Distance",
                      SIZE="Count",
                      Type="Line",
                      units=list(Distance="Perp",
                                 Length_units="Kilometers",
                                 Distance_units="Meters",
                                 Area_units="Square kilometers"),
                      breaks=c(0,50,100,200,300),
                      estimator=list(c("HN","CO")),
                      STR_LABEL="STR_LABEL",
                      STR_AREA="STR_AREA",
                      SMP_LABEL="WatchID",
                      path="c:/temp/distance",
                      pathMCDS="C:/Distance 7.2",
                      verbose=FALSE)

summary(dist.out1)

### Run separate analysis for years 2008-2009
alcids$Year <- substr(alcids$Date, start = 1, stop = 4)
alcids$Year <- as.numeric(alcids$Year)

dist.out2 <- mcds.wrap(alcids,
                      SMP_EFFORT="WatchLenKm",
                      DISTANCE="Distance",
                      SIZE="Count",
                      Type="Line",
                      units=list(Distance="Perp",
                                 Length_units="Kilometers",
                                 Distance_units="Meters",
                                 Area_units="Square kilometers"),
                      breaks=c(0,50,100,200,300),
                      estimator=list(c("HN","CO")),
                      lsub=list(Year=c(2007,2008)),
                      split=TRUE,
                      empty="Year",
                      STR_AREA="STR_AREA",
                      SMP_LABEL="WatchID",
                      path="c:/temp/distance",
                      pathMCDS="C:/Distance 7.2",
                      verbose=FALSE)

### Get the names of the different models produced
names(dist.out2)

#####summary for the Year 2008 model
summary(dist.out2[["2008"]])

#'####################################################################
### Simple models without stratification based on point transect data
library(AHMbook)
#########################
# Data simulation  #
########################
ll <- list()
j <- 1:100
al <- sample(300:1000, max(j))
for (i in j){
 simu.data <- sim.pdata(N = al[i],
                        sigma = 1,
                        B = 3,
                        keep.all = TRUE,
                        show.plot = TRUE)
 print(simu.data$N.real)
 tmp <- sim.pdata(N = al[i],
                  sigma = 1,
                  keep.all = FALSE,
                  show.plot = FALSE)
 ll[[i]] <- tmp
}
delta <- 0.5 # Width of distance bins
B <- 3 # Max count distance
dist.breaks <- seq(0, B, delta) # Make the interval cut points
dclass <- lapply(ll, function(i){
 i$d %/% delta + 1
})
nD <- length(dist.breaks) - 1 # How many intervals do you have
DF <- data.frame()
for(i in 1:length(ll)){
 transect.id <- rep(paste("sp1", i, sep = ""), length(dclass[[i]]))
 d.field <- dclass[[i]]
 effort.field <- rep(1, length(dclass[[i]])) 
 lat.field <- rep(47.0, length(dclass[[i]]))
 long.field <- rep((-45.24 + i), length(dclass[[i]]))
 sp.field <- rep("Piou", length(dclass[[i]]))
 date.field <- rep("2019-03-15", length(dclass[[i]])) 
 Count <- rep(1, length(dclass[[i]]))
 real.abun <- rep(ll[[i]]$N.real, length(dclass[[i]]))
 dt <- data.frame(transect.id ,
                  d.field,
                  effort.field,
                  lat.field,
                  long.field,
                  sp.field,
                  date.field,
                  Count,
                  real.abun)
 DF <- rbind(DF, dt)
}
# Convert numerical distance value to categorical with letters
DF$distance.field <- LETTERS[DF$d.field]
DF$distance.field <- as.factor(DF$distance.field)
summary(DF)
###################
###### Model ######
###################
library(R2MCDS)
DF.1 <- mcds.filter(DF,
                   transect.id = "transect.id",
                   distance.field = "distance.field",
                   distance.labels <- c("A", "B", "C", "D", "E", "F"),
                   distance.midpoints <- c(0.25, 0.75, 1.25, 1.75, 2.25, 2.75),
                   effort.field = "effort.field",
                   lat.field = "lat.field",
                   long.field = "long.field",
                   sp.field = "sp.field",
                   date.field = "date.field")
### Run analysis with the MCDS engine.
mod1 <- mcds.wrap(DF1,
                 SMP_EFFORT="WatchLenKm",
                 DISTANCE="Distance",
                 SIZE="Count",
                 Type="Point",
                 units=list(Distance="Radial",
                            Length_units="Meters",
                            Distance_units="Meters",
                            Area_units="Square meters"),
                  breaks=c(0, 0.5, 1, 1.50, 2, 2.50, 3),
                  SMP_LABEL="WatchID",
                  STR_LABEL="STR_LABEL",
                  STR_AREA="STR_AREA",
                  estimator=list(c("HN","CO")),
                  multiplier = c(1, 0, 0),
                  path="c:/temp/distance",
                  pathMCDS="C:/Distance 7.2",
                  verbose=FALSE)
mod1
summary(mod1)
plot.distanceFit(mod1)
#END

RoyChristian/R2MCDS documentation built on Jan. 13, 2020, 8:17 p.m.