estMC: Estimate migratory connectivity
In SMBC-NZP/MigConnectivity: Estimate Migratory Connectivity for Migratory Animals

estMC

R Documentation

Estimate migratory connectivity

Description

Resampling of uncertainty for migratory connectivity strength (MC) and transition probabilities (psi) from RMark psi matrix estimates or samples of psi and/or JAGS relative abundance MCMC samples OR SpatialPoints geolocators and/or GPS data OR intrinsic markers such as isotopes. NOTE: active development of this function is ending. We suggest users estimate psi with estTransition, MC with estStrength, and Mantel correlations (rM) with estMantel.

Usage

estMC(
  originDist,
  targetDist = NULL,
  originRelAbund,
  psi = NULL,
  sampleSize = NULL,
  originSites = NULL,
  targetSites = NULL,
  originPoints = NULL,
  targetPoints = NULL,
  originAssignment = NULL,
  targetAssignment = NULL,
  originNames = NULL,
  targetNames = NULL,
  nSamples = 1000,
  nSim = ifelse(isTRUE(isIntrinsic), 10, 1000),
  isGL = FALSE,
  geoBias = NULL,
  geoVCov = NULL,
  row0 = 0,
  verbose = 0,
  calcCorr = FALSE,
  alpha = 0.05,
  approxSigTest = FALSE,
  sigConst = 0,
  resampleProjection = "ESRI:102010",
  maxTries = 300,
  targetIntrinsic = NULL,
  isIntrinsic = FALSE,
  maintainLegacyOutput = FALSE
)

Arguments

`originDist`	Distances between the B origin sites. Symmetric B by B matrix
`targetDist`	Distances between the W target sites. Symmetric W by W matrix. Optional for intrinsic data
`originRelAbund`	Relative abundance estimates at B origin sites. Either a numeric vector of length B that sums to 1 or an mcmc object with `nSamples` rows and columns including 'relN[1]' through 'relN[B]'. Currently, an mcmc object doesn't work with geolocator, GPS, or intrinsic data
`psi`	Transition probabilities between B origin and W target sites. Either a matrix with B rows and W columns where rows sum to 1, an array with dimensions x, B, and W (with x samples of the transition probability matrix from another model), or a MARK object with estimates of transition probabilities. If you are estimating MC from GPS, geolocator, or intrinsic data, leave this as NULL
`sampleSize`	Total sample size of animals that psi will be estimated from. Should be the number of animals released in one of the origin sites and observed in one of the target sites. Optional, but recommended, unless you are estimating MC from GPS, geolocator, intrinsic, or direct band return data (in which case the function can calculate it for you)
`originSites`	If `psi` is a MARK object, this must be a numeric vector indicating which sites are origin. If using GPS, geolocator, or intrinsic data, this can be the geographic definition of sites in the release season
`targetSites`	If `psi` is a MARK object, this must be a numeric vector indicating which sites are target. If using GPS, geolocator, or intrinsic data, this must be the geographic definition of sites in the non-release season. Optional for intrinsic data; if left out, the function will use the `targetSites` defined in `targetIntrinsic`
`originPoints`	A `POINT` sf object, with length number of animals tracked. Each point indicates the release location of an animal
`targetPoints`	For GL or GPS data, a `POINT` sf object, with length number of animals tracked. Each point indicates the point estimate location in the non-release season
`originAssignment`	Assignment of `originPoints` to release season sites. Integer vector with length number of animals tracked. Optional, but if using GL or GPS data, either `originAssignment` or `originSites` and `originPoints` should be defined
`targetAssignment`	Optional. Point estimate assignment of `targetPoints` to non-release season sites. Integer vector with length number of animals tracked
`originNames`	Optional but recommended. Vector of names for the release season sites
`targetNames`	Optional but recommended. Vector of names for the non-release season sites
`nSamples`	Number of times to resample `psi` and/or `originRelAbund` OR number of post-burn-in MCMC samples to store (band data) OR number of times to resample `targetPoints` for intrinsic data OR number of bootstrap runs for GL or GPS data. In the last two cases, animals are sampled with replacement for each. For all, the purpose is to estimate sampling uncertainty
`nSim`	Tuning parameter for GL or intrinsic data. Affects only the speed; 1000 seems to work well with our GL data and 10 for our intrinsic data, but your results may vary. Should be integer > 0
`isGL`	Indicates whether or which animals were tracked with geolocators. Should be either single TRUE or FALSE value, or vector with length of number of animals tracked, with TRUE for animals in `targetPoints` with geolocators and FALSE for animals with GPS
`geoBias`	For GL data, vector of length 2 indicating expected bias in longitude and latitude of `targetPoints`, in `resampleProjection` units (default meters)
`geoVCov`	For GL data, 2x2 matrix with expected variance/covariance in longitude and latitude of `targetPoints`, in `resampleProjection` units (default meters)
`row0`	If `originRelAbund` is an mcmc object, this can be set to 0 (default) or any greater integer to specify where to stop ignoring samples ("burn-in")
`verbose`	0 (default) to 3. 0 prints no output during run. 1 prints a line every 100 samples or bootstraps and a summary every 10. 2 prints a line and summary every sample or bootstrap. 3 also prints the number of draws (for tuning nSim for GL/intrinsic data only)
`calcCorr`	In addition to MC, should function also estimate Mantel correlation between release and non-release locations (GPS or GL data only)? Default is FALSE
`alpha`	Level for confidence/credible intervals provided
`approxSigTest`	Should function compute approximate one-sided significance tests (p-values) for MC from the bootstrap? Default is FALSE
`sigConst`	Value to compare MC to in significance test. Default is 0
`resampleProjection`	Projection when sampling from geolocator bias/error. This projection needs units = m. Default is Equidistant Conic. The default setting preserves distances around latitude = 0 and longitude = 0. Other projections may work well, depending on the location of `targetSites`. Ignored unless data are geolocator or GPS
`maxTries`	Maximum number of times to run a single GL/intrinsic bootstrap before exiting with an error. Default is 300. Set to NULL to never stop. This parameter was added to prevent GL setups where some sample points never land on target sites from running indefinitely
`targetIntrinsic`	For intrinsic tracking data, the results of `isoAssign` or a similar function, of class `intrinsicAssign`
`isIntrinsic`	Logical indicating whether the animals are tracked via intrinsic marker (e.g. isotopes) or not. Currently estMC will only estimate connectivity for all intrinsically marked animals or all extrinsic (e.g., bands, GL, or GPS), so isIntrinsic should be a single TRUE or FALSE
`maintainLegacyOutput`	version 0.4.0 of `MigConnectivity` updated the structure of the estimates. If you have legacy code that refers to elements within a `estMigConnectivity` object, you can set this to TRUE to also keep the old structure. Defaults to FALSE

Value

NOTE: Starting with version 0.4.0 of MigConnectivity, we've updated the structure of MigConnectivityEstimate objects. Below we describe the updated structure. If parameter maintainLegacyOutput is set to TRUE, the list will start with the old structure: sampleMC, samplePsi, pointPsi, pointMC, meanMC, medianMC, seMC, simpleCI, bcCI, hpdCI, simpleP, bcP, sampleCorr, pointCorr, meanCorr, medianCorr, seCorr, simpleCICorr, bcCICorr, inputSampleSize, alpha, and sigConst.

estMC returns a list with the elements:

psi

List containing estimates of transition probabilities:

sample Array of sampled values for psi. nSamples x [number of origin sites] x [number of target sites]. Provided to allow the user to compute own summary statistics.
mean Main estimate of psi matrix. [number of origin sites] x [number of target sites].
se Standard error of psi, estimated from SD of psi$sample.
simpleCI 1 - alpha confidence interval for psi, estimated as alpha/2 and 1 - alpha/2 quantiles of psi$sample.
bcCI Bias-corrected 1 - alpha confidence interval for psi. Preferable to simpleCI when mean is the best estimate of psi. simpleCI is preferred when median is a better estimator. When meanMC==medianMC, these should be identical. Estimated as the pnorm(2 * z0 + qnorm(alpha / 2)) and pnorm(2 * z0 + qnorm(1 - alpha / 2)) quantiles of sample, where z0 is the proportion of sample < mean.
median Median estimate of psi matrix.
point Simple point estimate of psi matrix, not accounting for sampling error. NULL when isIntrinsic == TRUE.

MC

List containing estimates of migratory connectivity strength:

sample nSamples sampled values for MC. Provided to allow the user to compute own summary statistics.
mean Mean of MC$sample. Main estimate of MC, incorporating parametric uncertainty.
se Standard error of MC, estimated from SD of MC$sample.
simpleCI Default1 - alpha confidence interval for MC, estimated as alpha/2 and 1 - alpha/2 quantiles of MC$sample.
bcCI Bias-corrected 1 - alpha confidence interval for MC. Preferable to MC$simpleCI when MC$mean is the best estimate of MC. MC$simpleCI is preferred when MC$median is a better estimator. When MC$mean==MC$median, these should be identical. Estimated as the pnorm(2 * z0 + qnorm(alpha / 2)) and pnorm(2 * z0 + qnorm(1 - alpha / 2)) quantiles of MC$sample, where z0 is the proportion of MC$sample < MC$mean.
hpdCI 1 - alpha credible interval for MC, estimated using the highest posterior density (HPD) method.
median Median of MC, alternate estimator also including parametric uncertainty.
point Simple point estimate of MC, using the point estimates of psi and originRelAbund, not accounting for sampling error. NULL when isIntrinsic == TRUE.
simpleP Approximate p-value for MC, estimated as the proportion of bootstrap iterations where MC < sigConst (or MC > sigConst if pointMC < sigConst). Note that if the proportion is 0, a default value of 0.5 / nSamples is provided, but this is best interpreted as p < 1 / nSamples. NULL when approxSigTest==FALSE.
bcP Approximate bias-corrected p-value for MC, estimated as pnorm(qnorm(simpleP) - 2 * z0), where z0 is the proportion of sampleMC < meanMC. May be a better approximation of the p-value than simpleP, but many of the same limitations apply. NULL when approxSigTest==FALSE.

corr

List containing estimates of rM, an alternate measure of migratory connectivity strength. NULL when calcCorr==FALSE or !is.null(psi):

sample nBoot sampled values for continuous correlation. Provided to allow the user to compute own summary statistics.
mean, se, simpleCI, bcCI, median, point Summary statistics for continuous correlation bootstraps.

input

List containing the inputs to estMC, or at least the relevant ones, such as sampleSize.

References

Cohen, E. B., J. A. Hostetler, M. T. Hallworth, C. S. Rushing, T. S. Sillett, and P. P. Marra. 2018. Quantifying the strength of migratory connectivity. Methods in Ecology and Evolution 9: 513 - 524. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1111/2041-210X.12916")}

Cohen, E. B., C. S. Rushing, F. R. Moore, M. T. Hallworth, J. A. Hostetler, M. Gutierrez Ramirez, and P. P. Marra. 2019. The strength of migratory connectivity for birds en route to breeding through the Gulf of Mexico. Ecography 42: 658–669. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1111/ecog.03974")}

Examples


  set.seed(101)
  # Uncertainty in detection ('RMark' estimates) with equal abundances
  # Number of resampling iterations for generating confidence intervals

  nSamplesCMR <- 100
  nSimulationsCMR <- 10
  originPos13 <- matrix(c(rep(seq(-99, -81, 2), each = 10),
                          rep(seq(49, 31, -2), 10)), 100, 2)
  targetPos13 <- matrix(c(rep(seq(-79, -61, 2), each = 10),
                          rep(seq(9, -9, -2), 10)), 100, 2)
  originPosCMR <- rowsum(originPos13, c(rep(1:2, 5, each = 5),
                                        rep(3:4, 5, each = 5))) / 25
  originPosCMR
  targetPosCMR <- rowsum(targetPos13, c(rep(1:2, 5, each = 5),
                                        rep(3:4, 5, each = 5))) / 25
  targetPosCMR

  originDist <- distFromPos(originPosCMR, 'ellipsoid')
  targetDist <- distFromPos(targetPosCMR, 'ellipsoid')
  originRelAbundTrue <- rep(0.25, 4)
  # the second intermediate psi scenario, the "low" level
  psiTrue <- samplePsis[["Low"]]
  trueMC <- calcMC(originDist, targetDist, originRelAbundTrue, psiTrue)
  trueMC

  # Storage matrix for samples
  cmrMCSample <- matrix(NA, nSamplesCMR, nSimulationsCMR)
  summaryCMR <- data.frame(Simulation = 1:nSimulationsCMR, True=trueMC,
                           mean=NA, se=NA, lcl=NA, ucl=NA)
  # Get RMark psi estimates and estimate MC from each
  for (r in 1:nSimulationsCMR) {
    cat("Simulation",r,"of",nSimulationsCMR,"\n")
    # Note: getCMRexample() requires a valid internet connection and that GitHub
    # is accessible
    fm <- getCMRexample(r)
    results <- estMC(originRelAbund = originRelAbundTrue, psi = fm,
                     originDist = originDist, targetDist = targetDist,
                     originSites = 5:8, targetSites = c(3,2,1,4),
                     nSamples = nSamplesCMR, verbose = 0,
                     sampleSize = length(grep('[2-5]', fm$data$data$ch)))
    #sampleSize argument not really needed (big sample sizes)
    cmrMCSample[ , r] <- results$MC$sample
    summaryCMR$mean[r] <- results$MC$mean
    summaryCMR$se[r] <- results$MC$se
    # Calculate confidence intervals using quantiles of sampled MC
    summaryCMR[r, c('lcl', 'ucl')] <- results$MC$simpleCI
  }

  summaryCMR <- transform(summaryCMR, coverage = (True>=lcl & True<=ucl))
  summaryCMR
  summary(summaryCMR)
  biasCMR <- mean(summaryCMR$mean) - trueMC
  biasCMR
  mseCMR <- mean((summaryCMR$mean - trueMC)^2)
  mseCMR
  rmseCMR <- sqrt(mseCMR)
  rmseCMR


  # Simulation of BBS data to quantify uncertainty in relative abundance

  nSamplesAbund <- 700 #1700 are stored
  nSimulationsAbund <- 10
  # Storage matrix for samples
  abundMCSample <- matrix(NA, nSamplesAbund, nSimulationsAbund)
  summaryAbund <- data.frame(Simulation = 1:nSimulationsAbund, True = trueMC,
                             mean = NA, se = NA, lcl = NA, ucl = NA)
  for (r in 1:nSimulationsAbund) {
    cat("Simulation",r,"of",nSimulationsAbund,"\n")
    row0 <- nrow(abundExamples[[r]]) - nSamplesAbund
    results <- estMC(originRelAbund = abundExamples[[r]], psi = psiTrue,
                     originDist = originDist, targetDist = targetDist,
                     row0 = row0, nSamples = nSamplesAbund, verbose = 1)
    abundMCSample[ , r] <- results$MC$sample
    summaryAbund$mean[r] <- results$MC$mean
    summaryAbund$se[r] <- results$MC$se
    # Calculate confidence intervals using quantiles of sampled MC
    summaryAbund[r, c('lcl', 'ucl')] <- results$MC$simpleCI
  }

  summaryAbund <- transform(summaryAbund,
                            coverage = (True >= lcl & True <= ucl))
  summaryAbund
  summary(summaryAbund)
  biasAbund <- mean(summaryAbund$mean) - trueMC
  biasAbund
  mseAbund <- mean((summaryAbund$mean - trueMC)^2)
  mseAbund
  rmseAbund <- sqrt(mseAbund)
  rmseAbund

  # Ovenbird example with GL and GPS data
  data(OVENdata) # Ovenbird

  nSamplesGLGPS <- 100 # Number of bootstrap iterations

  # Estimate MC only, treat all data as geolocator
  GL_mc<-estMC(isGL=TRUE, # Logical vector: light-level geolocator(T)/GPS(F)
               geoBias = OVENdata$geo.bias, #Geolocator location bias
               geoVCov = OVENdata$geo.vcov, # Location covariance matrix
               targetDist = OVENdata$targetDist, # targetSites distance matrix
               originDist = OVENdata$originDist, # originSites distance matrix
               targetSites = OVENdata$targetSites, # Non-breeding target sites
               originSites = OVENdata$originSites, # Breeding origin sites
               originPoints = OVENdata$originPoints, # Capture Locations
               targetPoints = OVENdata$targetPoints, # Device target locations
               originRelAbund = OVENdata$originRelAbund,#Origin relative abund.
               verbose = 1,   # output options
               nSamples = nSamplesGLGPS,# This is set low for example
               resampleProjection = terra::crs(OVENdata$targetSites))

  # Estimate MC and rM, treat all data as is
  Combined<-estMC(isGL=OVENdata$isGL, #Logical vector:light-level GL(T)/GPS(F)
                  geoBias = OVENdata$geo.bias, # Light-level GL location bias
                  geoVCov = OVENdata$geo.vcov, # Location covariance matrix
                  targetDist = OVENdata$targetDist, # Winter distance matrix
                  originDist = OVENdata$originDist, # Breeding distance matrix
                  targetSites = OVENdata$targetSites, # Nonbreeding/target sites
                  originSites = OVENdata$originSites, # Breeding origin sites
                  originPoints = OVENdata$originPoints, # Capture Locations
                  targetPoints = OVENdata$targetPoints, #Device target locations
                  originRelAbund = OVENdata$originRelAbund,#Relative abundance
                  verbose = 1,   # output options
                  calcCorr = TRUE, # estimate rM as well
                  nSamples = nSamplesGLGPS, # This is set low for example
                  approxSigTest = TRUE,
                  resampleProjection = terra::crs(OVENdata$targetSites),
                  originNames = OVENdata$originNames,
                  targetNames = OVENdata$targetNames)

  print(Combined)

  # For treating all data as GPS,
  # Move the latitude of birds with locations that fall offshore - only change
  # Latitude
  int <- sf::st_intersects(OVENdata$targetPoints, OVENdata$targetSites)
  any(lengths(int)<1)
  plot(OVENdata$targetPoints)
  plot(OVENdata$targetSites,add=TRUE)
  tp<-sf::st_coordinates(OVENdata$targetPoints)
  text(tp[,1], tp[,2], label=c(1:39))

  tp[5,2]<- -1899469
  tp[10,2]<- -1927848
  tp[1,2]<- -1927930
  tp[11,2]<- -2026511
  tp[15,2]<- -2021268
  tp[16,2]<- -1976063

  oven_targetPoints<-sf::st_as_sf(as.data.frame(tp),
                                  coords = c("X","Y"),
                                  crs = sf::st_crs(OVENdata$targetPoints))
  inter <- sf::st_intersects(oven_targetPoints, OVENdata$targetSites)
  any(lengths(inter)<1)
  plot(oven_targetPoints,add=TRUE, col = "green")

  # Estimate MC only, treat all data as GPS
  GPS_mc<-estMC(isGL=FALSE, # Logical vector: light-level geolocator(T)/GPS(F)
                targetDist = OVENdata$targetDist, # targetSites distance matrix
                originDist = OVENdata$originDist, # originSites distance matrix
                targetSites = OVENdata$targetSites, # Non-breeding target sites
                originSites = OVENdata$originSites, # Breeding origin sites
                originPoints = OVENdata$originPoints, # Capture Locations
                targetPoints = oven_targetPoints, # Device target locations
                originRelAbund = OVENdata$originRelAbund,#Origin relative abund.
                verbose = 1,   # output options
                nSamples = nSamplesGLGPS) # This is set low for example

  str(GPS_mc, max.level = 2)
  str(Combined, max.level = 2)
  str(GL_mc, max.level = 2)
  plot(Combined, legend = "top", main = "Ovenbird GL and GPS")
  text(1.1, 0.98, cex = 1,
       labels = paste("MC = ", round(Combined$MC$mean, 2), "+/-",
                      round(Combined$MC$se, 2)))


  # Generate probabilistic assignments using intrinsic markers (stable-hydrogen
  # isotopes)
  getCSV <- function(filename) {
    tmp <- tempdir()
    url1 <- paste0('https://github.com/SMBC-NZP/MigConnectivity/blob/master/data-raw/',
                   filename, '?raw=true')
    temp <- paste(tmp, filename, sep = '/')
    utils::download.file(url1, temp, mode = 'wb')
    csv <- read.csv(temp)
    unlink(temp)
    return(csv)
  }

  getRDS <- function(speciesDist) {
    tmp <- tempdir()
    extension <- '.rds'
    filename <- paste0(speciesDist, extension)
    url1 <- paste0(
      'https://github.com/SMBC-NZP/MigConnectivity/blob/master/data-raw/Spatial_Layers/',
                   filename, '?raw=true')
    temp <- paste(tmp, filename, sep = '/')
    utils::download.file(url1, temp, mode = 'wb')
    shp <- readRDS(temp)
    unlink(temp)
    return(shp)
  }
  OVENdist <- getRDS("OVENdist")

  OVENvals <- getCSV("deltaDvalues.csv")

  OVENvals <- OVENvals[grep(x=OVENvals$Sample,"NH", invert = TRUE),]

  originSites <- getRDS("originSites")
  originSites <- sf::st_as_sf(originSites)

  EVER <- length(grep(x=OVENvals$Sample,"EVER"))
  JAM <- length(grep(x=OVENvals$Sample,"JAM"))

  originRelAbund <- matrix(c(EVER,JAM),nrow = 1,byrow = TRUE)
  originRelAbund <- prop.table(originRelAbund,1)

  op <- sf::st_centroid(originSites)

  originPoints <- array(NA,c(EVER+JAM,2), list(NULL, c("x","y")))
  originPoints[grep(x = OVENvals$Sample,"JAM"),1] <- sf::st_coordinates(op)[1,1]
  originPoints[grep(x = OVENvals$Sample,"JAM"),2] <- sf::st_coordinates(op)[1,2]
  originPoints[grep(x = OVENvals$Sample,"EVER"),1] <-sf::st_coordinates(op)[2,1]
  originPoints[grep(x = OVENvals$Sample,"EVER"),2] <-sf::st_coordinates(op)[2,2]

  originPoints <- sf::st_as_sf(data.frame(originPoints),
                               coords = c("x", "y"),
                               crs = sf::st_crs(originSites))
  originDist <- distFromPos(sf::st_coordinates(op))


  iso <- isoAssign(isovalues = OVENvals[,2],
                   isoSTD = 12,       # this value is for demonstration only
                   intercept = -10,   # this value is for demonstration only
                   slope = 0.8,       # this value is for demonstration only
                   odds = NULL,
                   restrict2Likely = TRUE,
                   nSamples = 1000,
                   sppShapefile = OVENdist,
                   assignExtent = c(-179,-60,15,89),
                   element = "Hydrogen",
                   period = "GrowingSeason",#this setting for demonstration only
                   seed = 12345,
                   verbose=1)

  targetSites <- sf::st_as_sf(iso$targetSites)
  targetSites <- sf::st_make_valid(targetSites)
  targetSites <- sf::st_union(targetSites, by_feature = TRUE)

  ovenMC <- estMC(originRelAbund = originRelAbund,
                  targetIntrinsic = iso,
                  originPoints = originPoints,
                  originSites = originSites,
                  originDist = originDist,
                  nSamples = 50, # set very low for example speed
                  verbose = 1,
                  calcCorr = TRUE,
                  alpha = 0.05,
                  approxSigTest = FALSE,
                  isIntrinsic = TRUE,
                  targetSites = targetSites)

  ovenMC

SMBC-NZP/MigConnectivity documentation built on March 26, 2024, 4:22 p.m.