make_extrapolation_info: Build extrapolation grid
In James-Thorson/VAST: Vector-Autoregressive Spatio-Temporal (VAST) Model

View source: R/make_extrapolation_info.R

Description

make_extrapolation_info builds an object used to determine the areas to which densities are extrapolated when calculating indices

Usage

make_extrapolation_info(
  Region,
  projargs = NA,
  zone = NA,
  strata.limits = data.frame(STRATA = "All_areas"),
  create_strata_per_region = FALSE,
  max_cells = NULL,
  input_grid = NULL,
  observations_LL = NULL,
  grid_dim_km = c(2, 2),
  maximum_distance_from_sample = NULL,
  grid_in_UTM = TRUE,
  grid_dim_LL = c(0.1, 0.1),
  region = c("south_coast", "west_coast"),
  strata_to_use = c("SOG", "WCVI", "QCS", "HS", "WCHG"),
  epu_to_use = c("All", "Georges_Bank", "Mid_Atlantic_Bight", "Scotian_Shelf",
    "Gulf_of_Maine", "Other")[1],
  survey = "Chatham_rise",
  surveyname = "propInWCGBTS",
  flip_around_dateline,
  nstart = 100,
  area_tolerance = 0.05,
  backwards_compatible_kmeans = FALSE,
  DirPath = getwd(),
  ...
)

Arguments

`Region`	a character vector, where each element is matched against potential values to create the extrapolation grid; densities are then predicted at the midpoint of each grid cell when calculating derived quantities or visualizing model outputs. Users will typically supply a single character string, representing the footprint of a single survey. However, it is also possible to provide a character vector, in which case an extrapolation-grid is created for each string and the grids are then combined; this is helpful when extrapolating densities across multiple survey domains. Current options are:
`"user"` User-defined extrapolation-grid; also requires input `input_grid`. An example of building one from points or a shapefile can be found at https://github.com/James-Thorson-NOAA/VAST/wiki/Creating-an-extrapolation-grid
the path and name for a shapefile, e.g., `paste0(shapedir,"Shape.shp")` Create an extrapolation-grid at runtime by creating a grid within a user-supplied shapefile, using `grid_dim_km` to determine grid resolution
`"california_current"` The spatial footprint of the bottom trawl surveys operated by AFSC/NWFSC, including the AFSC triennial from 1977-2004 and the NWFSC combined shelf-slope survey from 2003 onward (as identified by B. Feist and C. Whitmire); specify subsets via `surveyname`
`"west_coast_hook_and_line"` The spatial footprint of the fixed-station hook-and-line survey in the California Bight operated by NWFSC (as identified by J. Harms)
`"british_columbia"` The spatial footprint of the various stratified-random bottom trawl surveys operated by PBS (as identified by N. Olsen); see `strata_to_use` for further specification
`"eastern_bering_sea"` The spatial footprint of the fixed station bottom trawl survey operated by AFSC in the eastern Bering Sea (as identified by J. Conner)
`"northern_bering_sea"` The spatial footprint of the systematic bottom trawl survey operated by AFSC in the northern Bering Sea (as identified by J. Conner)
`"bering_sea_slope"` The spatial footprint of the stratified random bottom trawl survey operated by AFSC in the Bering Sea slope (as identified by A. Greig)
`"chukchi_sea"` The spatial footprint of the systematic bottom trawl survey operated by AFSC in the Chukchi Sea (as identified by J. Conner)
`"st_matthews_island"` The spatial footprint of the survey area defined around St. Matthews Island, representing regular and corner stations from the eastern Bering Sea bottom trawl survey
`"aleutian_islands"` The spatial footprint of the stratified random bottom trawl survey operated by AFSC in the Aleutian Islands (as identified by A. Greig)
`"gulf_of_alaska"` The spatial footprint of the stratified random bottom trawl survey operated by AFSC in the Gulf of Alaska and containing shallow and deep stations, where the latter are not consistently sampled in later years (as identified by A. Greig)
`"BFISH_MHI"` The spatial footprint of the visual sampling of reef fishes in the main Hawaiian Islands (as provided by B. Richards)
`"CalCOFI-IMECOCAL_Winter-Spring"` The spatial footprint of the fixed station ichthyoplankton sampling design operated by CalCOFI and IMECOCAL, in a typical year during Winter and Spring months (as identified by A. Thompson)
`"CalCOFI_Winter-Spring"` The spatial footprint of the fixed station ichthyoplankton sampling design operated by CalCOFI, in a typical year during Winter and Spring months (as identified by A. Thompson)
`"IMECOCAL_Winter-Spring"` The spatial footprint of the fixed station ichthyoplankton sampling design operated by IMECOCAL, in a typical year during Winter and Spring months (as identified by A. Thompson)
`"CalCOFI-IMECOCAL_Summer"` The spatial footprint of the fixed station ichthyoplankton sampling design operated by CalCOFI and IMECOCAL, in a typical year during Summer months (as identified by A. Thompson)
`"rockfish_recruitment_coastwide"` The spatial footprint of the fixed station juvenile rockfish survey operated by SWFSC across its expanded spatial extent that is sampled during recent years (as identified by J. Field)
`"rockfish_recruitment_core"` The spatial footprint of the fixed station juvenile rockfish survey operated by SWFSC within its core spatial extent that is sampled consistently throughout its entire operations (as identified by J. Field)
`"northwest_atlantic"` The spatial footprint of the stratified random bottom trawl survey operated by NEFSC in the Northwest Atlantic (as identified by D. Chevrier); see `epu_to_use` for further subdivisions
`"south_africa"` The spatial footprint of the stratified random bottom trawl survey operated by DAFF in the West or South Coast of South Africa (as identified by H. Winker); see `region` to select between South and West Coast surveys
`"gulf_of_st_lawrence"` The spatial footprint of the survey operated by DFO in Gulf of St. Lawrence (as identified by H. Benoit)
`"new_zealand"` The spatial footprint of the bottom trawl survey operated by NIWA in Chatham Rise (as identified by V. McGregor)
`"habcam"` The spatial footprint of the visual trawl survey for scallops operated by NEFSC (as identified by D. Hart)
`"gulf_of_mexico"` The US Gulf of Mexico, surveyed by various fishery-independent surveys; using a definition provided by A. Gruss
`"ATL-IBTS-Q1", "ATL-IBTS-Q4", "BITS", "BTS", "BTS-VIIA", "EVHOE", "IE-IGFS", "NIGFS", "NS_IBTS", "PT-IBTS", "SP-ARSA", "SP-NORTH", "SP-PORC"` ICES survey domains as defined by shapefiles provided by M. Lindegren as originated by ICES Secretariat
`"stream_network"` Specifying a stream network for use when `Method="Stream_network"`
`"other"` Automated creation of an extrapolation-grid by padding an area around observations (not recommended for operational use)
`projargs`	A character string of projection arguments; the arguments must be entered exactly as in the PROJ.4 documentation. If the projection is unknown, use `as.character(NA)`; the argument may also be missing or an empty string of zero length, in which case it is set to the missing value. With rgdal built with PROJ >= 6 and GDAL >= 3, the `+init=` key may only be used with value `epsg:<code>`. From sp version 1.4-4, the string associated with the SRS_string argument may be entered as-is and will be set as SRS_string if the projargs argument does not begin with a `+` (suggested by Mikko Vihtakari).
`zone`	UTM zone used for projecting Lat-Lon to km distances; use `zone=NA` by default to automatically detect UTM zone from the location of extrapolation-grid samples
`strata.limits`	an input for determining stratification of indices (see example script)
`create_strata_per_region`	Boolean indicating whether to create a single stratum for all regions listed in `Region` (the default), or a combined stratum in addition to a stratum for each individual Region
`max_cells`	Maximum number of extrapolation-grid cells. If number of cells in extrapolation-grid is less than this number, then its value is ignored. Default `max_cells=Inf` results in no reduction in number of grid cells from the default extrapolation-grid for a given region. Using a lower value is particularly useful when `fine_scale=TRUE` and using epsilon bias-correction, such that the number of extrapolation-grid cells is often a limiting factor in estimation speed.
`input_grid`	a matrix with three columns (labeled `'Lat', 'Lon'`, and `'Area_km2'`) giving latitude, longitude, and area for each cell of a user-supplied grid; only used when `Region="user"`
`observations_LL`	a matrix with two columns (labeled 'Lat' and 'Lon') giving latitude and longitude for each observation; only used when `Region="other"`
`grid_dim_km`	numeric-vector with length two, giving the distance in km between cells in the automatically generated extrapolation grid; used if `Region="other"`, and also to set grid resolution when `Region` gives the path to a shapefile
`maximum_distance_from_sample`	maximum distance that an extrapolation grid cell can be from the nearest sample and still be included in area-weighted extrapolation of density; only used if `Region="other"`
`grid_in_UTM`	Boolean stating whether to automatically generate an extrapolation grid based on sampling locations in km within the UTM projection or within Lat-Lon coordinates; only used if `Region="other"`
`grid_dim_LL`	same as `grid_dim_km` except measured in latitude-longitude coordinates; only used if `Region="other"`
`region`	which coast to use for South Africa extrapolation grid; only used if `Region="south_africa"`
`strata_to_use`	strata to include by default for the BC coast extrapolation grid; only used if `Region="british_columbia"`
`epu_to_use`	EPU to include for the Northwest Atlantic (NWA) extrapolation grid, default is "All"; only used if `Region="northwest_atlantic"`
`survey`	survey to use for New Zealand extrapolation grid; only used if `Region="new_zealand"`
`surveyname`	area of West Coast to include in area-weighted extrapolation for California Current; only used if `Region="california_current"`. Options are:
`surveyname="propInWCGBTS"` The proportion of each extrapolation-grid cell within the annual shelf-slope survey operated 2003 to present (the default)
`surveyname="propInTriennial"` The proportion of each extrapolation-grid cell within the triennial slope survey operated 1977-2004
`flip_around_dateline`	DEPRECATED INPUT; boolean specifying whether to flip Lat-Lon locations around the dateline, and then retransform back (only useful if Lat-Lon straddle the dateline)
`nstart`	the number of times that the k-means algorithm is run while searching for the best solution (default=100)
`backwards_compatible_kmeans`	a boolean stating how to deal with changes in the kmeans algorithm implemented in R version 3.6.0, where `backwards_compatible_kmeans==TRUE` modifies the default algorithm to maintain backwards compatibility, and where `backwards_compatible_kmeans==FALSE` breaks backwards compatibility between R versions prior to and after R 3.6.0.
`DirPath`	a directory where the function looks for a previously-saved output (default is working directory)
`...`	other objects passed for individual regions (see example script)
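
As a rough illustration of the `input_grid` format used with `Region="user"`, the sketch below builds a three-column matrix from a regular Lat-Lon lattice. The bounding box, the 0.1-degree spacing, and the spherical approximation of cell area are all assumptions made for this example; a real analysis would normally compute areas from a projected shapefile. The call to the package is only attempted if VAST is installed.

```r
# Hypothetical lattice: 10 x 10 cells at 0.1-degree spacing (illustration only)
n_side <- 10
cell_deg <- 0.1
lat_seq <- seq(from = 54.05, by = cell_deg, length.out = n_side)
lon_seq <- seq(from = -165.95, by = cell_deg, length.out = n_side)
cells <- expand.grid(Lat = lat_seq, Lon = lon_seq)

# Approximate cell area on a sphere (an assumption, not the package's method):
# about 111.32 km per degree of latitude, with longitude scaled by cos(latitude)
km_per_deg <- 111.32
cells$Area_km2 <- (cell_deg * km_per_deg) *
  (cell_deg * km_per_deg * cos(cells$Lat * pi / 180))

# Matrix with the three labeled columns described for `input_grid`
input_grid <- as.matrix(cells[, c("Lat", "Lon", "Area_km2")])

# Only attempt the call when the package is available
if (requireNamespace("VAST", quietly = TRUE)) {
  Extrapolation_List <- VAST::make_extrapolation_info(
    Region = "user",
    input_grid = input_grid
  )
}
```

Because the area term shrinks with `cos(latitude)`, cells at higher latitudes in this toy grid are assigned smaller areas than cells nearer the equator.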

Details

To do area-weighted extrapolation of estimated density for use in calculating abundance indices, it is necessary to have a precise measurement of the footprint for a given survey design. Using VAST, analysts do this by including an "extrapolation grid" where densities are predicted at the location of each grid cell and where each grid cell is associated with a known area within a given survey design. Collaborators have worked with the package author to include the extrapolation-grid for several regions automatically in VAST. For new regions an analyst can (1) detect the grid automatically using Region="other", (2) input an extrapolation-grid manually using Region="user", or (3) supply a GIS shapefile using Region="[directory_path/file_name].shp". The extrapolation-grid is also used to determine where to draw pixels when plotting predictions of density. If a user supplies a character-vector with more than one Region value, then the corresponding grids are combined to assemble a single extrapolation-grid.
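
The routes described above might be called as in the following sketch. The sampling locations and the 25 km grid spacing are hypothetical values chosen for illustration, and the calls are only attempted if VAST is installed; only arguments listed under Usage are passed.

```r
# Hypothetical sampling locations (illustration only)
observations_LL <- cbind(
  Lat = c(57.1, 57.6, 58.0, 58.4, 58.9),
  Lon = c(-170.2, -169.5, -168.8, -168.1, -167.4)
)

# Argument lists for three ways of requesting an extrapolation-grid
args_builtin <- list(Region = "eastern_bering_sea")
args_combined <- list(
  Region = c("eastern_bering_sea", "northern_bering_sea"),
  create_strata_per_region = TRUE
)
args_other <- list(
  Region = "other",
  observations_LL = observations_LL,
  grid_dim_km = c(25, 25)  # assumed spacing, not a recommendation
)

if (requireNamespace("VAST", quietly = TRUE)) {
  # A region bundled with the package
  extrap_builtin <- do.call(VAST::make_extrapolation_info, args_builtin)
  # Two bundled regions combined into one grid, with a stratum per region
  extrap_combined <- do.call(VAST::make_extrapolation_info, args_combined)
  # A grid padded around the observations (not recommended for operational use)
  extrap_other <- do.call(VAST::make_extrapolation_info, args_other)
}
```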

When supplying a shapefile, I recommend using a UTM projection for projargs, which appears to have lower projection errors regarding total area than rnaturalearth.

Value

Tagged list used in other functions

a_el: The area associated with each extrapolation grid cell (rows) and strata (columns)
Data_Extrap: A data frame describing the extrapolation grid
zone: the zone used to convert Lat-Long to UTM by PBSmapping package
flip_around_dateline: a boolean stating whether the Lat-Long is flipped around the dateline during conversion to UTM
Area_km2_x: the area associated with each row of Data_Extrap, in units of square kilometers
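
As a small sketch of how these elements could be inspected, the code below sums `a_el` by column to get the total area in each stratum. The list here is a hand-made stand-in with four cells and two strata, not output from the function; the stratum name `North` and all numeric values are hypothetical.

```r
# Hypothetical stand-in for the returned list (values invented for illustration)
Extrapolation_List <- list(
  a_el = cbind(All_areas = c(4, 4, 4, 2), North = c(4, 4, 0, 0)),
  Data_Extrap = data.frame(
    Lat = c(55.0, 55.0, 54.9, 54.9),
    Lon = c(-166.0, -165.9, -166.0, -165.9)
  ),
  Area_km2_x = c(4, 4, 4, 2)
)

# Total area (km^2) in each stratum: column sums of the cell-by-stratum matrix
area_by_stratum <- colSums(Extrapolation_List$a_el)

# Number of grid cells contributing a nonzero area to each stratum
cells_by_stratum <- colSums(Extrapolation_List$a_el > 0)

print(area_by_stratum)
print(cells_by_stratum)
```

With a real result, the same two lines can be applied to the list returned by `make_extrapolation_info`, assuming `a_el` is a numeric matrix as described above.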