knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-figs/",
  cache.path = "README-cache/"
)

getais

The getais R package is simply a data processing package, meant to download and process the very large AIS datasets from MarineCadastre.gov.

You can install the development version of the package with:

# install.packages("devtools")
if("getais" %in% rownames(installed.packages()) == FALSE) {
  devtools::install_github("ericward-noaa/getais")
}
library(getais)

An example processing script

Create data frame for all combinations

if (!file.exists("filtered")) dir.create("filtered")

all_combos = expand.grid("zone"=1:19,"year"=2009:2014,"month"=1:12)

As an example, we'll only use data from December and UTM Zone 9.

subset = all_combos[which(all_combos$zone==9 & all_combos$month==12),]

head(subset)

Process the data for these combinations of months / years / zones. The user can also specify the status codes to keep, which defaults to

status_codes_to_keep = c(0, 7, 8, 9, 10, 11, 12, 13, 14, 15)

More details here: https://help.marinetraffic.com/hc/en-us/articles/203990998-What-is-the-significance-of-the-AIS-Navigational-Status-Values-#'

The additional information that can be joined in relates to the vessel and voyage. The vessel attributes can be specified with vessel_attr. These default to

vessel_attr = c("VesselType","Length")

but can be changed, and other fields may be included ("IMO", "CallSign", "Name", "Width", "DimensionComponents"). The voyage attributes voyage_attr default to just including destination,

voyage_attr = c("Destination")

but also allows other variables ("VariableID", "Cargo", "Draught", "ETA", "StartTime", "EndTime").

Running the function

The downsample_ais() function also takes the 'every_minutes' argument, which finds observations from each vessel that are closest to that time. For example, if we wanted observations nearest to every 30 minutes, we'd use

downsample_ais(df = subset, every_minutes = 30, raw = TRUE)

And now we have a few hundered or thousand records per tab-delimitted output files in the 'filtered' folder,

dir("filtered")

Using the 30-minute sampling of zone 3 for example, each year has < 10000 records.

Additional tests for Blake's data,

downsample_ais(df = data.frame("year"=2009,"zone"=11,"month"=10), every_minutes = 30)
downsample_ais(df = data.frame("year"=2012,"zone"=3,"month"=10), every_minutes = 30)
downsample_ais(df = data.frame("year"=2014,"zone"=10,"month"=10), every_minutes = 30)


eric-ward/getais documentation built on Sept. 22, 2020, 1:48 a.m.