The getais R package is a data processing package for downloading and processing the very large AIS (Automatic Identification System) datasets from MarineCadastre.gov.
You can install the development version of the package with:
# install.packages("devtools")
devtools::install_github("eric-ward/getais")
library(getais)
Create a data frame of all combinations of zone, year, and month, along with a 'filtered' directory for the processed output:
if (!file.exists("filtered")) dir.create("filtered")
all_combos = expand.grid("zone"=1:19,"year"=2009:2014,"month"=1:12)
As an example, we'll only use data from December and UTM Zone 3.
subset = all_combos[which(all_combos$zone==3 & all_combos$month==12),]
head(subset)
#> zone year month
#> 1257 3 2009 12
#> 1276 3 2010 12
#> 1295 3 2011 12
#> 1314 3 2012 12
#> 1333 3 2013 12
#> 1352 3 2014 12
Process the data for these combinations of months / years / zones. The user can also specify the navigational status codes to keep, which default to
status_codes_to_keep = c(0, 7, 8, 9, 10, 11, 12, 13, 14, 15)
More details here: https://help.marinetraffic.com/hc/en-us/articles/203990998-What-is-the-significance-of-the-AIS-Navigational-Status-Values-
The additional information that can be joined in relates to the vessel and voyage. The vessel attributes can be specified with vessel_attr. These default to
vessel_attr = c("VesselType","Length")
but can be changed, and other fields may be included ("IMO", "CallSign", "Name", "Width", "DimensionComponents"). The voyage attributes voyage_attr default to just including destination,
voyage_attr = c("Destination")
but other variables may also be included ("VariableID", "Cargo", "Draught", "ETA", "StartTime", "EndTime").
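Putting these arguments together, a call that keeps only one status code and joins additional vessel and voyage fields might look like the following sketch. The argument names are taken from the defaults described above, and the specific values chosen here are illustrative, not a recommendation:

```r
# Hypothetical call illustrating the arguments described above;
# argument names follow the defaults shown in this README.
downsample_ais(
  df = subset,
  status_codes_to_keep = c(0),                        # e.g. keep a single status code
  vessel_attr = c("VesselType", "Length", "Width"),   # extra vessel fields to join
  voyage_attr = c("Destination", "Draught")           # extra voyage fields to join
)
```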
The downsample_ais() function also takes an 'every_minutes' argument, which retains the observations from each vessel closest to each time interval. For example, to keep observations nearest to every 30 minutes, we'd use
downsample_ais(df = subset, every_minutes = 30)
#> Joining, by = "MMSI"
#> Joining, by = "MMSI"
#> Joining, by = "MMSI"
#> Joining, by = "MMSI"
#> Joining, by = "MMSI"
#> Joining, by = "MMSI"
#> Joining, by = "MMSI"
#> Joining, by = "MMSI"
#> Joining, by = "MMSI"
#> Joining, by = "MMSI"
#> Joining, by = "MMSI"
#> Joining, by = "MMSI"
And now we have a few hundred to a few thousand records per tab-delimited output file in the 'filtered' folder,
dir("filtered")
#> [1] "Zone3_12_2009.txt" "Zone3_12_2010.txt" "Zone3_12_2011.txt"
#> [4] "Zone3_12_2012.txt" "Zone3_12_2013.txt" "Zone3_12_2014.txt"
Using the 30-minute sampling of zone 3, for example, each year has fewer than 10,000 records.
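Because the output files are plain tab-delimited text, they can be read back and combined with base R. A minimal sketch, assuming the zone 3 files shown above are in the 'filtered' folder (the columns present will depend on the vessel and voyage attributes requested):

```r
# Read each filtered file and stack them into a single data frame.
files <- dir("filtered", full.names = TRUE)
ais <- do.call(rbind, lapply(files, function(f) {
  read.delim(f, stringsAsFactors = FALSE)
}))
nrow(ais)  # total records across all years
```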