getIndex: Get an Index of Available Argo Float Profiles


View source: R/get.R

Description

This function gets an index of available Argo float profiles, typically for later use as the first argument to getProfiles(). The work is done either by downloading information from a data repository or by reusing an existing index (stored in an .rda file); the age argument controls which of these happens.

Usage

getIndex(
  filename = "core",
  server = argoDefaultServer(),
  destdir = argoDefaultDestdir(),
  age = argoDefaultIndexAge(),
  quiet = FALSE,
  keep = FALSE,
  debug = 0
)

Arguments

filename

character value that indicates the file name on the server, as in the first column of the table given in “Details”, or (for some file types) as in the nickname given in the middle column. Note that the downloaded file name will be based on the full file name given as this argument, and that nicknames are expanded to the full filenames before saving.

server

character value, or vector of character values, indicating the names of servers that supply Argo data. If more than one value is given, then these are tried sequentially until one is found to supply the index file named in the filename argument. As of December 2020, the three servers known to work are "https://data-argo.ifremer.fr", "ftp://ftp.ifremer.fr/ifremer/argo" and "ftp://usgodae.org/pub/outgoing/argo". These may be referred to with the nicknames "ifremer-https", "ifremer" and "usgodae". Any URL that can be used in curl::curl_download() is a valid value, provided that the file structure is identical to that of the mirrors listed above. See argoDefaultServer() for how to provide a default value.

destdir

character value indicating the directory in which to store downloaded files. The default value is to compute this using argoDefaultDestdir(), which returns ~/data/argo by default, although it also provides ways to set other values using options(). Set destdir=NULL if destfile is a filename with full path information. File clutter is reduced by creating a top-level directory called data, with subdirectories for various file types; see “Examples”.

age

numeric value indicating how old a downloaded file must be (in days), for it to be considered out-of-date. The default, argoDefaultIndexAge(), limits downloads to once per day, as a way to avoid slowing down a workflow with a download that might take a minute or so. Note that setting age=0 will force a new download, regardless of the age of the local file, and that age is changed to 0 if keep is TRUE.

quiet

logical value indicating whether to silence some progress indicators. The default is to show such indicators.

keep

logical value indicating whether to retain the raw index file as downloaded from the server. This is FALSE by default, indicating that the raw index file is to be discarded once it has been analyzed. Note that if keep is TRUE, then the supplied value of age is converted to 0, to force a new download.

debug

integer value indicating level of debugging. If this is less than 1, no debugging is done. Otherwise, some functions will print debugging information. If a function call fails, the first step should be to rerun the function with debug=1, to see if the output suggests a problem in the call.
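
The sequential trial of servers described for the server argument can be sketched in base R. This is only an illustration of the idea, not the package's code: the function tryServers(), its fetch argument, and the stand-in downloader fakeFetch() are hypothetical names, and the sketch assumes that the URL is formed by appending the file name to the server name.

```r
# Hypothetical sketch of sequential server fallback; not part of argoFloats.
# 'fetch' is any function(url, destfile) that signals an error on failure.
tryServers <- function(servers, filename, destfile, fetch) {
    for (server in servers) {
        # Assumption: the URL is simply server + "/" + filename.
        url <- paste0(server, "/", filename)
        ok <- tryCatch({
            fetch(url, destfile)
            TRUE
        }, error = function(e) FALSE)
        if (ok)
            return(url) # stop at the first server that works
    }
    stop("cannot download '", filename, "' from any of the supplied servers")
}

# Stand-in downloader so the sketch runs offline: it simulates a failure
# for one server and writes a dummy file for any other.
fakeFetch <- function(url, destfile) {
    if (startsWith(url, "ftp://ftp.ifremer.fr"))
        stop("simulated connection failure")
    writeLines("simulated index", destfile)
}

servers <- c("ftp://ftp.ifremer.fr/ifremer/argo",
    "ftp://usgodae.org/pub/outgoing/argo")
used <- tryServers(servers, "ar_index_global_prof.txt.gz", tempfile(), fakeFetch)
```

In real use, a function such as curl::curl_download(), which takes a URL and a destination file as its first two arguments, could be supplied as fetch.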

Details

The first step is to construct a URL for downloading, based on the server and filename arguments. That URL will be a string ending in .gz or .txt, and from this the name of a local file is constructed by changing the suffix to .rda and prepending the directory specified by destdir. If an .rda file of that name already exists, and is less than age days old, then no downloading takes place. This caching procedure is a way to save time, because the download can take from a minute to an hour, depending on the bandwidth of the connection to the server.
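
The caching rule just described can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper names rdaNameFor() and needsDownload() are hypothetical, and the exact form of the .rda file name (here, with a trailing .txt.gz, .gz or .txt replaced by .rda) is an assumption.

```r
# Hypothetical helpers illustrating the caching rule; not part of argoFloats.

# Build the local .rda path from a URL ending in .gz or .txt.
# Assumption: a trailing ".txt.gz", ".gz" or ".txt" is replaced by ".rda".
rdaNameFor <- function(url, destdir) {
    file.path(destdir, sub("\\.(txt\\.gz|gz|txt)$", ".rda", basename(url)))
}

# Decide whether a download is needed, given a maximum age in days.
needsDownload <- function(rda, age) {
    if (!file.exists(rda))
        return(TRUE) # nothing cached yet
    mtime <- file.info(rda)$mtime
    ageDays <- as.numeric(difftime(Sys.time(), mtime, units = "days"))
    # A file that is less than 'age' days old is reused; age = 0 always downloads.
    ageDays >= age
}

url <- "https://data-argo.ifremer.fr/ar_index_global_prof.txt.gz"
rda <- rdaNameFor(url, destdir = tempdir())
needsDownload(rda, age = 1)
```

With this rule, age=0 always triggers a download, which matches the behaviour documented for the age argument.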

The resultant .rda file, which is named in the return value of this function, holds a list named index that contains the following elements:

Some expertise is required in deciding on the value for the filename argument to getIndex(). As of June 2020, the FTP sites ftp://usgodae.org/pub/outgoing/argo and ftp://ftp.ifremer.fr/ifremer/argo contain multiple index files, as listed in the left-hand column of the following table. The middle column lists nicknames for some of the files. These can be provided as the filename argument, as alternatives to the full names. The right-hand column describes the file contents. Note that the servers also provide files with names similar to those given in the table, but ending in .txt. These are uncompressed equivalents of the .gz files that offer no advantage and take longer to download, so getIndex() is not designed to work with them.

File Name                             Nickname              Contents
ar_greylist.txt                       -                     Suspicious/malfunctioning floats
ar_index_global_meta.txt.gz           -                     Metadata files
ar_index_global_prof.txt.gz           "argo" or "core"      Argo data
ar_index_global_tech.txt.gz           -                     Technical files
ar_index_global_traj.txt.gz           "traj"                Trajectory files
argo_bio-profile_index.txt.gz         "bgc" or "bgcargo"    Biogeochemical Argo data (without S or T)
argo_bio-traj_index.txt.gz            "bio-traj"            Bio-trajectory files
argo_synthetic-profile_index.txt.gz   "synthetic"           Synthetic data, successor to "merge"

Note: as of December 1, 2020, "argo" is no longer accepted as a value of the filename argument. Use "core" instead.
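
The expansion of nicknames to full file names can be pictured as a simple lookup. The sketch below transcribes the nicknames from the table (omitting "argo", following the note above); the names nicknames and expandNickname() are hypothetical and are not claimed to match the package's internals.

```r
# Nickname-to-filename lookup, transcribed from the table in this section.
# Hypothetical illustration; not part of argoFloats.
nicknames <- c(
    "core" = "ar_index_global_prof.txt.gz",
    "traj" = "ar_index_global_traj.txt.gz",
    "bgc" = "argo_bio-profile_index.txt.gz",
    "bgcargo" = "argo_bio-profile_index.txt.gz",
    "bio-traj" = "argo_bio-traj_index.txt.gz",
    "synthetic" = "argo_synthetic-profile_index.txt.gz"
)

# Return the full file name for a nickname; pass full names through unchanged.
expandNickname <- function(filename) {
    if (filename %in% names(nicknames))
        nicknames[[filename]]
    else
        filename
}

expandNickname("bgc")
expandNickname("ar_index_global_meta.txt.gz")
```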

The next step after using getIndex() is usually to use getProfiles(), which downloads or checks for local copies of the per-profile data files that are listed in an index, and this is typically followed by a call to readProfiles(), which reads the local files, yielding an object that can be plotted or analyzed in other ways. For more on this function, see section 2 of Kelley et al. (2021).

Value

An object of class argoFloats with type="index", which is suitable as the first argument of getProfiles().
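
As an illustration of how the arguments combine, the following sketch shows two typical calls. It assumes that the argoFloats package is installed and that a network connection is available. Because an index download can take from a minute to an hour, the calls are wrapped in a function (with the hypothetical name demoGetIndex()) that is defined but not run here.

```r
# Sketch of typical calls; demoGetIndex() is a hypothetical wrapper.
# Assumes argoFloats is installed and a server is reachable when it is called.
demoGetIndex <- function(destdir = tempdir()) {
    # Core index: reuse a local .rda file if it is less than 1 day old.
    core <- argoFloats::getIndex(filename = "core", destdir = destdir, age = 1)
    # Biogeochemical index: age = 0 forces a fresh download.
    bgc <- argoFloats::getIndex(filename = "bgc", destdir = destdir, age = 0)
    list(core = core, bgc = bgc)
}
```

Calling demoGetIndex() would return a list holding two index objects, each of which could be passed to getProfiles().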

Author(s)

Dan Kelley

References

Kelley, D. E., Harbin, J., & Richards, C. (2021). argoFloats: An R package for analyzing Argo data. Frontiers in Marine Science, 8, 635922. doi: 10.3389/fmars.2021.635922


dankelley/argoFloats documentation built on Oct. 19, 2021, 1:17 p.m.