Oceanographic field experiments often employ a suite of instrument types, each
reporting data in a different format. Many of these formats are complex and
difficult to decode. Although manufacturers usually provide software for
accessing data produced by their instruments, it tends to be proprietary and
closed-source. This is a problem for researchers seeking to analyse their data
in novel ways, or to combine data from multiple instruments. The oce package
[@kelley__aut_oce_2021] addresses such issues by providing functions that
handle dozens of data formats. In addition, it facilitates specialized
calculations and data displays that are particular to the discipline. Since
oce is written in the R language
[@ihaka_r_1996;@r_core_team_introduction_2021], it forms a link to an array of
more general tools that oceanographers may need in their work
[@kelley_oceanographic_2018].

The oce package has been hosted on CRAN [@noauthor_comprehensive_2021] since 2009.
The CRAN version, which is updated once or twice a year, may be
installed by typing install.packages("oce") in an R console. Users who need
newer features may use remotes::install_github("dankelley/oce", ref="develop")
to download and build the development branch. Those wishing to view or
participate in the development process are welcome to do so at
\url{https://github.com/dankelley/oce}.

The package has functions for decoding many data formats. These functions
return S4 objects with slots holding (a) the data, (b) related metadata, and
(c) a log of the oce functions that made the object. This is illustrated by
executing the following in an R session, for a built-in object created by
reading data from a profiling instrument called a CTD.

    library(oce)   # load library
    data(ctd)      # load a built-in sample file
    slotNames(ctd) # see 'slot' names

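
As a hedged illustration of the three slots, the sketch below peeks inside the built-in ctd object using the standard S4 `@` accessor. The names that are printed depend on the installed version of oce, so no output is shown here, and direct slot access is used only for inspection.

```r
library(oce)
data(ctd)
names(ctd@data)                      # variables held in the data slot
head(names(ctd@metadata))            # first few entries of the metadata slot
str(ctd@processingLog, max.level=1)  # outline of the processing log
```

In everyday work, the accessor functions provided by the package are usually a better choice than `@`, because they do not depend on where an item is stored.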
The next step after loading an object, or reading it from a data file, is often
to get a textual overview with summary(ctd), or a graphical overview, e.g.
with plot(ctd), producing Figure 1. It is also common to exert fine-grained
control of graphical representations, with e.g.
plot(ctd, which="temperature") to plot just the temperature variation with depth
(results not shown here). The variations of other properties may be shown by
setting which appropriately, and this argument can also be used to specify
other types of plots, in addition to the depth-variation form.

    plot(ctd)

Besides this "ctd" subclass, oce supports dozens of other subclasses that
cover a wide range of oceanographic instrumentation. In every case, the same
summary() and plot() function calls provide textual and graphical
representations of the data. This specialization of these two generic
functions simplifies analysis considerably. For example, if PATTERN is a
regular expression that specifies a set of data files, whether of a single
instrument type or multiple instrument types, then

    for (file in list.files(pattern=PATTERN)) {
        d <- read.oce(file) # read the file, inferring its type
        summary(d)          # textual overview
        plot(d)             # graphical overview
    }

will provide information about each data file of interest, forming a good first stage of analysis.
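
When a directory holds a mixture of files, some of which may not be readable, it can help to guard the read step. The following is a minimal sketch under that assumption; firstLook() is a hypothetical helper that is not part of oce, and the choice to skip unreadable files with a message is ours.

```r
# firstLook() is a hypothetical helper, not an oce function.
firstLook <- function(pattern, path = ".") {
    files <- list.files(path = path, pattern = pattern, full.names = TRUE)
    ok <- character(0)
    for (file in files) {
        # Convert a read error into NULL so that one bad file does not stop the loop
        d <- tryCatch(oce::read.oce(file), error = function(e) NULL)
        if (is.null(d)) {
            message("skipping ", file, " (could not be read)")
            next
        }
        summary(d) # textual overview
        plot(d)    # graphical overview
        ok <- c(ok, file)
    }
    invisible(ok)  # names of the files that were read
}
```

A call such as firstLook("\\.cnv$") would then examine the files in the working directory whose names end in .cnv; the extension is given only as an illustration.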

The package also provides other generic functions, including subset() for
focusing on subsets of data, handleFlags() for processing data-quality flags,
and [[ for accessing data. The last of these is particularly worthy of note,
for two reasons.

1. [[ finds information regardless of where it is stored in the object. For
   example, a CTD does not measure longitude and latitude, but if these things
   are known, they are stored in the metadata slot, not the data slot.
   Other objects might have longitude in the data slot. This detail is
   immaterial to users, because [[ looks in both slots. Therefore, code
   written for one object type will often work for another type.
2. [[ can access not just information stored within the object, but also
   things that can be calculated from that information. For example, CTD files
   typically hold information from which seawater density may be computed
   [@millero_history_2010;@mcdougall_getting_2011], and so [[ is set up to
   compute it, if requested. This same scheme works for other computable
   elements.

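
Both points can be sketched with the built-in ctd object. The item name "sigmaTheta" (a density anomaly) is assumed to be among the computable items supported by the installed version of oce; the two %in% tests simply report where longitude is stored, and no output is shown because it may vary between versions.

```r
library(oce)
data(ctd)
"longitude" %in% names(ctd@data)     # is longitude in the data slot?
"longitude" %in% names(ctd@metadata) # or in the metadata slot?
lon <- ctd[["longitude"]]            # [[ finds it in either case
"sigmaTheta" %in% names(ctd@data)    # is a density anomaly stored?
sigmaTheta <- ctd[["sigmaTheta"]]    # [[ returns it, computing if needed
```

The calling code is the same whether an item is stored in one slot, stored in the other, or computed on request.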
The [[ function acts as a sort of bridge from the oceanographic realm to the
general R realm, with its thousands of useful and well-vetted packages. This
reduces the need to create new tools, letting analysts focus on oceanography,
not coding.
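
As a small sketch of this bridging role, the code below extracts ordinary numeric vectors with [[ and passes them to lm(), a general-purpose base-R function. The 20 dbar cutoff and the choice of a straight-line fit are arbitrary assumptions made for illustration, not a recommended analysis.

```r
library(oce)
data(ctd)
upper <- subset(ctd, pressure < 20)          # keep the upper part of the cast
df <- data.frame(pressure = upper[["pressure"]],
                 temperature = upper[["temperature"]])
fit <- lm(temperature ~ pressure, data = df) # base-R linear model
coef(fit)                                    # intercept and vertical gradient
```

Once the values are in a data frame, any other R modelling or plotting package could be used in place of lm().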

A more detailed example may help to solidify some of the key aspects of oce.
Many readers will have an interest in tides, so we will work with a year-long
record of sea level, $\eta=\eta(t)$, in Halifax Harbour during 2003, the year
in which the city was struck by Hurricane Juan.

Consider the code given below, which produces Figure 2. A built-in sealevel
file is used, to make a reproducible example, but replacing the data() call
with a read.sealevel() call will handle data files in standard formats. Note
that the tidem() function is fairly sophisticated, with over 500 lines of R
code being used to apply the specialized procedures of tidal analysis
[@godin_analysis_1972;@pawlowicz_classical_2002;@foreman_versatile_2009].
Readers who see that the function invokes the lm() function for linear models
may not be surprised that oce provides a function named predict(), for
generating tidal predictions.

    library(oce)                         # load library
    data(sealevel)                       # use built-in example dataset
    t <- sealevel[["time"]]              # extract time
    eta <- sealevel[["elevation"]]       # extract sea level
    m <- tidem(sealevel)                 # fit tidal model
    etaDetided <- eta - predict(m)       # de-tide observations
    par(mfrow=c(2, 1))                   # set up a two-panel plot
    oce.plot.ts(t, eta, xaxs="i",        # top: observed sea level
        grid=TRUE, ylab="Sea level [m]")
    oce.plot.ts(t, etaDetided, xaxs="i", # bottom: de-tided sea level
        grid=TRUE, ylab="De-tided sea level [m]")

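
The connection between tidal fitting and linear models can be sketched in base R alone. The example below is a deliberately simplified illustration, not a description of what tidem() does: it fits cosine and sine terms at two assumed tidal periods (12.4206012 hours for M2 and 12 hours for S2) to a synthetic record, relying on the identity $A\cos(\omega t - \phi) = A\cos\phi\,\cos\omega t + A\sin\phi\,\sin\omega t$, which makes the problem linear in the unknown coefficients. The amplitudes, phases and noise level are invented for the demonstration.

```r
set.seed(1)
t <- seq(0, 30 * 24, by = 1)            # time in hours, hourly for 30 days
omegaM2 <- 2 * pi / 12.4206012          # M2 angular frequency [rad/hour]
omegaS2 <- 2 * pi / 12                  # S2 angular frequency [rad/hour]
# Synthetic sea level: two constituents plus noise (all values invented)
eta <- 1.0 * cos(omegaM2 * t - 0.5) +
    0.3 * cos(omegaS2 * t - 1.2) +
    rnorm(length(t), sd = 0.05)
# Harmonic regression: linear in the cosine and sine coefficients
fit <- lm(eta ~ cos(omegaM2 * t) + sin(omegaM2 * t) +
    cos(omegaS2 * t) + sin(omegaS2 * t))
b <- unname(coef(fit))
ampM2 <- sqrt(b[2]^2 + b[3]^2)          # recovered M2 amplitude
ampS2 <- sqrt(b[4]^2 + b[5]^2)          # recovered S2 amplitude
etaDetided <- eta - predict(fit)        # residual after removing the fit
```

With this set-up, ampM2 and ampS2 should lie close to the values of 1.0 and 0.3 used to build the record, and etaDetided should resemble the added noise. Real tidal analysis involves many more constituents and the specialized procedures cited above.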
A comparison of the panels in Figure 2 reveals that tides explain much of the
sea level variation in Halifax Harbour. The lower panel illustrates an
increase of detided variance during the winter months, as expected at a
northern mid-latitude. More surprising is the large spike towards the end of
September. This is a result of Hurricane Juan, which swept over Halifax at
that time, causing a storm surge of approximately 1.5 m that, along with high
waves, caused major damage in the harbour [@xu_extreme_2012]. (Readers might
find it informative to supply an xlim argument to the plot calls, to narrow
in on the event.)
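
One way to follow this suggestion is sketched below. The window limits are an assumption chosen to bracket the end of September 2003, and the variable names juan and inWindow are introduced only for this illustration.

```r
library(oce)
data(sealevel)
t <- sealevel[["time"]]
eta <- sealevel[["elevation"]]
etaDetided <- eta - predict(tidem(sealevel))  # de-tide as before
# Assumed window around the storm; adjust the dates as desired
juan <- as.POSIXct(c("2003-09-25", "2003-10-03"), tz = "UTC")
oce.plot.ts(t, etaDetided, xlim = juan,
    grid = TRUE, ylab = "De-tided sea level [m]")
inWindow <- t >= juan[1] & t <= juan[2]
max(etaDetided[inWindow], na.rm = TRUE)       # peak de-tided level in the window
```

Narrowing or shifting the window shows how quickly the surge rose and fell relative to the tidal cycle.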

The oce package provides for many aspects of oceanographic analysis, having
evolved in an open-source environment for more than a decade. The developers
have benefited from a supportive user community, members of which have
contributed insightful bug reports and suggestions for improvements. New
features are added continually, to handle new instrument types, new data
repositories, and new methods. Physical oceanography is a major focus of the
package, but we hope this paper will generate interest in other communities,
ranging from climatologists to those in marine disciplines such as chemistry
and biology. Our other goal is to encourage the development of new R packages,
such as argoFloats [@kelley_argofloats_2021], that build upon oce.