knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
Abstract. This vignette explains the basics of using the oce package to handle measurements made by both mooring and ship based acoustic Doppler profilers. As in the related vignette about working with data-quality flags, the material is organized to reflect typical workflow. See the main oce vignette for general information about oce. The sample code provided here relies on the oce package, loaded as follows.
library(oce)
Acoustic Doppler profiling instruments are called ADCPs by some manufacturers
and ADPs by others. The class name adp
is used in oce, to handle both types.
It is important to realize that there are many data formats for data of this
type, and that the files are normally in a dense binary format that cannot be
read with a text editor. It is necessary to use software to work with such
files. Although oce can handle many file formats, and although its authors do
their best to keep up with changes to file formats (at least for popular
instruments), users will sometimes be forced to use manufacturer software to
read the data, or at least convert to a format that can be manually read into
R, and then converted to an adp
object using the as.adp()
function.
Suppose f
is a character string naming a data file. In many cases,
d <- read.oce(f)
will read the file, because read.oce
will look inside the file to try to
guess the format. If this does not work, one ought to try e.g.
d <- read.adp(f)
which, again, will try to determine which subtype of adp file has been
provided. If that fails, it might help to try a specialized function, such as
read.adp.rdi()
for an RDI file, read.adp.nortek()
for a Nortek file, or
read.adp.sontek()
for a Sontek file. (Actually, each of these can handle a
variety of sub-formats).
If any of these read.
functions is called with just a single argument naming
the file, then the entire dataset will be read in. This can be slow for large
files (e.g. taking of order 0.1s per Mb), and so it can help to use the by
argument, e.g. for a 138Mb file on the author's computer,
f <- "/data/archive/sleiwex/2008/moorings/m09/adp/rdi_2615/raw/adp_rdi_2615.000" dall <- read.oce(f)
takes 15.2s of user time, but
d100 <- read.oce(f, by = 100)
takes 0.2s. It is also possible to window the data, using the from
and to
arguments, which (like by
) can be integers that count ensembles (or "profiles")
or a POSIX time (see ?read.adp
). For example, the
adp dataset provided by oce was read with
read.oce(f, from = as.POSIXct("2008-06-26", tz = "UTC"), to = as.POSIXct("2008-06-27", tz = "UTC"), by = "60:00", latitude = 47.88126, longitude = -69.73433 )
with f
as defined above. This also illustrates that it is possibly to specify
the location of an instrument; the further processing that was done to create
data(adp)
will be explained in due course.
+Note: Users who are not familiar with ADP data may noticed that there is a combination of many file types when dealing with ADP data. Some of these include: .ENR, which are raw data only received from the ADP, .ENS, similar to .ENR but also contain data merged from other file types without processing, .ENX, merged data that are processed including calculations to transformed data including east, north, and vertical velocities, .STA, which are a short term average of measurements, and .LTA, which are a long term average of measurements.
The generic function summary
provides a useful overview of an adp data
object. For example,
data(adp) summary(adp)
(results not shown \ldots the goal is to encourage the user to try this!)
shows the sort of information normally present in an RDI file. Other
manufacturer types will have somewhat different summaries, because both the
metadata and the data differ from instrument to instrument. In all cases, there
will be a vector holding time (named time
), another holding the distance from
instrument to the centre of a measurement bin, measured along a line
equidistant between the beams (named distance
), along with an array
holding velocity (named v
). (Other arrays may include, as in this RDI case,
q
for data quality, a
for backscatter amplitude, and g
for percentage
goodness.)
Note that the summary reveals that instrument had been set up to record in
beam
coordinates, but that processing had been done to convert to enu
(east-north-up) coordinates.
Exercise 1. Determine the name of the metadata
item that holds the
coordinate system of an adp object.
After completing Exercise 1, the user will notice that both latitude
and
longitude
are available in the metadata for this built-in, moored, adp
dataset. It should be noted that if this dataset was a ship-based adp dataset,
then latitude
and longitude
would not be included. Instead, the data would
include firstLatitude
, firstLongitude
, lastLatitude
, and lastLongitude
,
which provide an indication of the ship drift during sampling.
It is critical for analysts to be aware of the coordinate system of an adp
dataset, simply because those systems are so different. In what oce calls the
beam
system, velocities in each of the acoustic beams are measured along
the slant angle of the beam. These data may be combined (using a
transformation matrix which is part of the metadata displayed with the
summary()
function) into what oce calls a xyz
system that is Cartesian and
fixed to the instrument. Since instruments record heading, pitch, and roll, the
xyz
velocities can be converted easily to enu
velocities.
In some applications, the scientific focus is primarily on the enu
data, but
it is a good idea to calculate and study the data at all stages of processing.
Thus, although the toEnu()
function will convert data measured in any
coordinate system into an enu
variant, many analysts will do something along
the lines of
beam <- read.oce(f) xyx <- beamToXyz(beam) enu <- xyzToEnu(xyz, declination = -18.1)
where the declination illustrated is the value used in creating the data(adp)
dataset. (Compensating for declination is necessary because compasses are
calibrated at a particular place and time, and compasses have no way of
"knowing" the local angle between true north and magnetic north.)
The generic plot()
function is specialized to handle adp objects in a variety
of simple but helpful ways. The type of plot is determined by the which
argument. The default for which
is to create a multi-panel image plot that
shows time-space dependence for each velocity component. The number of panels
depends on the number of beams used by the instrument, and each one has a label
indicating the component name. For example,
#| fig.alt: > #| Default plot of an adp object. plot(adp)
has velocity panels named east
, north
and up
, along with one
named error
, which is an estimate of the velocity-inference error.
Options to this plot variant can be used to control how distance from instrument is shown (which can be useful in visualizing data from upward- and downward-aligned instruments), what colours are used to represent the signals, etc.
Dozens of other plot types are also provided; see ?"plot,adp-method"
for
details -- but be warned that a complex plot requires many arguments!
When dealing with ship-based adp data, in order to get the true east
,
north
, and up
velocity, the user must remove the speed of the ship. To do
this, use the subtractBottomVelocity()
function, which subtracts the bottom
tracking velocities from an adp object.
Exercise 2. Create a u-v scatterplot.
Exercise 3. Write the steps to plot the east
, north
, and up
velocity
without the speed of the ship for a file named
COR2019002_20190818T064815_007_000000.ENS
.
Adp data can be subsetted in a wide variety of ways, e.g.
#| fig.alt: > #| Plot of a subset of an adp object. plot(subset(adp, distance < 20))
plots only data within 20m of the instrument. See ?"subset,adp-method"
for
more on subsetting.
Exercise 4. Plot the adp data for the first half of the sampling interval.
All oce objects share a common framework, with so-called "slots" for
metadata, data, and processing log. It is possible to examine, and modify, the
elements of each of these slots using base R constructs, but it is usually
better to use oce accessor functions. The details are provided with
?"[[,adp-method"
and ?"[[<-,adp-method"
, but a few examples should suffice
to sketch the general pattern.
For example,
time <- adp[["time"]] distance <- adp[["distance"]]
extracts the times and distances of the profiles within data(adp)
, while the
array holding the velocities is extract with
v <- adp[["v"]]
Altering values uses the [[
to the left of the assignment operator, but this
is not commonly required in simple work, so it is best for readers to focus
first on accessing data. The exercises may help with this.
Some of the fields contained in adp objects, particularly the amplitude,
correlation, and percent good fields (a
, q
, and g
) are stored as
raw
-type. This is done primarily to save memory space, as expanded all of
those fields into numeric
types can be slow and they get large quickly.
However, to actually work with the numbers, users can extract a "numeric"
version by supplying the second argument to the [[
function, e.g.
a <- adp[["a", "numeric"]]
This has the advantage of returning an object with the same matrix dimensions
as the original (note that functions like as.numeric()
typically discard
dimension information, thereby converting an array to a vector).
Exercise 5. Plot a time series of mid-depth and depth-averaged eastward velocities.
Exercise 6. Perform an eigen-analysis of eastward and northward velocity.
Exercise 7. Plot a time series of pressure variations, to reveal tidal height.
Exercise 1. Determine the name of the metadata
item that holds the
coordinate system of an adp object.
Solution.
With adp as loaded in the text, we may the names in the metadata with:
sort(names(adp[["metadata"]]))
(results not shown), after which a keen eye will notice that two elements relate to coordinates:
adp[["originalCoordinate"]] adp[["oceCoordinate"]]
and comparison with the output of summary(adp)
reveals that the former is the
coordinate system in which the data were originally measured, while the
latter is the coordinate system of the transformed data. Using
processingLogShow(adp)
reveals the steps in the data transformation.
Exercise 2. Create a u-v scatterplot.
Solution.
#| fig.alt: > #| A u-v scatterplot. plot(adp, which = "uv")
Exercise 3. Write the steps to plot the east
, north
, and up
velocity
without the speed of the ship for a file named
COR2019002_20190818T064815_007_000000.ENS
.
Solution.
#| fig.alt: > #| Plot velocities after removing ship motion. library(oce) adcp <- read.adp("COR2019002_20190818T064815_007_000000.ENS") enu <- toEnu(adcp) removeShipSpeed <- subtractBottomVelocity(enu) plot(removeShipSpeed, which = 1:3)
Exercise 4. Plot the adp data for the first half of the sampling interval.
Solution. (results not shown)
#| fig.alt: > #| Plot just the first part of an adp record. plot(subset(adp, time < median(adp[["time"]])))
Exercise 5. Plot a time series of mid-depth and depth-averaged eastward velocities.
Solution.
#| fig.alt: > #| Plot mid-depth and depth-averaged eastward velocities. time <- adp[["time"]] v <- adp[["v"]] # The second index is for bin number, the third for beam number midIndex <- dim(v)[2] / 2 eastMid <- v[, midIndex, 1] # third index is beam distance <- adp[["distance"]][midIndex] oce.plot.ts(time, eastMid, ylab = "Eastward velocity [m/s]") # Depth mean; note that na.rm, is passed by apply() to mean() eastMean <- apply(v[, , 1], 1, mean, na.rm = TRUE) lines(time, eastMean, col = 2)
Here, oce.plot.ts()
is used instead of the basic R function plot()
, because
it provides a margin note on the detailed time interval.
Exercise 6. Perform an eigen-analysis of eastward and northward velocity.
Solution.
u <- adp[["v"]][, , 1] v <- adp[["v"]][, , 2] ok <- is.finite(u) & is.finite(v) # remove NA values u <- u[ok] v <- v[ok] eigen(cov(data.frame(u, v)))
Note that this action is equivalent to principal component analysis, and it might be wiser to use
pr <- prcomp(data.frame(u, v))
to do the analysis, which yields the same principal axes
pr
but has the advantage of having associated methods, including for plotting.
Exercise 7. Plot a time series of pressure variations, to reveal tidal height.
Solution.
#| fig.alt: > #| Plot pressure time series. time <- adp[["time"]] pressure <- adp[["pressure"]] oce.plot.ts(time, pressure)
As a matter of interest, a tidal analysis may be done with
m <- tidem(as.sealevel(pressure, time))
where the list of undetermined tidal constituents is long, because this tidal
record is so short (see ?tidem
). The fitted constituents are as follows
summary(m)
(Note that it fitted for M2, but not S2, because the Rayleigh criterion prevents inferring both, and the conventional tidal analysis favours M2 over S2 if the time series is too short to fit for more than one semi-diurnal constituent.)
A useful step might be to draw a tidal fit along with the data. It makes sense
to construct a finer time grid, to get a smooth curve, as in the following (see
?predict.tidem
).
#| fig.alt: > #| Plot tidal prediction. oce.plot.ts(time, pressure, type = "p", col = "blue") timePredict <- seq(min(time), max(time), length.out = 200) pressurePredict <- predict(m, timePredict) lines(timePredict, pressurePredict, col = "red")
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.