read_data: Load seismic data from an archive

Description

The function loads seismic data from a data directory structure (see aux_organisecubefiles) based on the event start time, duration, component and station ID. The data must be organised in a consistent directory structure (see Details), and the data directory must contain mseed or SAC files. The file format is either identified automatically or set explicitly with the parameter format.

Usage

read_data(
  start,
  duration,
  station,
  component = "BHZ",
  format,
  dir,
  pattern = "eseis",
  simplify = TRUE,
  interpolate = FALSE,
  eseis = TRUE,
  try = TRUE,
  ...
)

Arguments

start

POSIXct value, start time of the data to import. If a text string is supplied instead of a POSIXct object, the function will try to convert that text string, assuming UTC as the time zone.
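
To avoid relying on the automatic conversion, the start time can be created explicitly in UTC before it is passed on. The following is a minimal base R sketch. The variable names are illustrative and the time stamp is the one used in the Examples section.

```r
## text input, parsed explicitly as UTC
start_txt <- "2017-04-09 01:20:00"
start <- as.POSIXct(x = start_txt, tz = "UTC")

## check class and time zone attribute
class(start)
attr(start, "tzone")

## the clock time is kept as typed, independent of the local time zone
format(start, format = "%H:%M:%S")
```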

duration

Numeric value, duration of the data to import, in seconds.

station

Character value, seismic station ID, which must correspond to the ID in the file name of the data directory structure (cf. aux_organisecubefiles).

component

Character value, seismic component, which must correspond to the component name in the file name of the data directory structure (cf. aux_organisecubefiles). Default is "BHZ" (vertical component of a sac file).

format

Character value, seismic data format. One of "sac" and "mseed". If omitted, the function will try to identify the right format automatically.

dir

Character value, path to the seismic data directory. See details for further info on data structure.

pattern

Character value, either keyword or pattern string with wildcards, describing the data organisation. Supported keywords are "eseis" and "seiscomp". See details for keyword definition and format of pattern strings. Default is "eseis".

simplify

Logical value, option to simplify output when possible. If data from only one station is loaded, the returned list object will have one level less. Default is TRUE.

interpolate

Logical value, option to interpolate possible gaps in the resulting data stream. If enabled, NA values will be identified and linearly interpolated using the function signal_fill. Default is FALSE, i.e. NA gaps will remain in the imported data set.
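
What linear interpolation of NA gaps means can be illustrated with base R alone. This is a sketch using approx() from the stats package on a made-up signal. It is not the implementation of signal_fill, whose internals are not described here.

```r
## synthetic signal with a gap of two NA samples (illustrative values)
s <- c(1, 2, NA, NA, 5, 6)

## sample indices and positions of valid samples
i <- seq_along(s)
ok <- !is.na(s)

## linear interpolation across the gap, evaluated at all sample indices
s_filled <- approx(x = i[ok], y = s[ok], xout = i)$y

s_filled
```

With the default settings of approx(), NA values at the very start or end of the signal are not extrapolated and remain NA.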

eseis

Logical value, option to read data to an eseis object (recommended, see documentation of aux_initiateeseis). Default is TRUE.

try

Logical value, option to run the function in try-mode, i.e., to let it return NA in case an error occurs during data import. Default is TRUE (see Usage).
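
The idea of a try-mode can be sketched with tryCatch() in base R. Both functions below are hypothetical stand-ins written for this illustration. They are not part of eseis and do not show how read_data handles errors internally.

```r
## hypothetical importer that fails, standing in for a failed file read
import_fails <- function() stop("file not found")

## hypothetical wrapper: return NA instead of stopping on an error
import_safely <- function(f) {
  tryCatch(expr = f(), error = function(e) NA)
}

x <- import_safely(f = import_fails)

## a failed import can then be detected and skipped, e.g. inside a loop
is.na(x)
```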

...

Further arguments to describe the data structure, only needed for pattern type "seiscomp". These arguments can be one or more of the following: "network", "type", "location". If omitted, the function will identify all files in the SeisComP data archive that fulfill the criteria. If the archive contains files other than data files (type = "D") or files from another network, these may cause the function to crash.

Details

Data organisation must follow a consistent scheme. The default scheme, eseis (Dietze, 2018 ESurf), requires hourly files organised in one directory per Julian Day, nested in one directory per calendar year. The file name must be entirely composed of station ID, 2-digit year, Julian Day, hour, minute, second and channel name. Each item must be separated by a full stop, e.g. "2013/203/IGB01.13.203.16.00.00.BHZ" for a file from 2013, Julian Day 203, from station IGB01, covering one hour from "16:00:00 UTC", and containing the BHZ component. Each Julian Day directory can contain files from different components and stations. The respective pattern string to describe that file organisation is "%Y/%j/%STA.%y.%j.%H.%M.%S.%CMP". The percent sign indicates a wild card, where %Y is the 4-digit year, %j the 3-digit Julian Day, %STA the station ID, %y the 2-digit year, %H the 2-digit hour, %M the 2-digit minute, %S the 2-digit second and %CMP the component ID. The files can have a further file extension which does not need to be explicitly defined in the pattern string. The slashes in the above pattern string define subdirectories.
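
The eseis scheme can be made concrete by listing the hourly files a given time window would touch. The following base R sketch assumes station IGB01 and component BHZ as in the example file name above. The start time and duration are illustrative, and the code only shows the naming scheme, not how read_data locates files internally.

```r
## start time and duration (s) of the requested window (illustrative)
start <- as.POSIXct(x = "2013-07-22 16:30:00", tz = "UTC")
duration <- 3600

## start times of all hourly files overlapping the window
t_first <- as.POSIXct(trunc(start, units = "hours"))
t_files <- seq(from = t_first, to = start + duration, by = 3600)

## fill "%Y/%j/%STA.%y.%j.%H.%M.%S.%CMP" for each hourly file
files <- paste0(format(t_files, format = "%Y/%j/", tz = "UTC"),
                "IGB01", ".",
                format(t_files, format = "%y.%j.%H.%M.%S", tz = "UTC"),
                ".", "BHZ")

files
```

For this window, the first entry reproduces the example file name given above, since 22 July 2013 is Julian Day 203. If a window ends exactly on the full hour, the sequence also lists the file starting at that hour.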

An alternative organisation scheme is the one used by SeisComP, indicated by the keyword "seiscomp" or the pattern string "%Y/%NET/%STA/%CMP/%NET.%STA.%LOC.%CMP.%TYP.%Y.%j". The wild card "NET" means the network ID, "LOC" the location abbreviation and "TYP" the data type. The other wild cards are as defined above. Hence, the SeisComP scheme consists of directories of the calendar year, the network to which the data belongs, the station it has been recorded by, and the component it belongs to. The files in that latter directory must be daily files.
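
A SeisComP-style path can be assembled from the pattern string in the same spirit. In this base R sketch the network code "XX", the empty location code and the date are hypothetical values chosen for illustration. The substitution approach is an assumption of this sketch, not a description of the internals of read_data.

```r
## pattern string as given above
pattern <- "%Y/%NET/%STA/%CMP/%NET.%STA.%LOC.%CMP.%TYP.%Y.%j"

## day of interest and hypothetical metadata values
day <- as.POSIXct(x = "2017-04-09", tz = "UTC")
keys <- c("%NET" = "XX", "%STA" = "RUEG1", "%LOC" = "",
          "%CMP" = "BHZ", "%TYP" = "D")

## replace the custom wild cards first, so that %STA is not
## misread as the date-time code %S followed by "TA"
path <- pattern
for (k in names(keys)) {
  path <- gsub(pattern = k, replacement = keys[[k]], x = path, fixed = TRUE)
}

## let format() fill the remaining date codes %Y and %j
path <- format(day, format = path, tz = "UTC")

path
```

An empty location code results in two consecutive full stops in the file name.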

Value

A list object containing either a set of eseis objects or a data set with the time vector ($time) and a list of seismic stations ($station_ID) with their seismic signals as data frame ($signal). If simplify = TRUE (the default option) and only one seismic station is provided, the output object contains either just one eseis object or the vectors for $time and $signal.
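
The effect of simplification, i.e. one list level less for a single station, can be illustrated with a mock object. The list below is not genuine read_data output. Its element names and values are made up, and the if-clause is only a sketch of the concept.

```r
## mock result for a single station (illustrative names and values)
result <- list(RUEG1 = list(signal = c(0.1, -0.2, 0.3)))

## hypothetical simplification: drop the outer level if only one
## station is present
if (length(result) == 1) {
  result_simple <- result[[1]]
} else {
  result_simple <- result
}

## the signal is now accessible without the station level
result_simple$signal
```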

Author(s)

Michael Dietze

Examples


## set seismic data directory
dir_data <- paste0(system.file("extdata", package="eseis"), "/")

## load the z component data from a station
data <- read_data(start = as.POSIXct(x = "2017-04-09 01:20:00",
                                     tz = "UTC"),
                  duration = 120,
                  station = "RUEG1",
                  component = "BHZ",
                  dir = dir_data)

## plot signal
plot_signal(data = data)

## load data from two stations
data <- read_data(start = as.POSIXct(x = "2017-04-09 01:20:00", 
                                     tz = "UTC"), 
                  duration = 120,
                  station = c("RUEG1", "RUEG2"),
                  component = "BHZ",
                  dir = dir_data)

## plot both signals
par(mfcol = c(2, 1))
lapply(X = data, FUN = plot_signal)
                     

coffeemuggler/eseis documentation built on Aug. 19, 2023, 9:57 p.m.