pair_ts: Pair time series

View source: R/match_ts.R

pair_ts R Documentation

Pair time series

Description

This function adds observations from one time series to another time series using a matching process (e.g., nearest neighbour interpolation). This is useful when you have a main dataframe to which you need to add observations (e.g., those occurring closest in time) from another dataframe.
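The idea of nearest-in-time matching can be sketched in base R. This is only an illustration of the concept, not the package's implementation: `nearest_index()` is a hypothetical helper introduced here, and the small dataframes are made up for the example.

```r
# Illustrative sketch only: nearest-in-time matching in base R.
# nearest_index() is a hypothetical helper, not a Tools4ETS function.
nearest_index <- function(t1, t2) {
  # For each time in t1, return the position of the closest time in t2
  vapply(seq_along(t1), function(i) {
    which.min(abs(as.numeric(difftime(t2, t1[i], units = "secs"))))
  }, integer(1))
}

# Main dataframe (times only) and a second dataframe with observations
d1 <- data.frame(t = as.POSIXct(c("2016-01-01 00:00:00",
                                  "2016-01-01 01:00:00"), tz = "UTC"))
d2 <- data.frame(t = as.POSIXct(c("2016-01-01 00:10:00",
                                  "2016-01-01 00:50:00",
                                  "2016-01-01 02:00:00"), tz = "UTC"),
                 vals = c(10, 20, 30))

# Add the nearest observation from d2 to each row of d1
idx <- nearest_index(d1$t, d2$t)
d1$vals <- d2$vals[idx]
```

Here, the first row of d1 is paired with the observation made 10 minutes later, and the second row with the observation made 10 minutes earlier.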

Usage

pair_ts(
  d1,
  d2,
  time_col,
  key_col = NULL,
  val_col,
  method = "match_ts_nearest",
  min_gap = NULL,
  max_gap = min_gap,
  units = "mins",
  control_beyond_gap = NULL
)

Arguments

d1

A dataframe that contains, at a minimum, a vector of time stamps, to which observations need to be added from d2.

d2

A dataframe that contains, at a minimum, a vector of time stamps and associated observations, to be added to d1.

time_col

A character that defines the name of the column that contains time stamps in d1 and d2.

key_col

(optional) A character that defines the name of the column that contains keys in d1 and d2. This is required for method = "match_ts_nearest_by_key" (see below).

val_col

A character that defines the name of the column that contains observations in d2.

method

A character that defines the matching method. The options currently implemented are "match_ts_nearest", which implements match_ts_nearest, and "match_ts_nearest_by_key", which implements match_ts_nearest_by_key.
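As a rough illustration of how matching by key differs from plain nearest-in-time matching, the sketch below restricts the candidate observations in d2 to those that share a key with each row of d1. It assumes that this is what key-based matching means; `nearest_by_key()` is a hypothetical helper written for this example and is not the package's match_ts_nearest_by_key.

```r
# Illustrative sketch only: nearest-in-time matching within keys.
# nearest_by_key() is a hypothetical helper, not a Tools4ETS function.
nearest_by_key <- function(d1, d2, time_col, key_col, val_col) {
  out <- rep(NA_real_, nrow(d1))
  for (i in seq_len(nrow(d1))) {
    # Candidate rows in d2 that share the key of row i in d1
    cand <- which(d2[[key_col]] == d1[[key_col]][i])
    if (length(cand) > 0) {
      gaps <- abs(as.numeric(difftime(d2[[time_col]][cand],
                                      d1[[time_col]][i],
                                      units = "secs")))
      out[i] <- d2[[val_col]][cand[which.min(gaps)]]
    }
  }
  out
}

d1 <- data.frame(t = as.POSIXct(c("2016-01-01 14:00:00",
                                  "2016-01-01 17:00:00"), tz = "UTC"),
                 key = c(1, 2))
d2 <- data.frame(t = as.POSIXct(c("2016-01-01 14:05:00",
                                  "2016-01-01 16:00:00",
                                  "2016-01-01 13:00:00",
                                  "2016-01-01 18:00:00"), tz = "UTC"),
                 key = c(2, 1, 1, 2),
                 vals = c(1, 2, 3, 4))
vals_by_key <- nearest_by_key(d1, d2, "t", "key", "vals")
```

In this made-up data, the observation at 14:05 is the closest in time to the first row of d1 but belongs to a different key, so the 13:00 observation with the same key is used instead.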

min_gap

(optional) A number that defines the minimum time gap (in user-defined units; see units, below) between times in d1 and the times of the observations that are added to d1 from d2. This is useful if, for instance, some of the nearest observations in d2 occurred long before the corresponding times in d1. If provided, the function counts the number of observations that do not meet this requirement and, if requested via control_beyond_gap, removes them from the returned dataframe or sets them to NA (see below).

max_gap

As for min_gap (above), but defines the maximum time gap. By default, this takes the value of min_gap.

units

A character that defines the units of the inputted min_gap or max_gap. This is passed to difftime.
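For orientation, the short base R example below shows how the units argument of difftime changes the number that describes the same time gap. The two time stamps are made up for illustration.

```r
# The same 30-minute gap expressed in two different units via difftime()
t_d1 <- as.POSIXct("2016-01-01 12:00:00", tz = "UTC")
t_d2 <- as.POSIXct("2016-01-01 12:30:00", tz = "UTC")

gap_mins  <- as.numeric(difftime(t_d2, t_d1, units = "mins"))   # gap in minutes
gap_hours <- as.numeric(difftime(t_d2, t_d1, units = "hours"))  # gap in hours
```

A gap threshold of 1 therefore means one minute with units = "mins" but one hour with units = "hours".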

control_beyond_gap

(optional) A character that defines how to treat rows in d1 whose paired observations from d2 fall outside the limits set by min_gap or max_gap: "NA" sets them to NA and "remove" removes those rows from the returned dataframe.

Value

The function returns the dataframe d1, as inputted, with an added column (named by val_col) that contains the values added from d2. If min_gap and max_gap are specified, any observations in d1 without an observation in d2 inside this time window are counted and, if requested via control_beyond_gap, set to NA or removed from the returned dataframe.
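The two ways of handling observations outside the time window can be sketched in base R. This is a simplified illustration under stated assumptions, not the function's internals: the `paired` dataframe, its `t_d2` column and the rule "absolute gap greater than 1 minute" are all hypothetical, and the exact way pair_ts compares gaps against min_gap and max_gap is not reproduced here.

```r
# Hypothetical paired result: times in d1 (t), the times of the matched
# observations in d2 (t_d2) and the paired values (vals)
paired <- data.frame(
  t    = as.POSIXct(c("2016-01-01 13:00:00",
                      "2016-01-01 14:00:00",
                      "2016-01-01 17:00:00"), tz = "UTC"),
  t_d2 = as.POSIXct(c("2016-01-01 13:00:00",
                      "2016-01-01 16:00:00",
                      "2016-01-01 17:00:00"), tz = "UTC"),
  vals = c(0.2, 0.5, 0.9)
)

# Assumed rule for this sketch: flag pairs more than 1 minute apart
gap    <- abs(as.numeric(difftime(paired$t_d2, paired$t, units = "mins")))
beyond <- gap > 1
n_beyond <- sum(beyond)  # number of observations outside the window

# Analogue of control_beyond_gap = "NA": keep the rows, blank the values
out_na <- paired
out_na$vals[beyond] <- NA

# Analogue of control_beyond_gap = "remove": drop the rows
out_rm <- paired[!beyond, , drop = FALSE]
```

Setting values to NA keeps the returned dataframe aligned row-for-row with the input, whereas removing rows gives a dataframe that contains only the pairs inside the window.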

Author(s)

Edward Lavender

Examples

#### Example (1) Pair time series using method = "match_ts_nearest"
# Define dataframe to which we want to add information
d1 <- data.frame(t = seq.POSIXt(as.POSIXct("2016-01-01"), as.POSIXct("2016-01-02"), by = "hours"))
# Define dataframe in which information is contained
d2 <- data.frame(t = seq.POSIXt(as.POSIXct("2016-01-01"), as.POSIXct("2016-01-02"), by = "mins"))
d2$vals <- runif(nrow(d2), 0, 50)
pair_ts(d1, d2, time_col = "t", val_col = "vals", method = "match_ts_nearest")

#### Example (2) Pair time series using method = "match_ts_nearest_by_key"
# Define dataframes
d1 <- data.frame(t = as.POSIXct(c("2016-01-01 18:00:00",
                                  "2016-01-01 17:00:00",
                                  "2016-01-01 13:00:00",
                                  "2016-01-01 14:00:00",
                                  "2016-01-01 17:00:00",
                                  "2016-01-01 21:00:00")),
                 key = c(2, 2, 2, 1, 1, 3))
d2 <- data.frame(t = as.POSIXct(c("2016-01-01 21:00:00",
                                  "2016-01-01 14:00:00",
                                  "2016-01-01 18:00:00",
                                  "2016-01-01 17:00:00",
                                  "2016-01-01 22:00:00",
                                  "2016-01-01 20:00:00",
                                  "2016-01-01 13:00:00",
                                  "2016-01-01 17:00:00",
                                  "2016-01-01 16:00:00")),
                 key = c(2, 2, 2, 2, 2, 3, 3, 1, 1),
                 vals = stats::runif(9, 0, 1))
pair_ts(d1, d2,
        time_col = "t", key_col = "key", val_col = "vals",
        method = "match_ts_nearest_by_key")

#### Example (3) Remove observations that exceed a min/max gap
pair_ts(d1, d2,
        time_col = "t", key_col = "key", val_col = "vals",
        method = "match_ts_nearest_by_key",
        min_gap = 0,
        max_gap = 1,
        control_beyond_gap = "remove")


edwardlavender/Tools4ETS documentation built on Nov. 29, 2022, 7:41 a.m.