match_ts_nearest_by_key: Match time series by key and time
In edwardlavender/Tools4ETS: Tools for Ecological Time Series

Description

For two dataframes, `d1` and `d2`, this function finds, for each observation in `d1`, the position (row) in `d2` that has the same key (e.g., factor level) and is nearest in time. In other words, it implements nearest neighbour matching in time while accounting for observations from different keys.

Usage

match_ts_nearest_by_key(d1, d2, key_col, time_col)

Arguments

`d1`	A dataframe which includes a column that defines factor levels and a column that defines time stamps. The names of these columns need to match those in `d2`.
`d2`	A dataframe which includes a column that defines factor levels and a column that defines time stamps. The names of these columns need to match those in `d1`.
`key_col`	A character string that defines the name of the column in `d1` and `d2` that distinguishes factor levels.
`time_col`	A character string that defines the name of the column in `d1` and `d2` that contains time stamps.

Details

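If multiple observations in `d2` match an observation in `d1` equally well (i.e., they share its key and are equally near in time), only the first is returned.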
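
The matching logic described above can be illustrated with a short base-R sketch. This is not the package's implementation: `nearest_by_key_sketch()` is a hypothetical helper written for illustration only. It assumes that "the first" match means the earliest row in `d2` among ties, and that a key absent from `d2` should yield `NA`; neither assumption is confirmed by this documentation.

```r
# Hypothetical helper (NOT part of Tools4ETS): a sketch of nearest-in-time
# matching by key, written in base R for illustration only.
nearest_by_key_sketch <- function(d1, d2, key_col, time_col) {
  vapply(seq_len(nrow(d1)), function(i) {
    # Rows in d2 that share the key of row i in d1
    candidates <- which(d2[[key_col]] == d1[[key_col]][i])
    # Assumption of this sketch: return NA if the key does not occur in d2
    if (length(candidates) == 0L) return(NA_integer_)
    # Absolute time gaps (in seconds) between row i and each candidate row
    gaps <- abs(as.numeric(difftime(d2[[time_col]][candidates],
                                    d1[[time_col]][i],
                                    units = "secs")))
    # which.min() returns the first minimum, so ties resolve to the
    # earliest candidate row in d2
    candidates[which.min(gaps)]
  }, integer(1))
}

# Same data as Example (1) below, with an explicit time zone
d1 <- data.frame(t = as.POSIXct(c("2016-01-01 12:00:00",
                                  "2016-01-01 15:00:00",
                                  "2016-01-01 17:00:00",
                                  "2016-01-01 16:00:00"), tz = "UTC"),
                 key = c(1, 1, 2, 2))
d2 <- data.frame(t = as.POSIXct(c("2016-01-01 13:00:00",
                                  "2016-01-01 14:00:00",
                                  "2016-01-01 12:00:00",
                                  "2016-01-01 15:00:00"), tz = "UTC"),
                 key = c(1, 1, 2, 2))
res <- nearest_by_key_sketch(d1, d2, key_col = "key", time_col = "t")
res  # 1 2 4 4

# A tie: rows 1 and 2 of d2_tie are both one hour from the observation in
# d1_tie, so the sketch returns the first of them (row 1)
d1_tie <- data.frame(t = as.POSIXct("2016-01-01 12:00:00", tz = "UTC"), key = 1)
d2_tie <- data.frame(t = as.POSIXct(c("2016-01-01 11:00:00",
                                      "2016-01-01 13:00:00"), tz = "UTC"),
                     key = c(1, 1))
tie <- nearest_by_key_sketch(d1_tie, d2_tie, key_col = "key", time_col = "t")
tie  # 1
```

The sketch loops over the rows of `d1`, which is easy to follow but may be slow for large dataframes. It also places no limit on how far apart in time a match may be, so it can be worth checking the time gaps between matched observations before combining values.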

Value

The function returns a vector of positions (row indices) in `d2`, with one element for each row in `d1`. Each position identifies the observation in `d2` that belongs to the same factor level (e.g., individual) as the corresponding row in `d1` and is nearest to it in time.

Author(s)

Edward Lavender

Examples

#### Example (1)
# Imagine we have observations from two keys (e.g., individuals) in two dataframes
# We want to add observations from the second dataframe into the first dataframe.
# Accounting for keys, the positions in d2 that are nearest in time to each row in d1 are
# ... 1, 2, 4, 4
d1 <- data.frame(t = as.POSIXct(c("2016-01-01 12:00:00",
                                  "2016-01-01 15:00:00",
                                  "2016-01-01 17:00:00",
                                  "2016-01-01 16:00:00")),
                 key = c(1, 1, 2, 2))
d2 <- data.frame(t = as.POSIXct(c("2016-01-01 13:00:00",
                                  "2016-01-01 14:00:00",
                                  "2016-01-01 12:00:00",
                                  "2016-01-01 15:00:00")),
                 key = c(1, 1, 2, 2))
match_ts_nearest_by_key(d1, d2, key_col = "key", time_col = "t")

#### Example (2)
# Define dataframes
d1 <- data.frame(t = as.POSIXct(c("2016-01-01 18:00:00",
                                  "2016-01-01 17:00:00",
                                  "2016-01-01 13:00:00",
                                  "2016-01-01 14:00:00",
                                  "2016-01-01 17:00:00",
                                  "2016-01-01 21:00:00")),
                 key = c(2, 2, 2, 1, 1, 3))
d2 <- data.frame(t = as.POSIXct(c("2016-01-01 21:00:00",
                                  "2016-01-01 14:00:00",
                                  "2016-01-01 18:00:00",
                                  "2016-01-01 17:00:00",
                                  "2016-01-01 22:00:00",
                                  "2016-01-01 20:00:00",
                                  "2016-01-01 13:00:00",
                                  "2016-01-01 17:00:00",
                                  "2016-01-01 16:00:00")),
                 key = c(2, 2, 2, 2, 2, 3, 3, 1, 1),
                 vals = stats::runif(9, 0, 1))
# Add the positions in d2 to the dataframe
d1$position_in_d2 <- match_ts_nearest_by_key(d1, d2, key_col = "key", time_col = "t")
# Show that the index adds the correct key
d1$key_in_d2 <- d2$key[d1$position_in_d2]
# Show that the index adds the correct time stamp for that key
d1$t_in_d2 <- d2$t[d1$position_in_d2]
# We can now safely add values from d2 to d1:
d1$val_in_d2 <- d2$vals[d1$position_in_d2]
# Examine d1 and d2:
d1; d2

edwardlavender/Tools4ETS documentation built on Nov. 29, 2022, 7:41 a.m.