match_ts_nearest: Find the position in one vector that is nearest in time to a...

View source: R/match_ts.R

match_ts_nearest R Documentation

Find the position in one vector that is nearest in time to a value in another

Description

This function is like match, but it matches on nearness in time rather than on equality: for each time in one sequence (times), it finds the position of the nearest time in another sequence (lookup). This is useful if, for example, you want to add to an existing dataframe the observations, held in another dataframe, that are nearest in time to the observations in the first dataframe (i.e., nearest neighbour interpolation). The function uses data.table for fast matching, even with very large vectors.

Usage

match_ts_nearest(times, lookup)

Arguments

times

A vector of time stamps for which you want to identify the position of the nearest time stamp in another vector (lookup).

lookup

A vector of time stamps in which to search. For each time in times, the function determines the position of the nearest time stamp in this vector.

Details

If more than one time stamp in lookup is equally near to a given time in times, only the first match is returned.
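The matching rule can be illustrated with a minimal base R sketch. This is not the package's implementation (which uses data.table); nearest_position() is a hypothetical helper written only for illustration. It assumes no missing values and resolves ties with which.min(), which returns the first minimum. Whether match_ts_nearest() resolves ties in exactly the same way is an assumption.

```r
# Hypothetical helper (not part of Tools4ETS): brute-force nearest-position lookup
nearest_position <- function(times, lookup) {
  # Work in numeric seconds so that differences are plain numbers
  times_num  <- as.numeric(times)
  lookup_num <- as.numeric(lookup)
  # For each time, take the position of the smallest absolute difference;
  # which.min() returns the first position when several are tied
  vapply(times_num,
         function(x) which.min(abs(lookup_num - x)),
         integer(1))
}

# The time stamps from Example (2), with an explicit time zone
t1 <- as.POSIXct(c("2016-01-01 00:00:01", "2016-01-01 00:10:00",
                   "2016-01-01 00:20:01", "2016-01-01 00:30:01"), tz = "UTC")
t2 <- as.POSIXct(c("2016-01-01 00:00:00", "2016-01-01 00:11:00",
                   "2016-01-01 00:50:01", "2016-01-01 00:22:01"), tz = "UTC")
nearest_position(t1, t2)

# A tie: 00:05:00 is equally near to both lookup times
tie_time   <- as.POSIXct("2016-01-01 00:05:00", tz = "UTC")
tie_lookup <- as.POSIXct(c("2016-01-01 00:00:00", "2016-01-01 00:10:00"), tz = "UTC")
nearest_position(tie_time, tie_lookup)
```

The brute-force approach compares every time with every lookup value, so it is only suitable for checking results on small vectors.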

Value

For a sequence of times (times), the function returns a vector of the positions of the nearest times in another sequence (lookup).
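Because the return value is a vector of positions, it can be used to index lookup directly and to check how far away each nearest time actually is. The sketch below assumes the positions stated as correct in Example (2), hard-coded here so that the code runs without the package. The five-minute threshold is a hypothetical choice for illustration.

```r
# The time stamps from Example (2), with an explicit time zone
t1 <- as.POSIXct(c("2016-01-01 00:00:01", "2016-01-01 00:10:00",
                   "2016-01-01 00:20:01", "2016-01-01 00:30:01"), tz = "UTC")
t2 <- as.POSIXct(c("2016-01-01 00:00:00", "2016-01-01 00:11:00",
                   "2016-01-01 00:50:01", "2016-01-01 00:22:01"), tz = "UTC")

# Positions in t2 of the nearest time to each element of t1, as documented in
# Example (2); in practice these would come from match_ts_nearest(t1, t2)
pos <- c(1, 2, 4, 4)

# Index the lookup vector with the positions and measure the gap in seconds
nearest_times <- t2[pos]
gap_secs <- abs(as.numeric(difftime(t1, nearest_times, units = "secs")))
data.frame(time = t1, nearest = nearest_times, gap_secs = gap_secs)

# Optionally discard matches that are too far away in time
# (hypothetical threshold: 300 seconds)
pos_checked <- ifelse(gap_secs > 300, NA, pos)
pos_checked
```

A check of this kind matters because the nearest time is always returned, however distant it is.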

Author(s)

Edward Lavender

See Also

match_ts_nearest_by_key is an extension of this function that accounts for different factor levels when these need to be included in the matching process. To use match_ts_nearest or match_ts_nearest_by_key to add observations from one dataframe to another, see pair_ts.

Examples


#### Define example data (1)
# Define dataframe to which we want to add information
d1 <- data.frame(t = seq.POSIXt(as.POSIXct("2016-01-01"), as.POSIXct("2016-01-02"), by = "hours"))
# Define dataframe in which information is contained
d2 <- data.frame(t = seq.POSIXt(as.POSIXct("2016-01-01"), as.POSIXct("2016-01-02"), by = "mins"))
d2$vals <- runif(nrow(d2), 0, 50)

#### Example (1): Given a sequence of times, identify the positions of the nearest
# ... corresponding observations in another sequence:
# Use match_ts_nearest() to add information to the first dataframe based on the second dataframe
d1$position_in_d2 <- match_ts_nearest(times = d1$t, lookup = d2$t)
d1$vals <- d2$vals[d1$position_in_d2]
# Examine
head(cbind(d1, d2[d1$position_in_d2, ]))

#### Example (2): Relative to the times in 'times', the nearest times in lookup may be
# ... before/after a given observation:
t1 <- as.POSIXct(c("2016-01-01 00:00:01", "2016-01-01 00:10:00",
                   "2016-01-01 00:20:01", "2016-01-01 00:30:01"))
t2 <- as.POSIXct(c("2016-01-01 00:00:00", "2016-01-01 00:11:00",
                    "2016-01-01 00:50:01", "2016-01-01 00:22:01"))
# The correct positions are as follows:
# The first observation in t1 is nearest to t2[1] (which is before it)
# The second observation in t1 is nearest to t2[2] (which is after it)
# The third observation in t1 is nearest to t2[4] (which is after it)
# The fourth observation in t1 is nearest to t2[4] (which is before it)
# This is what is returned by match_ts_nearest():
match_ts_nearest(t1, t2)

#### Example (3): Input observations ('times' or 'lookup') do not need to be ordered by time:
## Example with 'times' unordered:
t1_unordered <- t1[c(2, 1, 4, 3)]
# Manual examination of nearest observations
t1_unordered; t2
# First observation in t1_unordered is nearest to t2[2]
# Second observation in t1_unordered is nearest to t2[1]
# Third observation in t1_unordered is nearest to t2[4]
# Fourth observation in t1_unordered is nearest to t2[4]
# Implement match_ts_nearest():
match_ts_nearest(t1_unordered, t2)
## Example with 'lookup' unordered
t2_unordered <- t2[c(3, 1, 2, 4)]
t1; t2_unordered
# Correct order is 2, 3, 4, 4
match_ts_nearest(t1, t2_unordered)
## Example with both 'times' and 'lookup' unordered
t1_unordered; t2_unordered
# correct output is: 3, 2, 4, 4
match_ts_nearest(t1_unordered, t2_unordered)


edwardlavender/Tools4ETS documentation built on Nov. 29, 2022, 7:41 a.m.