lag_column | R Documentation |
This function generates a lagged version of a given column based on a date variable, with the
ability to specify a range of lags. It also allows for the optional removal of NA values.
lag_column(column, date, lag, max_lag = lag, drop_na = TRUE)
column
A numeric vector or column to be lagged.
date
A vector representing dates corresponding to the values in column.
lag
An integer specifying the minimum lag (in days, hours, etc.) to apply to column.
max_lag
An integer specifying the maximum lag (in days, hours, etc.) to apply to column. Defaults to lag.
drop_na
A logical value indicating whether to drop NA values. Defaults to TRUE.
A vector of the same length as column, containing the lagged values.
If no matching dates are found within the lag window, NA is returned for that position.
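The lag window described above can be illustrated with a small base R sketch. The helper name lag_window_sketch is hypothetical and this is not the package's implementation. It assumes that, for each position, the admissible source dates are those in [date - max_lag, date - lag], that the most recent observation inside that window is used, and that drop_na = TRUE removes NA observations before matching. All three are assumptions made for illustration.

```r
# Hypothetical helper (not part of any package): one plausible reading of
# the lag window, written as a plain loop for clarity rather than speed.
lag_window_sketch <- function(column, date, lag, max_lag = lag, drop_na = TRUE) {
  stopifnot(length(column) == length(date), lag >= 0, max_lag >= lag)

  # Observations that are allowed to serve as lagged values
  keep <- if (drop_na) !is.na(column) else rep(TRUE, length(column))
  cand_values <- column[keep]
  cand_dates <- date[keep]

  result <- rep(NA_real_, length(column))
  for (i in seq_along(column)) {
    # Admissible source dates: [date[i] - max_lag, date[i] - lag]
    in_window <- cand_dates >= date[i] - max_lag & cand_dates <= date[i] - lag
    if (any(in_window)) {
      # Assumption: use the most recent observation inside the window
      idx <- which(in_window)
      latest <- idx[which.max(as.numeric(cand_dates[idx]))]
      result[i] <- cand_values[latest]
    }
  }
  result
}

dates <- as.Date("2023-01-01") + 0:4
values <- c(10, 20, NA, 40, 50)

lag_window_sketch(values, dates, lag = 1, max_lag = 2)
# NA 10 20 20 40

lag_window_sketch(values, dates, lag = 1, max_lag = 2, drop_na = FALSE)
# NA 10 20 NA 40
```

Under these assumptions, the first position has no earlier date in its window and so receives NA. With drop_na = TRUE, the fourth position skips the missing observation and falls back to the older value that is still inside the window.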
# Basic example with a vector
dates <- as.Date("2023-01-01") + 0:9
values <- rnorm(10)
lagged_values <- lag_column(values, dates, lag = 1, max_lag = 3)
# Example using a tibble and dplyr::group_by
data <- tibble::tibble(
  permno = rep(1:2, each = 10),
  date = rep(seq.Date(as.Date("2023-01-01"), by = "month", length.out = 10), 2),
  size = runif(20, 100, 200),
  bm = runif(20, 0.5, 1.5)
)

# Lag size and bm by 3 to 6 months within each permno.
# months() on a number relies on lubridate's period method being available.
data |>
  dplyr::group_by(permno) |>
  dplyr::mutate(
    dplyr::across(
      c(size, bm),
      \(x) lag_column(x, date, months(3), months(6), drop_na = TRUE)
    )
  ) |>
  dplyr::ungroup()