lagged: Create Lagged Variables

lagged R Documentation

Create Lagged Variables

Description

This function computes lagged values of variables by a specified number of observations. By default, the function returns lag-1 values of the vector, matrix, or data frame specified in the first argument.

Usage

lagged(..., data = NULL, id = NULL, obs = NULL, day = NULL, lag = 1, time = NULL,
       units = c("secs", "mins", "hours", "days", "weeks"), append = TRUE,
       name = ".lag", name.td = ".td", as.na = NULL, check = TRUE)

Arguments

...

a vector for computing lagged values of a single variable, or a matrix or data frame for computing lagged values of more than one variable. Note that the subject ID variable (id), observation number variable (obs), day number variable (day), and the date and time variable (time) are excluded from ... when these arguments are specified using the names of the variables. Alternatively, an expression indicating the variable names in data. Note that the operators ., +, -, ~, :, ::, and ! can also be used to select variables, see 'Details' in the df.subset function.

data

a data frame when specifying one or more variables in the argument .... Note that the argument is NULL when specifying a vector, matrix, or data frame for the argument ....

id

either a character string indicating the variable name of the subject ID variable in '...' or a vector representing the subject IDs, see 'Details'.

obs

either a character string indicating the variable name of the observation number variable in '...' or a vector representing the observations. Note that duplicated values within the same subject ID are not allowed, see 'Details'.

day

either a character string indicating the variable name of the day number variable in '...' or a vector representing the days, see 'Details'.

lag

a numeric value specifying the lag, e.g. lag = 1 (default) returns lag-1 values.

time

a variable of class POSIXct or POSIXlt representing the date and time of the observation used to compute time differences between observations.

units

a character string indicating the units in which the time difference is represented, i.e., "secs" for seconds, "mins" (default) for minutes, "hours" for hours, "days" for days, and "weeks" for weeks.

append

logical: if TRUE (default), lagged variable(s) are appended to the data frame specified in the argument data.

name

a character string or character vector indicating the names of the lagged variables. By default, lagged variables are named with the ending ".lag", resulting in e.g. "x1.lag" and "x2.lag" when specifying two variables. Variable names can also be specified using a character vector matching the number of variables specified in ..., e.g. name = c("lag.x1", "lag.x2").

name.td

a character string or character vector indicating the names of the time difference variables when specifying a date and time variable for the argument time. By default, time difference variables are named with the ending ".td", resulting in e.g. "x1.td" and "x2.td" when specifying two variables. Variable names can also be specified using a character vector matching the number of variables specified in ..., e.g. name.td = c("td.x1", "td.x2").

as.na

a numeric vector indicating user-defined missing values, i.e. these values are converted to NA before conducting the analysis. Note that the as.na() function is only applied to the variable(s) specified in the argument ....

check

logical: if TRUE (default), argument specification is checked.
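
As a rough illustration of what the time and units arguments describe, the following base R sketch computes the time difference between consecutive observations with difftime(). It is not the misty implementation; the object names time.obs, td.mins, and td.hours are made up for this example, and the time zone is fixed to UTC only to keep the result reproducible.

```r
# Hypothetical date and time stamps of three observations (not from misty)
time.obs <- as.POSIXct(c("2024-01-01 09:01:00", "2024-01-01 12:05:00",
                         "2024-01-01 15:14:00"), tz = "UTC")

# Time difference to the previous observation in minutes;
# the first observation has no predecessor, hence NA
td.mins <- c(NA, as.numeric(difftime(time.obs[-1], time.obs[-length(time.obs)],
                                     units = "mins")))

# Same differences expressed in hours
td.hours <- td.mins / 60
```

Here td.mins is NA, 184, 189; choosing a different unit only rescales these values.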

Details

The function is used to create lagged versions of the variable(s) specified via the ... argument. The optional arguments id, day, and obs determine which observations are treated as consecutive:

Optional argument id

If the id argument is not specified, i.e., id = NULL, all observations are assumed to come from the same subject. If the dataset includes multiple subjects, this argument needs to be specified so that observations are not lagged across subjects.
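
The effect of the id argument can be sketched in base R. This is an illustration of the idea only, not the code used by lagged(); the helper lag1() and the small vectors subject and pos are hypothetical.

```r
# Hypothetical helper (not part of misty): shift a vector by one position
lag1 <- function(x) c(NA, x[-length(x)])

subject <- rep(1:2, each = 3)
pos     <- c(6, 7, 5, 4, 9, 5)

# Ignoring the subject ID: the first observation of subject 2
# receives the last value of subject 1
pos.lag.pooled <- lag1(pos)

# Lagging within subject: the first observation of each subject is NA
pos.lag.within <- ave(pos, subject, FUN = lag1)
```

In the pooled version the fourth element is 5, a value that belongs to subject 1; in the within-subject version it is NA.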

Optional argument day

If the day argument is not specified, i.e., day = NULL, values of the variable are lagged across days when there are multiple observation days, so the first observation of a day receives the last value of the previous day.
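
The difference between lagging across days and lagging within days can be sketched in base R as follows. The helper lag1() is hypothetical and the code is not taken from misty; the data mirror the 'pos' variable used in the Examples section.

```r
# Hypothetical helper (not part of misty): shift a vector by one position
lag1 <- function(x) c(NA, x[-length(x)])

subject <- rep(1:2, each = 6)
day     <- rep(rep(1:2, each = 3), times = 2)
pos     <- c(6, 7, 5, 8, NA, 7, 4, NA, 5, 4, 5, 3)

# Lagging within subject only: the first observation of day 2
# receives the last value of day 1
lag.across.days <- ave(pos, subject, FUN = lag1)

# Lagging within subject and day: the first observation of each day is NA
lag.within.days <- ave(pos, subject, day, FUN = lag1)
```

For subject 1, the first observation of day 2 gets the value 5 when lagging across days and NA when lagging within days.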

Optional argument obs

If the obs argument is not specified, i.e., obs = NULL, consecutive observations from the same subject are assumed to be one lag apart, even if observations in between are missing from the data.
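
The role of the observation number can be sketched in base R. The snippet below is an illustration under simplifying assumptions (no day variable, lag of 1) and is not the misty implementation; all object names are hypothetical. The data correspond to 'pos' in the Examples section after removing rows with missing values.

```r
# Observation 5 of subject 1 and observation 2 of subject 2 are absent
subject <- rep(1:2, each = 5)
obs     <- c(1, 2, 3, 4, 6, 1, 3, 4, 5, 6)
pos     <- c(6, 7, 5, 8, 7, 4, 5, 4, 5, 3)

# Position-based lag within subject: rows are assumed to be one lag apart
lag.by.row <- ave(pos, subject, FUN = function(x) c(NA, x[-length(x)]))

# Lag based on the observation number: for each row, look up the row of the
# same subject whose observation number is exactly one smaller
key        <- paste(subject, obs)
lag.by.obs <- pos[match(paste(subject, obs - 1), key)]
```

With the position-based lag, observation 6 of subject 1 receives the value of observation 4; with the lookup on the observation number it is NA because observation 5 is not in the data.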

Value

Returns a numeric vector or data frame with the same length or same number of rows as ... containing the lagged variable(s).

Note

This function is based on the lagvar() function in the esmpack package by Wolfgang Viechtbauer and Mihail Constantin (2023).

Author(s)

Takuya Yanagida takuya.yanagida@univie.ac.at

References

Viechtbauer W, Constantin M (2023). esmpack: Functions that facilitate preparation and management of ESM/EMA data. R package version 0.1-20.

See Also

center, rec, coding, item.reverse.

Examples

dat <- data.frame(subject = rep(1:2, each = 6),
                  day = rep(1:2, each = 3),
                  obs = rep(1:6, times = 2),
                  time = as.POSIXct(c("2024-01-01 09:01:00", "2024-01-01 12:05:00",
                                      "2024-01-01 15:14:00", "2024-01-02 09:03:00",
                                      "2024-01-02 12:21:00", "2024-01-02 15:03:00",
                                      "2024-01-01 09:02:00", "2024-01-01 12:09:00",
                                      "2024-01-01 15:06:00", "2024-01-02 09:02:00",
                                      "2024-01-02 12:15:00", "2024-01-02 15:06:00")),
                  pos = c(6, 7, 5, 8, NA, 7, 4, NA, 5, 4, 5, 3),
                  neg = c(2, 3, 2, 5, 3, 4, 6, 4, 6, 4, NA, 8))

# Example 1a: Lagged variable for 'pos'
lagged(dat$pos, id = dat$subject, day = dat$day)

# Example 1b: Alternative specification
lagged(dat[, c("pos", "subject", "day")], id = "subject", day = "day")

# Example 1c: Alternative specification using the 'data' argument
lagged(pos, data = dat, id = "subject", day = "day")

# Example 2a: Lagged variable for 'pos' and 'neg'
lagged(dat[, c("pos", "neg")], id = dat$subject, day = dat$day)

# Example 2b: Alternative specification using the 'data' argument
lagged(pos, neg, data = dat, id = "subject", day = "day")

# Example 3: Lag-2 variables for 'pos' and 'neg'
lagged(pos, neg, data = dat, id = "subject", day = "day", lag = 2)

# Example 4: Lagged variable and time difference variable
lagged(pos, neg, data = dat, id = "subject", day = "day", time = "time")

# Example 5: Lagged variables and time difference variables,
# name variables
lagged(pos, neg, data = dat, id = "subject", day = "day", time = "time",
       name = c("p.lag1", "n.lag1"), name.td = c("p.diff", "n.diff"))

# Example 6: NA observations excluded from the data frame
dat.excl <- dat[!is.na(dat$pos), ]

# Observation numbers not taken into account, i.e.,
# - observation 4 used as lagged value for observation 6 for subject 1
# - observation 1 used as lagged value for observation 3 for subject 2
lagged(pos, data = dat.excl, id = "subject", day = "day")

# Observation numbers taken into account by specifying the 'obs' argument
lagged(pos, data = dat.excl, id = "subject", day = "day", obs = "obs")

misty documentation built on Oct. 24, 2024, 5:10 p.m.
