prepareLaggedData: Organizes time series data into lags.

Description

Takes a multivariate time series, where at least one variable is meant to be used as a response while the others are meant to be used as predictors in a model, and organizes it in time lags, generating one new column per lag and variable in the model.

Usage

prepareLaggedData(
  input.data = NULL,
  response = NULL,
  drivers = NULL,
  time = NULL,
  oldest.sample = "first",
  lags = NULL,
  time.zoom = NULL,
  scale = FALSE
  )

Arguments

input.data

a dataframe with one time series per column.

response

character string, name of the numeric column to be used as response in the model.

drivers

character vector, names of the numeric columns to be used as predictors in the model.

time

character string, name of the numeric column with the time/age.

oldest.sample

character string, either "first" or "last". When "first", the first row is taken as the oldest case of the time series and the last row as the newest case, so ecological memory flows from the first to the last row of input.data. When "last", the last row is taken as the oldest sample; this is the mode to use when input.data represents a palaeoecological dataset. Default is "first".

lags

numeric vector of positive integers, lags to be used in the equation. Generally, a regular sequence of numbers, in the same units as time. Using seq to define it is highly recommended. If 0 is absent from lags, it is added automatically to allow the consideration of a concurrent effect. Lags should take into account the temporal resolution of the data and be aligned to it. For example, if the interval between consecutive samples is 100 years, lags should be something like 0, 100, 200, 300. Lags can also be multiples of the time resolution, such as 0, 200, 400, 600 (when the time resolution is 100 years).
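As a minimal sketch of the advice above, and assuming a dataset sampled every 100 years (the `resolution` value and the variable names are hypothetical, chosen only for illustration), lag vectors aligned with the data could be built with seq like this:

```r
# Hypothetical temporal resolution: one sample every 100 years.
resolution <- 100

# Lags at every sample, from 0 (concurrent effect) to 300 years.
lags.every.sample <- seq(from = 0, to = 300, by = resolution)

# Lags at every second sample, i.e. multiples of the resolution.
lags.every.other <- seq(from = 0, to = 600, by = 2 * resolution)

print(lags.every.sample)
print(lags.every.other)
```

Either vector could then be passed as the lags argument. Building the sequence from the resolution keeps every lag on a value that exists in the time column.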

time.zoom

numeric vector with two values of the time column, defining the time range used to subset the data if desired.

scale

logical, if TRUE, applies the scale function to standardize the data. Required if the lagged data is going to be used to fit linear models.
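To illustrate what the base R scale function does to numeric columns, here is a small sketch on a made-up data frame. It only shows the behavior of scale with its default arguments; which columns prepareLaggedData passes to it internally is not stated here, so treat this as an illustration rather than a description of the function's internals.

```r
# Toy data frame with two numeric columns (values are made up).
toy <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(10, 20, 30, 40, 50)
)

# With default arguments, scale() subtracts each column's mean and
# divides by its standard deviation. It returns a matrix, so it is
# converted back to a data frame here.
toy.scaled <- as.data.frame(scale(toy))

# Each column now has mean 0 and standard deviation 1
# (up to floating-point error).
print(colMeans(toy.scaled))
print(sapply(toy.scaled, sd))
```

Putting all variables on a common scale makes their coefficients comparable, regardless of the original units.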

Details

The function interprets the time column as an index representing the

Value

A dataframe with columns representing time-delayed values of the drivers and the response. Column names have the lag number as a suffix. The response variable is identified in the output as "Response_0".
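The structure of such a table can be sketched in base R. The example below is not the implementation used by prepareLaggedData: the helper `lag_column`, the toy data, the `driver_` column names, the kept `time` column, and the removal of incomplete rows are all assumptions made for illustration. It assumes regularly sampled data with the oldest sample in the first row, and lags expressed as a number of rows.

```r
# Toy series, oldest sample in the first row (all values are made up).
toy <- data.frame(
  time = 1:6,
  response = c(5, 6, 7, 8, 9, 10),
  driver = c(1, 2, 3, 4, 5, 6)
)

# Hypothetical helper: shift a vector by k rows so that row i holds the
# value observed k samples earlier. The first k rows have no past value
# and are filled with NA.
lag_column <- function(x, k) {
  c(rep(NA, k), head(x, length(x) - k))
}

lags <- 0:2

# One column per variable and lag, with the lag as a suffix.
lagged <- data.frame(time = toy$time)
for (k in lags) {
  lagged[[paste0("Response_", k)]] <- lag_column(toy$response, k)
  lagged[[paste0("driver_", k)]] <- lag_column(toy$driver, k)
}

# Drop the rows that lack a complete set of past values.
lagged <- na.omit(lagged)
print(lagged)
```

In this sketch each row pairs the current response (Response_0) with its own past values and with the concurrent and past values of the driver, which is the layout a lagged regression needs.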

Author(s)

Blas M. Benito <blasbenito@gmail.com>

See Also

computeMemory

Examples

#loading the package and the example data
library(memoria)
data(palaeodata)

#adding lags
lagged.data <- prepareLaggedData(
  input.data = palaeodata,
  response = "pollen.pinus",
  drivers = c("climate.temperatureAverage", "climate.rainfallAverage"),
  time = "age",
  oldest.sample = "last",
  lags = seq(0.2, 1, by = 0.2),
  time.zoom = NULL,
  scale = FALSE
)
str(lagged.data)

memoria documentation built on May 17, 2019, 9 a.m.