rollapply: Apply Rolling Functions

Description

A generic function for applying a function to rolling margins of an array.

Usage

rollapply(data, ...)
## S3 method for class 'ts'
rollapply(data, ...)
## S3 method for class 'zoo'
rollapply(data, width, FUN, ..., by = 1, by.column = TRUE, 
    fill = if (na.pad) NA, na.pad = FALSE, partial = FALSE, 
    align = c("center", "left", "right"), coredata = TRUE)
## Default S3 method:
rollapply(data, ...)
rollapplyr(..., align = "right")

Arguments

data

the data to be used (representing a series of observations).

width

numeric vector or list. In the simplest case this is an integer specifying the window width (in numbers of observations) which is aligned to the original sample according to the align argument. Alternatively, width can be a list regarded as offsets compared to the current time, see below for details.

FUN

the function to be applied.

...

optional arguments to FUN.

by

calculate FUN at every by-th time point rather than every point. by is only used if width is length 1 and either a plain scalar or a list.

by.column

logical. If TRUE, FUN is applied to each column separately.

fill

a three-component vector or list (recycled otherwise) providing filling values at the left/within/to the right of the data range. See the fill argument of na.fill for details.

na.pad

deprecated. Use fill = NA instead of na.pad = TRUE.

partial

logical or numeric. If FALSE (default) then FUN is only applied when all indexes of the rolling window are within the observed time range. If TRUE, then the subset of indexes that are in range are passed to FUN. A numeric argument to partial can be used to determine the minimal window size for partial computations. See below for more details.

align

specifies whether the index of the result should be left- or right-aligned or centered (default) compared to the rolling window of observations. This argument is only used if width represents widths.

coredata

logical. Should only the coredata(data) be passed to every width window? If set to FALSE the full zoo series is used.
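
As an illustration of how align, by and by.column shape the result, here is a minimal sketch on small integer series. It assumes the zoo package is installed; the object names are made up for the example.

```r
library(zoo)

z <- zoo(1:6)                        # integer index 1..6

# Same window sums; only the index position of each result changes with align
centered <- rollapply(z, 3, sum)                    # results at index 2..5
right    <- rollapply(z, 3, sum, align = "right")   # results at index 3..6
left     <- rollapply(z, 3, sum, align = "left")    # results at index 1..4

# by = 3: FUN is evaluated only at every third time point
every3 <- rollapply(z, 3, sum, by = 3, align = "right")   # index 3 and 6

# Two-column series: one sum per column vs. one sum over the whole window
zz <- zoo(cbind(a = 1:6, b = 11:16))
per_column <- rollapply(zz, 3, sum)                       # FUN sees one column at a time
per_window <- rollapply(zz, 3, sum, by.column = FALSE)    # FUN sees a 3 x 2 window
```

With by.column = FALSE the function receives all columns of the window at once, which is what the rolling regression in the Examples relies on.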

Details

If width is a plain numeric vector its elements are regarded as widths to be interpreted in conjunction with align whereas if width is a list its components are regarded as offsets. In the above cases if the length of width is 1 then width is recycled for every by-th point. If width is a list its components represent integer offsets such that the i-th component of the list refers to time points at positions i + width[[i]]. If any of these points are below 1 or above the length of index(data) then FUN is not evaluated for that point unless partial = TRUE and in that case only the valid points are passed.
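
The difference between widths and offsets can be seen on a short series. The following is a small sketch, assuming the zoo package is installed; the object names are illustrative only.

```r
library(zoo)

z <- zoo(c(10, 20, 30, 40, 50))      # index 1..5

# A plain width of 3 with align = "right" ...
by_width  <- rollapply(z, 3, sum, align = "right")
# ... covers the same points as the offsets -2, -1, 0
by_offset <- rollapply(z, list(-(2:0)), sum)

# Offsets need not include the current point (offset 0):
# each result here is the sum of the two neighbouring values
neighbours <- rollapply(z, list(c(-1, 1)), sum)

# A vector of widths gives every point its own window size;
# right-aligned widths 1, 2, ..., 5 reproduce a cumulative sum
growing <- rollapplyr(1:5, 1:5, sum)
```

Because FUN is not evaluated where an offset falls outside the data, by_width and by_offset start at index 3 and neighbours covers index 2 to 4 only.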

The rolling function can also be applied to partial windows by setting partial = TRUE. For example, if width = 3, align = "right" then for the first point just that point is passed to FUN since the two points to its left are out of range. For the same example, if partial = FALSE then FUN is not invoked at all for the first two points. If partial is a numeric then it specifies the minimum number of offsets that must be within range. Negative partial is interpreted as FALSE.
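
The width = 3, align = "right" case described above can be tried on a plain vector. This is a minimal sketch, assuming the zoo package is installed; rollapplyr is used as shorthand for align = "right" and the object names are made up.

```r
library(zoo)

x <- 1:5

# partial = FALSE (default): only complete windows are evaluated
full_only <- rollapplyr(x, 3, sum)

# partial = TRUE: the first two points use the 1 and 2 values that are in range
with_partial <- rollapplyr(x, 3, sum, partial = TRUE)

# partial = 2: at least two points of the window must be in range,
# so the first point is skipped but the second one is kept
at_least_two <- rollapplyr(x, 3, sum, partial = 2)
```

The three results have lengths 3, 5 and 4 respectively, reflecting how many points satisfy each setting.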

If width is a scalar then partial = TRUE and fill = NA are mutually exclusive but if offsets are specified for the width and 0 is not among the offsets then the output will be shorter than the input even if partial = TRUE is specified. In that case it may still be useful to specify fill in addition to partial.
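
For a scalar width, the two ways of keeping the output as long as the input can be compared side by side. The sketch below assumes the zoo package is installed and covers only the scalar-width case, not the offset case mentioned above; the object names are illustrative.

```r
library(zoo)

z <- zoo(1:5)                        # index 1..5

# Default: centered windows of width 3, so the end points are dropped
shortened <- rollapply(z, 3, sum)

# fill = NA: positions without a complete window are padded with NA
padded <- rollapply(z, 3, sum, fill = NA)

# partial = TRUE: the end points are computed from the two values in range
truncated <- rollapply(z, 3, sum, partial = TRUE)
```

Both padded and truncated have one value per input point; they differ only in what is reported at index 1 and 5.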

If FUN is mean, max or median and by.column is TRUE and width is a plain scalar and there are no other arguments then special purpose code is used to enhance performance. Also in the case of mean such special purpose code is only invoked if the data argument has no NA values. See rollmean, rollmax and rollmedian for more details.

Currently, there are methods for "zoo" and "ts" series and a "default" method for ordinary vectors and matrices.

rollapplyr is a wrapper around rollapply that uses a default of align = "right".

If data is of length 0, data is returned unmodified.

Value

An object of the same class as data with the results of the rolling function.

See Also

rollmean

Examples

suppressWarnings(RNGversion("3.5.0"))
set.seed(1)

## rolling mean
z <- zoo(11:15, as.Date(31:35))
rollapply(z, 2, mean)

## non-overlapping means
z2 <- zoo(rnorm(6))
rollapply(z2, 3, mean, by = 3)      # means of nonoverlapping groups of 3
aggregate(z2, c(3,3,3,6,6,6), mean) # same

## optimized vs. customized versions
rollapply(z2, 3, mean)   # uses rollmean which is optimized for mean
rollmean(z2, 3)          # same
rollapply(z2, 3, (mean)) # does not use rollmean


## rolling regression:
## set up multivariate zoo series with
## number of UK driver deaths and lags 1 and 12
seat <- as.zoo(log(UKDriverDeaths))
time(seat) <- as.yearmon(time(seat))
seat <- merge(y = seat, y1 = lag(seat, k = -1),
  y12 = lag(seat, k = -12), all = FALSE)

## run a rolling regression with a 3-year time window
## (similar to a SARIMA(1,0,0)(1,0,0)_12 fitted by OLS)
rr <- rollapply(seat, width = 36,
  FUN = function(z) coef(lm(y ~ y1 + y12, data = as.data.frame(z))),
  by.column = FALSE, align = "right")

## plot the changes in coefficients
## showing the shifts after the oil crisis in Oct 1973
## and after the seatbelt legislation change in Jan 1983
plot(rr)


## rolling mean by time window (e.g., 3 days) rather than
## by number of observations (e.g., when these are unequally spaced):
#
## - test data
tt <- as.Date("2000-01-01") + c(1, 2, 5, 6, 7, 8, 10)
z <- zoo(seq_along(tt), tt)
## - fill it out to a daily series, zm, using NAs
## using a zero width zoo series g on a grid
g <- zoo(, seq(start(z), end(z), "day"))
zm <- merge(z, g)
## - 3-day rolling mean
rollapply(zm, 3, mean, na.rm = TRUE, fill = NA)
##
## - without expansion to regular grid: find interval widths
## that encompass the previous 3 days for each Date
w <- seq_along(tt) - findInterval(tt - 3, tt)
## a solution to computing the widths 'w' that is easier to read but slower
## w <- sapply(tt, function(x) sum(tt >= x - 2 & tt <= x))
##
## - rolling sum from 3-day windows
## without vs. with expansion to regular grid
rollapplyr(z, w, sum)
rollapplyr(zm, 3, sum, partial = TRUE, na.rm = TRUE)


## rolling weekly sums (with some missing dates)
z <- zoo(1:11, as.Date("2016-03-09") + c(0:7, 9:10, 12))
weeksum <- function(z) sum(z[time(z) > max(time(z)) - 7])
zs <- rollapplyr(z, 7, weeksum, fill = NA, coredata = FALSE)
merge(value = z, weeksum = zs)


## replicate cumsum with either 'partial' or vector width 'k'
cumsum(1:10)
rollapplyr(1:10, 10, sum, partial = TRUE)
rollapplyr(1:10, 1:10, sum)


## different values of the fill and partial arguments
z <- zoo(c(NA, NA, 2, 3, 4, 5, NA))
rollapply(z, 3, sum, na.rm = TRUE)
rollapply(z, 3, sum, na.rm = TRUE, fill = NULL)
rollapply(z, 3, sum, na.rm = TRUE, fill = NA)
rollapply(z, 3, sum, na.rm = TRUE, partial = TRUE)

# this will exclude time points 1 and 2
# It corresponds to align = "right", width = 3
rollapply(zoo(1:8), list(seq(-2, 0)), sum)

# but this will include points 1 and 2
rollapply(zoo(1:8), list(seq(-2, 0)), sum, partial = 1)
rollapply(zoo(1:8), list(seq(-2, 0)), sum, partial = 0)

# so will this
rollapply(zoo(1:8), list(seq(-2, 0)), sum, fill = NA)

# by = 3, align = "right"
L <- rep(list(NULL), 8)
L[seq(3, 8, 3)] <- list(seq(-2, 0))
str(L)
rollapply(zoo(1:8), L, sum)

rollapply(zoo(1:8), list(0:2), sum, fill = 1:3)
rollapply(zoo(1:8), list(0:2), sum, fill = 3)

L2 <- rep(list(-(2:0)), 10)
L2[5] <- list(NULL)
str(L2)
rollapply(zoo(1:10), L2, sum, fill = "extend")
rollapply(zoo(1:10), L2, sum, fill = list("extend", NULL))

rollapply(zoo(1:10), L2, sum, fill = list("extend", NA))

rollapply(zoo(1:10), L2, sum, fill = NA)
rollapply(zoo(1:10), L2, sum, fill = 1:3)
rollapply(zoo(1:10), L2, sum, partial = TRUE)
rollapply(zoo(1:10), L2, sum, partial = TRUE, fill = 99)

rollapply(zoo(1:10), list(-1), sum, partial = 0)
rollapply(zoo(1:10), list(-1), sum, partial = TRUE)

rollapply(zoo(cbind(a = 1:6, b = 11:16)), 3, rowSums, by.column = FALSE)

# these two are the same
rollapply(zoo(cbind(a = 1:6, b = 11:16)), 3, sum)
rollapply(zoo(cbind(a = 1:6, b = 11:16)), 3, colSums, by.column = FALSE)

# these two are the same
rollapply(zoo(1:6), 2, sum, by = 2, align = "right")
aggregate(zoo(1:6), c(2, 2, 4, 4, 6, 6), sum)

# these two are the same
rollapply(zoo(1:3), list(-1), c)
lag(zoo(1:3), -1)

# these two are the same
rollapply(zoo(1:3), list(1), c)
lag(zoo(1:3))

# these two are the same
rollapply(zoo(1:5), list(c(-1, 0, 1)), sum)
rollapply(zoo(1:5), 3, sum)

# these two are the same
rollapply(zoo(1:5), list(0:2), sum)
rollapply(zoo(1:5), 3, sum, align = "left")

# these two are the same
rollapply(zoo(1:5), list(-(2:0)), sum)
rollapply(zoo(1:5), 3, sum, align = "right")

# these two are the same
rollapply(zoo(1:6), list(NULL, NULL, -(2:0)), sum)
rollapply(zoo(1:6), 3, sum, by = 3, align = "right")

# these two are the same
rollapply(zoo(1:5), list(c(-1, 1)), sum)
rollapply(zoo(1:5), 3, function(x) sum(x[-2]))

# these two are the same
rollapply(1:5, 3, rev)
embed(1:5, 3)

# these four are the same
x <- 1:6
rollapply(c(0, 0, x), 3, sum, align = "right") - x
rollapply(x, 3, sum, partial = TRUE, align = "right") - x
rollapply(x, 3, function(x) sum(x[-3]), partial = TRUE, align = "right")
rollapply(x, list(-(2:1)), sum, partial = 0)

# same as Matlab's buffer(x, n, p) for valid non-negative p
# See http://www.mathworks.com/help/toolbox/signal/buffer.html
x <- 1:30; n <- 7; p <- 3
t(rollapply(c(rep(0, p), x, rep(0, n-p)), n, by = n-p, c))

# these three are the same
y <- 10 * seq(8); k <- 4; d <- 2
# 1
# from http://ucfagls.wordpress.com/2011/06/14/embedding-a-time-series-with-time-delay-in-r-part-ii/
Embed <- function(x, m, d = 1, indices = FALSE, as.embed = TRUE) {
    n <- length(x) - (m-1)*d
    X <- seq_along(x)
    if(n <= 0)
        stop("Insufficient observations for the requested embedding")
    out <- matrix(rep(X[seq_len(n)], m), ncol = m)
    out[,-1] <- out[,-1, drop = FALSE] +
        rep(seq_len(m - 1) * d, each = nrow(out))
    if(as.embed)
        out <- out[, rev(seq_len(ncol(out)))]
    if(!indices)
        out <- matrix(x[out], ncol = m)
    out
}
Embed(y, k, d)
# 2
rollapply(y, list(-d * seq(0, k-1)), c)
# 3
rollapply(y, d*k-1, function(x) x[d * seq(k-1, 0) + 1])


## mimic convolve() using rollapplyr()
A <- 1:4
B <- 5:8
## convolve(..., type = "open")
cross <- function(x) x %*% tail(B, length(x))
rollapplyr(c(A, 0*B[-1]), length(B), cross, partial = TRUE)
convolve(A, B, type = "open")

# convolve(..., type = "filter")
rollapplyr(A, length(B), cross)
convolve(A, B, type = "filter")

parsifal9/test documentation built on Dec. 31, 2020, 1:14 a.m.