runsd: Standard Deviation of Moving Windows
In caTools: Tools: Moving Window Statistics, GIF, Base64, ROC AUC, etc

runsd

R Documentation

Standard Deviation of Moving Windows

Description

Moving (aka running, rolling) Window's Standard Deviation calculated over a vector

Usage

  runsd(x, k, center = runmean(x,k), 
        endrule=c("sd", "NA", "trim", "keep", "constant", "func"),
        align = c("center", "left", "right"))

Arguments

`x`	numeric vector of length n or matrix with n rows. If `x` is a matrix than each column will be processed separately.
`k`	width of moving window; must be an integer between one and n. In case of even k's one will have to provide different `center` function, since `runmed` does not take even k's.
`endrule`	character string indicating how the values at the beginning and the end, of the data, should be treated. Only first and last `k2` values at both ends are affected, where `k2` is the half-bandwidth `k2 = k %/% 2`. `"sd"` - applies the `sd` function to smaller and smaller sections of the array. Equivalent to: `for(i in 1:k2) out[i]=mad(x[1:(i+k2)])`. `"trim"` - trim the ends; output array length is equal to `length(x)-2*k2 (out = out[(k2+1):(n-k2)])`. This option mimics output of `apply` `(embed(x,k),1,FUN)` and other related functions. `"keep"` - fill the ends with numbers from `x` vector `(out[1:k2] = x[1:k2])`. This option makes more sense in case of smoothing functions, kept here for consistency. `"constant"` - fill the ends with first and last calculated value in output array `(out[1:k2] = out[k2+1])` `"NA"` - fill the ends with NA's `(out[1:k2] = NA)` `"func"` - same as `"mad"` option except that implemented in R for testing purposes. Avoid since it can be very slow for large windows. Similar to `endrule` in `runmed` function which has the following options: “`c("median", "keep", "constant")`” .
`center`	moving window center. Defaults to running mean (`runmean` function). Similar to `center` in `mad` function.
`align`	specifies whether result should be centered (default), left-aligned or right-aligned. If `endrule`="sd" then setting `align` to "left" or "right" will fall back on slower implementation equivalent to `endrule`="func".

Details

Apart from the end values, the result of y = runmad(x, k) is the same as “for(j=(1+k2):(n-k2)) y[j]=sd(x[(j-k2):(j+k2)], na.rm = TRUE)”. It can handle non-finite numbers like NaN's and Inf's (like mean(x, na.rm = TRUE)).

The main incentive to write this set of functions was relative slowness of majority of moving window functions available in R and its packages. With the exception of runmed, a running window median function, all functions listed in "see also" section are slower than very inefficient “apply(embed(x,k),1,FUN)” approach.

Value

Returns a numeric vector or matrix of the same size as x. Only in case of endrule="trim" the output vectors will be shorter and output matrices will have fewer rows.

Author(s)

Jarek Tuszynski (SAIC) jaroslaw.w.tuszynski@saic.com

Examples

  # show runmed function
  k=25; n=200;
  x = rnorm(n,sd=30) + abs(seq(n)-n/4)
  col = c("black", "red", "green")
  m=runmean(x, k)
  y=runsd(x, k, center=m)
  plot(x, col=col[1], main = "Moving Window Analysis Functions")
  lines(m    , col=col[2])
  lines(m-y/2, col=col[3])
  lines(m+y/2, col=col[3])
  lab = c("data", "runmean", "runmean-runsd/2", "runmean+runsd/2")
  legend(0,0.9*n, lab, col=col, lty=1 )

  # basic tests against apply/embed
  eps = .Machine$double.eps ^ 0.5
  k=25 # odd size window
  a = runsd(x,k, endrule="trim")
  b = apply(embed(x,k), 1, sd)
  stopifnot(all(abs(a-b)<eps));
  k=24 # even size window
  a = runsd(x,k, endrule="trim")
  b = apply(embed(x,k), 1, sd)
  stopifnot(all(abs(a-b)<eps));
  
  # test against loop approach
  # this test works fine at the R prompt but fails during package check - need to investigate
  k=25; n=200;
  x = rnorm(n,sd=30) + abs(seq(n)-n/4) # create random data
  x[seq(1,n,11)] = NaN;                # add NANs
  k2 = k
  k1 = k-k2-1
  a = runsd(x, k)
  b = array(0,n)
  for(j in 1:n) {
    lo = max(1, j-k1)
    hi = min(n, j+k2)
    b[j] = sd(x[lo:hi], na.rm = TRUE)
  }
  #stopifnot(all(abs(a-b)<eps));
  
  # compare calculation at array ends
  k=25; n=100;
  x = rnorm(n,sd=30) + abs(seq(n)-n/4)
  a = runsd(x, k, endrule="sd" )   # fast C code
  b = runsd(x, k, endrule="func")  # slow R code
  stopifnot(all(abs(a-b)<eps));
  
  # test if moving windows forward and backward gives the same results
  k=51;
  a = runsd(x     , k)
  b = runsd(x[n:1], k)
  stopifnot(all(abs(a[n:1]-b)<eps));
  
  # test vector vs. matrix inputs, especially for the edge handling
  nRow=200; k=25; nCol=10
  x = rnorm(nRow,sd=30) + abs(seq(nRow)-n/4)
  x[seq(1,nRow,10)] = NaN;              # add NANs
  X = matrix(rep(x, nCol ), nRow, nCol) # replicate x in columns of X
  a = runsd(x, k)
  b = runsd(X, k)
  stopifnot(all(abs(a-b[,1])<eps));        # vector vs. 2D array
  stopifnot(all(abs(b[,1]-b[,nCol])<eps)); # compare rows within 2D array

  # speed comparison
  ## Not run: 
  x=runif(1e5); k=51;                       # reduce vector and window sizes
  system.time(runsd( x,k,endrule="trim"))
  system.time(apply(embed(x,k), 1, sd)) 
  
## End(Not run)

caTools documentation built on Sept. 11, 2024, 6:06 p.m.