phavok: Parallelized Hankel Alternative View of Koopman (pHAVOK)...

View source: R/phavok.R

phavok R Documentation

Parallelized Hankel Alternative View of Koopman (pHAVOK) Analysis

Description

Parallel HAVOK (phavok) is a parallelized and optimized version of the HAVOK procedure. It estimates multiple HAVOK models simultaneously across selected or randomized sets of hyperparameters. phavok() is intended for model selection and for inspecting model fit surfaces. Once the model of interest is selected, it should be refit with havok().

Note: If the model selected with phavok() has a sparsification parameter ‘lambda’ larger than 0, and refitting with the same hyperparameters in havok() does not give an approximately equivalent model fit, adjust ‘lambda’ slightly when fitting with havok() until the fit is approximately equivalent. To choose the size of the adjustment, we recommend inspecting nearby ‘lambda’ values for the ‘stackmax’ and ‘r’ combination that corresponds to the selected model.
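The inspection recommended in the note can be sketched as a plain subset of the phavok() output. The data frame below is an invented stand-in that only borrows the column names listed under Value, and the selected 'stackmax', 'r', and 'Lambda' values are hypothetical.

```r
# Toy stand-in for phavok() output; all values are invented for illustration.
results <- data.frame(
  stackmax = c(30, 30, 30, 30, 31, 31),
  r        = c(5, 5, 5, 5, 5, 5),
  Lambda   = c(0.00, 0.01, 0.02, 0.05, 0.00, 0.02),
  R2       = c(0.95, 0.94, 0.93, 0.80, 0.96, 0.90)
)

# Hypothetical model picked from the fit surface.
sel <- list(stackmax = 30, r = 5, Lambda = 0.02)

# Keep every threshold tried for the same stackmax and r,
# then sort by distance from the selected Lambda.
same_cell <- results[results$stackmax == sel$stackmax & results$r == sel$r, ]
nearby <- same_cell[order(abs(same_cell$Lambda - sel$Lambda)), ]
print(nearby)
```

Rows near the top of `nearby` are the candidate 'lambda' values to try first when refitting with havok().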

Usage

phavok(
  xdat,
  dt = 1,
  stackmaxes = NA,
  rs = NA,
  random = 0,
  sparsify = FALSE,
  sparserandom = 0,
  loops = 1,
  devMethod = "FOCD",
  gllaEmbed = NA,
  alignSVD = TRUE,
  numCores = parallel::detectCores(all.tests = FALSE, logical = TRUE) - 2
)
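The default for numCores in the signature above subtracts 2 from the detected core count. parallel::detectCores() can return NA, and on a machine with one or two logical cores the subtraction gives zero or a negative number. How phavok() treats such values is not stated here, so the following is only a defensive sketch for computing a value to pass explicitly; `safe_cores` is a name introduced for illustration.

```r
library(parallel)

# Detect logical cores the same way the default argument does.
detected <- detectCores(all.tests = FALSE, logical = TRUE)

# Fall back to a single core when detection fails or too few cores exist.
safe_cores <- if (is.na(detected)) 1L else max(1L, detected - 2L)

# safe_cores could then be supplied as numCores = safe_cores.
print(safe_cores)
```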

Arguments

xdat

A vector of measurements over time.

dt

A numeric value indicating the time-lag between two subsequent time series measurements.

stackmaxes

A vector of 'stackmax' hyperparameter values, where 'stackmax' stands for the number of shift-stacked rows in the Hankel matrix.

rs

A vector of 'r' hyperparameter values or NA, where 'r' stands for the number of singular vectors to include (also known as model degree or truncation parameter). If NA is selected, HAVOK models with all possible 'r' hyperparameters within selected 'stackmaxes' will be fit.

random

A numeric value from 0 to 1; what proportion of 'stackmaxes' and 'rs' combinations should be selected randomly? Both 0 and 1 result in fitting all possible models.

sparsify

Logical; should models be sparsified?

sparserandom

A numeric value from 0 to 1; what proportion of sparsification parameters should be selected randomly?

loops

An integer; number of times the sequential thresholded least-squares procedure is repeated.

devMethod

A character string; either "FOCD" for fourth-order central difference or "GLLA" for generalized local linear approximation.

gllaEmbed

An integer; the embedding dimension used for devMethod = "GLLA".

alignSVD

Logical; should the singular vectors be aligned with the data?

numCores

An integer; number of cores to be used by phavok(). If not specified, defaults to the number of cores detected - 2.
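For intuition about the 'stackmax' and 'r' hyperparameters described above, here is a minimal base R sketch of a Hankel matrix with shift-stacked rows followed by a truncated SVD. It is not the package's internal code; the toy series and the variable names `H`, `U_r`, and `V_r` are assumptions made for illustration.

```r
# Toy series standing in for xdat (101 points).
xdat <- sin(seq(0, 10, by = 0.1))
stackmax <- 5   # number of shift-stacked rows
r <- 3          # number of singular vectors kept

# Row i holds the series shifted by i - 1 steps.
n_cols <- length(xdat) - stackmax + 1
H <- t(sapply(seq_len(stackmax), function(i) xdat[i:(i + n_cols - 1)]))

# Truncated SVD: keep only the first r singular vectors.
s <- svd(H)
U_r <- s$u[, 1:r, drop = FALSE]
V_r <- s$v[, 1:r, drop = FALSE]

dim(H)    # stackmax rows, n_cols columns
dim(V_r)  # n_cols rows, r columns
```

Because 'r' indexes singular vectors of a matrix with 'stackmax' rows, it cannot exceed 'stackmax' in this construction.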

Value

A data frame with the following columns:

  • stackmax - 'stackmax' hyperparameter.

  • r - 'r' hyperparameter.

  • R2 - Squared correlation between the model-predicted v_1 and the v_1 extracted from the SVD. An estimate of model fit.

  • Lambda - Sparsification threshold.

  • LambdaDeletions - Number of model coefficient matrix elements that were expected to be truncated by the sparsification threshold.

  • TrueLambdaDeletions - Number of model coefficient matrix elements that were truncated by the sparsification threshold.

  • kurtosis - Pearson's measure of kurtosis of the forcing term value distribution.

  • prop2sd - Proportion of the forcing values exceeding the threshold of ±2 standard deviations.

  • prop1.5sd - Proportion of the forcing values exceeding the threshold of ±1.5 standard deviations.

  • prop1sd - Proportion of the forcing values exceeding the threshold of ±1 standard deviation.
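The fit statistics listed above can be illustrated on simulated vectors. This is a sketch under stated assumptions: the vectors are random stand-ins for v_1 and the forcing term, the kurtosis uses population moments, and the thresholds are centered on the mean. The package may compute these quantities differently, and `prop_beyond` is a helper introduced only for this example.

```r
set.seed(1)
v1_svd  <- sin(seq(0, 20, length.out = 200))   # stand-in for v_1 from the SVD
v1_pred <- v1_svd + rnorm(200, sd = 0.1)       # stand-in for the model prediction
forcing <- rnorm(200)                          # stand-in for the forcing term

# R2: squared correlation between predicted and extracted v_1.
R2 <- cor(v1_pred, v1_svd)^2

# Pearson's kurtosis: fourth central moment over squared variance
# (about 3 for a normal distribution).
centered <- forcing - mean(forcing)
kurt <- mean(centered^4) / mean(centered^2)^2

# Proportion of values further than k standard deviations from the mean.
prop_beyond <- function(x, k) mean(abs(x - mean(x)) > k * sd(x))
prop2sd   <- prop_beyond(forcing, 2)
prop1.5sd <- prop_beyond(forcing, 1.5)
prop1sd   <- prop_beyond(forcing, 1)
```

Heavy-tailed forcing, which HAVOK analyses typically look for, would show up as a kurtosis well above 3 together with a small but nonzero prop2sd.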

References

S. L. Brunton, B. W. Brunton, J. L. Proctor, E. Kaiser, and J. N. Kutz, "Chaos as an intermittently forced linear system," Nature Communications, 8(19):1-9, 2017.

Examples

## Not run: 
# Russian Twitter Troll Activity Example

library(plotly)

data("Internet_Trolls")  # Contains time series of Russian Twitter 
# troll activity extracted 4 times per day during the US presidential  
# election year 2016 on 11 different topics 
right <- results.all.truncated[results.all.truncated$Type=="Right",] # only right-wing trolls
xdat <- right$Topic3  # Russian Twitter troll posting activity on the topic of Racial Justice/Black Lives Matter
dt <- 0.25   # 4 measurements per day

# All possible rs within specified stackmax range, no sparsification dimension

results <- phavok(xdat = xdat, dt = dt, stackmaxes = 28:58)

plot_ly(data = results, type = "scatter", x = ~stackmax,
        y = ~r, colors= "PiYG" , color = ~ R2, size = ~ R2,
        mode = "markers", text = ~R2,
        hovertemplate = paste('stackmax = %{x}',
                              '<br>r = %{y}<br>',
                              'R2 = %{text:.2f}',
                              '<extra></extra>')) %>%
  layout(title = 'R2') 



plot_ly(data = results, type = "scatter", x = ~stackmax,
        y = ~r, colors = "BrBG", color = ~kurtosis, size = ~kurtosis * (-1),
        mode = "markers", text = ~kurtosis,
        hovertemplate = paste('stackmax = %{x}',
                              '<br>r = %{y}<br>',
                              'kurtosis = %{text:.2f}',
                              '<extra></extra>')) %>%
  layout(title = 'kurtosis')

plot_ly(data = results, type = "scatter", x = ~stackmax,
        y = ~r, colors = "viridis", color = ~prop2sd, size = ~prop2sd * (-1),
        mode = "markers", text = ~prop2sd,
        hovertemplate = paste('stackmax = %{x}',
                              '<br>r = %{y}<br>',
                              'prop2sd = %{text:.2f}',
                              '<extra></extra>')) %>%
  layout(title = 'prop2sd')



# Selected range of rs in the specified stackmax range with sparsification dimension

results <- phavok(xdat = xdat, dt = dt, stackmaxes = 28:58, rs = 2:10, sparsify = TRUE)


plot_ly(data = results, type = 'scatter3d', x = ~stackmax,
        y = ~r, z = ~LambdaDeletions, colors = "PiYG", color = ~R2,
        mode = "markers", text = ~R2,
        marker = list(
          size = ~R2*20,
          opacity = 1
        ),
        hovertemplate = paste('stackmax = %{x}',
                              '<br>r = %{y}<br>',
                              'LambdaDeletions = %{z}<br>',
                              'R2 = %{text:.2f}',
                              '<extra></extra>')) %>%
  layout(title = 'R2')


# Selected range of rs in the specified stackmax range with sparsification dimension, 
# 0.3 proportion of stackmax * r values selected randomly

results <- phavok(xdat = xdat, dt = dt, stackmaxes = 28:58, rs = 2:10,
                  random = 0.3, sparsify = TRUE)


plot_ly(data = results, type = 'scatter3d', x = ~stackmax,
        y = ~r, z = ~LambdaDeletions, colors = "PiYG", color = ~R2,
        mode = "markers", text = ~R2,
        marker = list(
          size = ~R2*20,
          opacity = 1
        ),
        hovertemplate = paste('stackmax = %{x}',
                              '<br>r = %{y}<br>',
                              'LambdaDeletions = %{z}<br>',
                              'R2 = %{text:.2f}',
                              '<extra></extra>')) %>%
  layout(title = 'R2')




# Selected range of rs in the specified stackmax range with sparsification dimension, 
# 0.5 proportion of stackmax * r values selected randomly, 0.4 proportion of 
# sparsification thresholds selected randomly


results <- phavok(xdat = xdat, dt = dt, stackmaxes = 28:58, rs = 2:10,
                  random = 0.5, sparsify = TRUE, sparserandom = 0.4)



plot_ly(data = results, type = 'scatter3d', x = ~stackmax,
        y = ~r, z = ~LambdaDeletions, colors = "PiYG", color = ~R2,
        mode = "markers", text = ~R2,
        marker = list(
          size = ~R2*20,
          opacity = 1
        ),
        hovertemplate = paste('stackmax = %{x}',
                              '<br>r = %{y}<br>',
                              'LambdaDeletions = %{z}<br>',
                              'R2 = %{text:.2f}',
                              '<extra></extra>')) %>%
  layout(title = 'R2')



## End(Not run)


RobertGM111/havok documentation built on July 8, 2023, 8:23 p.m.