fsMTS: Feature selection for multivariate time series


fsMTS R Documentation

Feature selection for multivariate time series

Description

fsMTS implements algorithms for feature selection in multivariate time series.

Usage

fsMTS(
  mts,
  max.lag,
  method = c("ownlags", "distance", "CCF", "MI", "RF", "GLASSO", "LARS", "PSC"),
  show.progress = F,
  localized = F,
  ...
)

Arguments

mts

a matrix object with values of the multivariate time series (MTS). The k MTS components are in columns; observations are in rows.

max.lag

the maximal lag value

method

a feature selection algorithm. Implemented algorithms:

  • "ownlags" - only own (autoregressive) lags. The method constructs the matrix of features that represents independent AR(max.lag) processes for every MTS component.

  • "distance" - distance-based feature selection. The method uses the directed shortest distances between every pair of time series components (origin and destination), supplied through the shortest parameter. The lag l is selected as a potential relationship (feature) if the destination component is reachable from the origin component within (l*step) time steps, rounded to an integer value. Earlier and later lags are not included in the resulting structure.

  • "CCF" - cross-correlation-based. The method returns values of Pearson's correlation coefficient between every MTS component and all other MTS components and their lags. See Yang et al. (2005) as an example of application. Only own lags of every MTS component are included as selected features.

  • "MI" - mutual information-based. The method returns values of mutual information between every component of the multivariate time series and all other components and their lags. The method is localized - mutual information is independently estimated for every MTS component and lags (1:max.lag) of all MTS components. See Liu et al. (2016) as an example of application.

  • "RF" - random forest estimation of k linear regression models. The method returns the increase in mean squared error for every component of the multivariate time series with respect to all other components and their lags. The method is localized - the linear regression is independently estimated by the random forest algorithm for every MTS component as a dependent variable and lags (1:max.lag) of all MTS components as explanatory variables. See Pavlyuk (2020) for more details.

  • "GLASSO" - feature selection using graphical LASSO regularisation of the inverse covariance matrix. The method returns values from the inverse correlation matrix between every MTS component and all other components and their lags. The method is localized - the sparse inverse correlation matrix is independently estimated for every time series component and lags (1:max.lag) of all other components.

  • "LARS" - feature selection using least angle regression. The method returns values of beta proportions from the least angle regression, estimated for every MTS component and all other components and their lags. The method is localized - the least angle regression is independently estimated for every MTS component and lags (1:max.lag) of all other components.

  • "PSC" - feature selection using partial spectral coherence of MTS components. The method returns maximal values of the partial spectral coherence function for all MTS lags.
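
As a rough illustration of the kind of quantity the "CCF" description refers to, the base R sketch below computes Pearson correlations between every component and the lagged values of every component. It does not reproduce the package internals: the helper name lagged_cor and the row ordering (all components at lag 1, then all components at lag 2, and so on) are assumptions made for this example.

```r
# Hypothetical helper (not part of fsMTS): correlation between every
# component at time t and every component at time t - l, for l = 1..max.lag.
lagged_cor <- function(mts, max.lag) {
  mts <- as.matrix(mts)
  n <- nrow(mts)
  k <- ncol(mts)
  res <- matrix(NA_real_, nrow = k * max.lag, ncol = k)
  for (l in seq_len(max.lag)) {
    current <- mts[(l + 1):n, , drop = FALSE]  # values at time t
    lagged  <- mts[1:(n - l), , drop = FALSE]  # values at time t - l
    rows <- ((l - 1) * k + 1):(l * k)          # assumed lag-major row order
    # cor(lagged, current)[i, j] = cor(component i at t - l, component j at t)
    res[rows, ] <- cor(lagged, current)
  }
  res
}

set.seed(1)
x <- matrix(rnorm(300), ncol = 3)  # 100 observations of 3 components
m <- lagged_cor(x, max.lag = 2)
dim(m)                             # k * max.lag rows, k columns
```

The result has the same shape as the feature matrix described under Value, which makes it easy to compare against the output of the package on the same data.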

show.progress

a logical flag that controls whether the progress of the calculation is printed. The default is FALSE.

localized

a logical flag that requests localized (component-wise) feature selection if the selected method supports it ("MI", "GLASSO", "RF"). Localized versions of the algorithms select features independently for every MTS component from all lagged components. Non-localized versions perform simultaneous feature selection for all components, including potential instantaneous effects (relationships between features within the same lag). Afterwards, the non-localized algorithms ignore the instantaneous effects and return only lagged features.

The default is FALSE (see Usage).

...

method-specific parameters:

  • "shortest" ("distance" algorithm) matrix of externally provided shortest distances between every pair of time series' components.

  • "step" ("distance" algorithm) distance that is covered by the process during one time step of the time series. The default is 1.

  • "rho" ("GLASSO" algorithm) non-negative regularization parameter for lasso. rho=0 means no regularization.
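
One possible reading of how "shortest" and "step" interact in the "distance" algorithm is sketched below in base R: the distance between two components is divided by step, rounded, and the resulting single lag is marked as selected. This is an interpretation of the description above, not the package code; the function name distance_lags, the row ordering, and the orientation (rows as lagged origins, columns as destinations) are all assumptions.

```r
# Hypothetical sketch (not the fsMTS implementation).
distance_lags <- function(shortest, max.lag, step = 1) {
  k <- nrow(shortest)
  lag <- round(shortest / step)        # assumed: travel time in whole time steps
  res <- matrix(0L, nrow = k * max.lag, ncol = k)
  for (l in seq_len(max.lag)) {
    rows <- ((l - 1) * k + 1):(l * k)  # assumed lag-major row order
    # row = origin component at lag l, column = destination component
    res[rows, ] <- (lag == l) * 1L
  }
  res
}

# Made-up distances between three components
shortest <- matrix(c( 0, 5, 11,
                      5, 0,  6,
                     11, 6,  0), nrow = 3, byrow = TRUE)
d <- distance_lags(shortest, max.lag = 3, step = 5)
d
```

With these made-up distances, each pair of distinct components is marked at exactly one lag, and a component is never linked to itself because its distance of 0 rounds to lag 0.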

Details

The function implements selection of potential relationships between multivariate time series' components and their lags.

Value

returns a real-valued or binary (depending on the algorithm) feature matrix with k*max.lag rows and k columns, where k is the number of time series components (the number of columns in the mts parameter). Columns correspond to components of the time series; rows correspond to lags (from 1 to max.lag).
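
To make the shape of the result concrete, the base R sketch below builds a binary own-lags structure by hand and shows how a real-valued matrix of the same shape could be cut to a binary one with a threshold. The objects here are constructed manually rather than returned by fsMTS, and the lag-major row ordering and the cutoff of 0.3 are assumptions for illustration.

```r
# Hypothetical illustration of the described layout (not produced by fsMTS).
k <- 3
max.lag <- 2

# Own-lags structure: each component depends only on its own lags,
# so every k x k lag block is an identity matrix.
own <- do.call(rbind, replicate(max.lag, diag(k), simplify = FALSE))
dim(own)                       # k * max.lag rows, k columns

# A made-up real-valued score matrix with the same shape:
# 0.9 for own lags, 0.1 everywhere else.
scores <- own * 0.8 + 0.1

# Convert scores to a binary structure using an assumed cutoff.
selected <- (abs(scores) >= 0.3) * 1
all(selected == own)
```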

References

Distance-based feature selection for MTS

Pfeifer, P. E., & Deutsch, S. J. 1980. A Three-Stage Iterative Procedure for Space-Time Modeling. Technometrics, 22(1), 35.

Cross-correlation-based feature selection for MTS

Netoff I., Carroll T.L., Pecora L.M., Schiff S.J. 2006. Detecting coupling in the presence of noise and nonlinearity. In: Schelter B, Winterhalder W, Timmer J, editors. Handbook of time series analysis.

Mutual information-based feature selection for MTS

Liu, T., Wei, H., Zhang, K., Guo, W., 2016. Mutual information based feature selection for multivariate time series forecasting, in: 35th Chinese Control Conference (CCC). Presented at the 2016 35th Chinese Control Conference (CCC), IEEE, Chengdu, China, pp. 7110–7114.

Random forest-based feature selection for MTS

Pavlyuk, D., 2020. Random Forest Variable Selection for Sparse Vector Autoregressive Models, in: Valenzuela, O., Rojas, F., Pomares, H., Rojas, I. (Eds.), Theory and Applications of Time Series Analysis. Selected Contributions from ITISE 2019, Contributions to Statistics.

Graphical LASSO-based feature selection for MTS

Haworth, J., Cheng, T., 2014. Graphical LASSO for local spatio-temporal neighbourhood selection, in: Proceedings of the GIS Research UK 22nd Annual Conference. Presented at the GIS Research UK 22nd Annual Conference, Leicester, UK, pp. 425–433.

Least angle regression for feature selection for MTS

Gelper S. and Croux C., 2008. Least angle regression for time series forecasting with many predictors, Leuven, Belgium, p.37.

Partial spectral coherence for feature selection for MTS

Davis, R.A., Zang, P., Zheng, T., 2016. Sparse Vector Autoregressive Modeling. Journal of Computational and Graphical Statistics 25, 1077–1096.

Examples


# Load the package and the traffic data
library(fsMTS)
data(traffic.mini)

# Scaling is sometimes useful for feature selection
# Exclude the first column - it contains timestamps
data <- scale(traffic.mini$data[,-1])

# Run each feature selection method with the same maximal lag
mIndep <- fsMTS(data, max.lag = 3, method = "ownlags")
mCCF <- fsMTS(data, max.lag = 3, method = "CCF")
mDistance <- fsMTS(data, max.lag = 3, method = "distance",
                   shortest = traffic.mini$shortest, step = 5)
mGLASSO <- fsMTS(data, max.lag = 3, method = "GLASSO", rho = 0.05)
mLARS <- fsMTS(data, max.lag = 3, method = "LARS")
mRF <- fsMTS(data, max.lag = 3, method = "RF")
mMI <- fsMTS(data, max.lag = 3, method = "MI")

# Collect the resulting feature matrices in a named list
mlist <- list(Independent = mIndep,
              Distance = mDistance,
              CCF = mCCF,
              GLASSO = mGLASSO,
              LARS = mLARS,
              RF = mRF,
              MI = mMI)

# Compare the selected feature sets pairwise using the given threshold
th <- 0.30
(msimilarity <- fsSimilarityMatrix(mlist, threshold = th, method = "Kuncheva"))


fsMTS documentation built on April 26, 2022, 9:05 a.m.