make_features: Generates a list of features from a universe of assets


Description

Modifications, July 2017:

  • Removed the -1 adjustment to the division operation (_) so that the division has no bias.
  • Addition and subtraction operations therefore need to be implemented. Use + and - as operators, although these characters get lost when cbinding matrices.
  • Therefore, modify xtsbind to enforce hard colnames binding so that + and - are kept.

Usage

make_features(prices, features, smooth = NA, on = "days",
  target = NA, by_symbol = FALSE)

Arguments

prices

An xts matrix of daily asset prices, with one column per asset. The set of features is computed from these prices.

features

A character vector naming the features to be calculated. See Details for the valid format.

smooth

A named list that specifies whether the prices should be smoothed (filtered) before calculating the named simple feature. This is useful because certain features are sensitive to high frequency noise. A simple feature that is part of a compounded feature can also be specified for smoothing, even if it does not appear as a standalone simple feature in the features vector. No filtering is applied to a feature unless it is explicitly named in argument 'smooth'. The format of the list is as follows:

  • feature_name = "filter", where feature_name is the name of the feature (without quotes) and "filter" is the character string specifying how to filter the price series. Two filtering methods are valid: sma and ema. The number of periods must also be specified. For example, "sma10" means a 10 day sma filter.
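The filter string format described above can be illustrated with a small base R sketch. The helper parse_filter is hypothetical and is not part of xtsanalytics. The list keys (sd, mom) are assumptions chosen for illustration, since the page does not say whether the key includes the feature's parameter.

```r
# Hypothetical helper (not part of xtsanalytics): split a filter string
# such as "sma10" into its method and its number of periods.
parse_filter <- function(spec) {
  m <- regmatches(spec, regexec("^(sma|ema)([0-9]+)$", spec))[[1]]
  if (length(m) == 0) stop("invalid filter specification: ", spec)
  list(method = m[2], periods = as.integer(m[3]))
}

# A 'smooth' list in the documented feature_name = "filter" form.
# The feature names used here (sd, mom) are illustrative assumptions.
smooth <- list(sd = "sma10", mom = "ema5")
parsed <- lapply(smooth, parse_filter)
str(parsed)
```

Only the two documented methods, sma and ema, are accepted by this sketch. Any other string raises an error.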

on

Specifies whether the computed features should be sampled at endpoints in the time series using function endpoints. Default is on = "days", which has no effect on the daily series. Other typical values could be "weeks", "months", "quarters" or "years". NOTE: THIS DOES NOT WORK YET, EXCEPT FOR DAILY SERIES on = "days"

target

The feature name that will be used as the target for an ML model. It is computed like any other feature, except that it is placed as the first item in the returned list and renamed to 'y'. Default is NA, which means no target is specified.

by_symbol

Specifies the format of the features returned. If TRUE, then a list of symbols, each containing an xts of the features for that symbol is returned. If FALSE, then the opposite is returned - that is, a list of features, with each containing an xts of symbols.
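The two layouts can be pictured with plain matrices standing in for xts objects. This is only a sketch of the regrouping idea, not the package's code. The symbol names (AAA, BBB) and feature names (f1, f2) are made up.

```r
# Toy stand-in for the by_symbol = FALSE layout: a list of features,
# each a matrix with one column per symbol. All names are made up.
set.seed(1)
syms <- c("AAA", "BBB")
by_feature <- list(
  f1 = matrix(rnorm(6), ncol = 2, dimnames = list(NULL, syms)),
  f2 = matrix(rnorm(6), ncol = 2, dimnames = list(NULL, syms))
)

# Regroup into the by_symbol = TRUE layout: a list of symbols,
# each a matrix with one column per feature.
by_symbol <- lapply(setNames(syms, syms), function(s) {
  sapply(by_feature, function(m) m[, s])
})

names(by_symbol)            # one list item per symbol
colnames(by_symbol$AAA)     # one column per feature
```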

Details

Generates a list of xts matrices, where each matrix in the list corresponds to one feature. The specified universe of assets should be an xts matrix of daily prices.

If one of the features corresponds to a target variable (y in a machine learning model), then specifying its name using the 'target' argument will automatically rename it to y and place it as the first item in the returned list. This is useful when building the feature matrix used by machine learning models with function make_featuremat.
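The renaming and reordering of the target can be sketched on a plain named list. The function place_target is hypothetical, and the sketch assumes the target has already been computed and sits in the list alongside the other features.

```r
# Hypothetical sketch (not the package's code): move the target feature
# to the front of the list and rename it 'y'.
place_target <- function(feats, target = NA) {
  if (is.na(target)) return(feats)          # no target: list unchanged
  stopifnot(target %in% names(feats))
  out <- c(feats[target], feats[setdiff(names(feats), target)])
  names(out)[1] <- "y"
  out
}

# Integer vectors stand in for the xts matrices; names are made up.
feats <- list(f1 = 1:3, f2 = 4:6, f3 = 7:9)
res <- place_target(feats, target = "f2")
names(res)
```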

Some features are calculated from the daily prices whereas others are calculated from daily returns. Features calculated from prices include:

To reduce high frequency noise, the prices used for these may be averaged a priori using either an sma or ema function, specified using the 'smooth' argument.

In addition, some features are calculated from daily returns. Features calculated from returns include:

If the feature is specified in argument 'smooth', then the returns used to compute that feature are first smoothed using either an ema or sma filter. See argument 'smooth' for details.

It is also possible to smooth a feature after its computation. This has the effect of reducing high frequency noise while allowing high frequency anomalies to carry through in the raw calculation of the feature. If such post-processing on a feature is desired, then a post-processing tag is appended to the feature tag. See below for format.

Basic feature format

The format for specifying features and any postprocessing, if desired, is as follows: <feature><parameter><post-processing>, where:

For example, "sd5(sma10)" means to first take the 5 day rolling standard deviation, then smooth the results using a 10 day sma filter.
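The two steps in "sd5(sma10)" can be worked through on a toy series in base R. This is a sketch under assumptions, not the package's implementation. The helpers roll_sd and sma are hypothetical, and the input is a generic numeric vector because the page does not say whether sd is computed from prices or from returns.

```r
# Rolling standard deviation over a trailing window of n observations.
roll_sd <- function(x, n) {
  out <- rep(NA_real_, length(x))
  for (i in seq_along(x)) {
    if (i >= n) out[i] <- sd(x[(i - n + 1):i])
  }
  out
}

# Simple moving average over a trailing window of n observations.
# Any window that contains an NA yields NA.
sma <- function(x, n) {
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))
}

set.seed(42)
x <- cumsum(rnorm(30))          # toy series standing in for one asset
raw      <- roll_sd(x, 5)       # step 1: "sd5"
smoothed <- sma(raw, 10)        # step 2: "(sma10)" post-processing
```

With these trailing windows, the first complete value of the post-processed series appears at observation 14, since the first 4 values of the rolling standard deviation are missing and the filter needs 10 complete inputs.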

Complex features

Compounded features can also be created by multiplying and dividing basic features. This is achieved by using the dot to multiply two features, and the underscore to divide two features.
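The dot and underscore operators can be illustrated on two precomputed feature matrices. The evaluator eval_compound is hypothetical and handles a single operator only, because the page does not describe how longer expressions are evaluated. The feature names f1 and f2 are placeholders.

```r
# Hypothetical evaluator for a compound of exactly two basic features.
# basic: named list of equally sized numeric matrices (precomputed features).
eval_compound <- function(spec, basic) {
  op <- if (grepl(".", spec, fixed = TRUE)) "." else "_"
  parts <- strsplit(spec, op, fixed = TRUE)[[1]]
  stopifnot(length(parts) == 2, all(parts %in% names(basic)))
  a <- basic[[parts[1]]]
  b <- basic[[parts[2]]]
  if (op == ".") a * b else a / b
}

basic <- list(
  f1 = matrix(c(2, 4, 6, 8), ncol = 2),
  f2 = matrix(c(1, 2, 3, 4), ncol = 2)
)
prod_feat  <- eval_compound("f1.f2", basic)   # element-wise product
ratio_feat <- eval_compound("f1_f2", basic)   # element-wise ratio
```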

Formatting examples:

Value

A list containing as many xts matrices as there are features specified by the features argument and, optionally, the target argument. Each matrix has the same number of rows and columns as argument 'prices', with one column per asset. Each list item is named after its feature, except for the target feature, if specified, which is renamed to 'y'.

See Also

make_featuremat


jeanmarcgp/xtsanalytics documentation built on May 19, 2019, 12:38 a.m.