MODIFICATIONS July 2017
- Removed the -1 adjustment to the division operation (_) to ensure the division has no bias.
- Therefore, the addition and subtraction operations need to be implemented.
- Use + and - as operators, but these get lost when cbinding matrices.
- Therefore, modify xtsbind to enforce hard colnames binding so that + and - are kept.
make_features(prices, features, smooth = NA, on = "days",
  target = NA, by_symbol = FALSE)
prices
An xts matrix of daily asset prices, where each asset is a column and each matrix entry is the asset price from which the set of features will be computed.
features
A character vector naming the features to be calculated. See Details for the valid format.
smooth
A named list that specifies whether the prices should be smoothed (filtered) before calculating the named simple feature. This is useful because certain features are sensitive to high frequency noise. A simple feature that is part of a compounded feature can also be specified for smoothing, even if it does not appear as a standalone simple feature in the features vector. Unless a feature is explicitly named in argument 'smooth', no filtering is applied to it. The format of the list is as follows:
on
Specifies whether the computed features should be sampled at endpoints in the time series using function endpoints. Default is on = "days", which has no effect on the daily series. Other typical values could be "weeks", "months", "quarters" or "years". NOTE: THIS DOES NOT WORK YET, EXCEPT FOR DAILY SERIES (on = "days").
target
The feature name that will be used as the target for an ML model. It is computed like any other feature, except that it is placed as the first item in the returned list and renamed to 'y'. Default is NA, which means no target is specified.
by_symbol
Specifies the format of the features returned. If TRUE, a list of symbols is returned, each containing an xts of the features for that symbol. If FALSE, the opposite is returned: a list of features, each containing an xts of symbols.
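The two layouts selected by by_symbol hold the same data, only regrouped. The sketch below illustrates that regrouping on toy data. The feature names (sma5, sd5), the symbols (AAA, BBB) and the values are made up for illustration, and this is not the package's own code.

```r
library(xts)

# Toy "list of features" layout (by_symbol = FALSE): one xts per feature,
# one column per symbol. All names and values are hypothetical.
dates <- as.Date("2020-01-01") + 0:2
by_feature <- list(
  sma5 = xts(cbind(AAA = 1:3, BBB = 4:6), order.by = dates),
  sd5  = xts(cbind(AAA = 7:9, BBB = 10:12), order.by = dates)
)

# Regroup into a "list of symbols" layout (by_symbol = TRUE): one xts per
# symbol, one column per feature.
symbols <- colnames(by_feature[[1]])
by_symbol <- lapply(symbols, function(sym) {
  cols <- lapply(by_feature, function(f) f[, sym])
  out <- do.call(cbind, unname(cols))
  colnames(out) <- names(by_feature)  # restore the feature names as column names
  out
})
names(by_symbol) <- symbols
```

Here by_symbol$AAA is an xts with columns sma5 and sd5, while by_feature$sma5 is an xts with columns AAA and BBB.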
Generates a list of xts matrices, where each matrix in the list corresponds to one feature. The specified universe of assets should be an xts matrix of daily prices.
If one of the features corresponds to a target variable (y in a machine learning model), then specifying its name using the 'target' argument will automatically rename it to y and place it as the first item in the returned list. This is useful for building the feature matrix used by machine learning models with function make_featuremat.
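The effect of the 'target' argument on the returned list can be pictured with plain list manipulation. This is only a sketch: the list below is a hypothetical stand-in for the function's output (scalars instead of xts matrices), and the package may implement the reordering differently.

```r
# Hypothetical stand-in for a list of computed features
features <- list(sma5 = 1, mom10 = 2, sd5 = 3)
target <- "mom10"

# Move the target feature to the front, then rename it to "y"
others <- setdiff(names(features), target)
reordered <- c(features[target], features[others])
names(reordered)[1] <- "y"
```

After this, names(reordered) is "y", "sma5", "sd5", with the target's values stored under y.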
Some features are calculated from the daily prices whereas others are calculated from daily returns. Features calculated from prices include:
sma which uses the TTR::SMA function
ema which uses the TTR::EMA function
mom which uses the TTR::ROC function
To reduce high frequency noise, the prices used for these features may first be smoothed with either an sma or ema filter, specified using the 'smooth' argument.
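As a rough illustration of the price-based features, the sketch below calls the three TTR functions named above on a synthetic price series. The window sizes, the 3-day EMA pre-filter and the use of TTR::ROC's default (continuous) return type are assumptions made for the example; the package's actual settings are not documented here.

```r
library(xts)
library(TTR)

# Synthetic daily prices for a single asset (deterministic toy data)
dates <- as.Date("2020-01-01") + 0:29
price <- xts(100 + cumsum(sin(1:30)), order.by = dates)

# Price-based features on the raw prices
sma5  <- SMA(price, n = 5)    # 5-day simple moving average
ema5  <- EMA(price, n = 5)    # 5-day exponential moving average
mom10 <- ROC(price, n = 10)   # 10-day rate of change (log return by default)

# Same momentum feature, with the prices smoothed beforehand (assumed 3-day EMA)
price_smooth <- EMA(price, n = 3)
mom10_smooth <- ROC(price_smooth, n = 10)
```

Each result is an xts aligned with the input, with leading NA values for the rows where the rolling window is not yet full.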
In addition, some features are calculated from daily returns. Features calculated from returns include:
sd which uses the stats::sd function
rets2 calculates the square of daily returns (NOT IMPLEMENTED)
rets3 calculates the cube of daily returns (NOT IMPLEMENTED)
If the feature is specified in argument 'smooth', then the returns used to compute that feature are first smoothed using either an ema or sma filter. See argument 'smooth' for details.
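A returns-based feature such as sd can be sketched as a rolling application of stats::sd to daily returns. The example below assumes simple (discrete) daily returns, a 5-day window and a 3-day EMA pre-filter; these choices are illustrative and may differ from what the package does internally.

```r
library(xts)
library(zoo)
library(TTR)

# Synthetic daily prices for a single asset (deterministic toy data)
dates <- as.Date("2020-01-01") + 0:29
price <- xts(100 + cumsum(sin(1:30)), order.by = dates)

# Daily simple returns; the first row is NA
rets <- ROC(price, n = 1, type = "discrete")

# 5-day rolling standard deviation of the raw returns
sd5 <- rollapply(rets, width = 5, FUN = sd, align = "right", fill = NA)

# Same feature, with the returns smoothed first (assumed 3-day EMA)
rets_smooth <- EMA(rets, n = 3)
sd5_smooth <- rollapply(rets_smooth, width = 5, FUN = sd,
                        align = "right", fill = NA)
```

Smoothing the returns first lengthens the warm-up period, since both the filter and the rolling window need a full set of observations.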
It is also possible to smooth a feature after its computation. This has the effect of reducing high frequency noise while allowing high frequency anomalies to carry through in the raw calculation of the feature. If such post-processing on a feature is desired, then a post-processing tag is appended to the feature tag. See below for format.
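Post-processing applies the filter to the computed feature rather than to its inputs. A minimal sketch, assuming a 5-day rolling standard deviation of simple daily returns followed by a 10-day SMA; the data and window sizes are illustrative only.

```r
library(xts)
library(zoo)
library(TTR)

# Synthetic daily prices for a single asset (deterministic toy data)
dates <- as.Date("2020-01-01") + 0:39
price <- xts(100 + cumsum(sin(1:40)), order.by = dates)
rets  <- ROC(price, n = 1, type = "discrete")

# Step 1: raw feature, computed on unsmoothed returns
sd5 <- rollapply(rets, width = 5, FUN = sd, align = "right", fill = NA)

# Step 2: smooth the computed feature with a 10-day SMA
sd5_sma10 <- SMA(sd5, n = 10)
```

Because the raw feature is computed before any filtering, a short-lived spike in returns still enters sd5 at full size and is only averaged afterwards.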
The format for specifying features and any post-processing, if desired, is as follows: <feature><parameter><post-processing>, where:
feature is the name of the feature as described above,
parameter is a number indicating the rolling window size in days,
post-processing is an optional smoothing filter argument. It must be enclosed in parentheses to be valid. The filter specification is either an ema or sma followed by the filter period in days.
For example, "sd5(sma10)" means to first take the 5 day rolling standard deviation, then smooth the results using a 10 day sma filter.
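The tag format can be made concrete with a small parser. The helper below, parse_feature, is hypothetical and not part of the package; it assumes lowercase feature names followed by an integer window and an optional (sma<n>) or (ema<n>) suffix.

```r
# Hypothetical helper: split a simple feature tag into its components
parse_feature <- function(tag) {
  pattern <- "^([a-z]+)([0-9]+)(\\((sma|ema)([0-9]+)\\))?$"
  m <- regmatches(tag, regexec(pattern, tag))[[1]]
  if (length(m) == 0) stop("invalid feature tag: ", tag)
  list(
    feature = m[2],                                          # e.g. "sd"
    window  = as.integer(m[3]),                              # rolling window in days
    filter  = if (nzchar(m[5])) m[5] else NA_character_,     # "sma", "ema" or NA
    period  = if (nzchar(m[6])) as.integer(m[6]) else NA_integer_
  )
}

p <- parse_feature("sd5(sma10)")
q <- parse_feature("mom10")
```

Here p describes feature "sd" with window 5, filter "sma" and period 10, while q has no post-processing, so its filter and period are NA.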
Compounded features can also be created by multiplying and dividing basic features. This is achieved by using the dot to multiply two features, and the underscore to divide two features.
Formatting examples:
"sd5.mom10" is equivalent to: sd5 * mom10.
"mom5.sd5(ema5)_sd5(sma21)" is equivalent to: mom5 * ema(sd5, 5) / sma(sd5, 21)
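One way to read a compounded tag is to split it into operands and operators and then combine the operands from left to right. The sketch below does this with base R only. The combine helper is hypothetical, left-to-right evaluation is an assumption consistent with the example above, and scalar values stand in for the feature matrices.

```r
tag <- "mom5.sd5(ema5)_sd5(sma21)"

# Operands are the simple feature tags; operators are "." (multiply) and "_" (divide)
operands  <- strsplit(tag, "[._]")[[1]]
operators <- regmatches(tag, gregexpr("[._]", tag))[[1]]

# Hypothetical helper: combine values from left to right
combine <- function(values, operators) {
  result <- values[[1]]
  for (i in seq_along(operators)) {
    rhs <- values[[i + 1]]
    result <- if (operators[i] == ".") result * rhs else result / rhs
  }
  result
}

# Stand-in values for mom5, ema(sd5, 5) and sma(sd5, 21)
value <- combine(list(2, 6, 3), operators)
```

With these stand-in values the result is 2 * 6 / 3. The same loop works elementwise on xts matrices of equal dimensions.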
A list containing as many xts matrices as there are features specified by the features argument and, optionally, the target argument. Each matrix has the same number of rows and columns as argument 'prices', with one column per asset. Each list item is named after its feature (except for the target feature, if specified, which is renamed to 'y').