View source: R/assemble_factor.R
The function assemble_factor
retrieves time-series from a single
csv-formatted source file to assemble factors in their final form. The source
file must correspond to (a) a valid entry in the internal catalog
contained in the package factorr or (b) a valid entry in the
bindr::derived_catalog registry object controlling factors built with
algebraic and/or econometric manipulations.
assemble_factor(
  nm = NA,
  src_hdl,
  asset,
  trade = 1,
  src_dir = NA,
  arg_supp = list(),
  is_built = FALSE
)
nm | A string representing the factor name.
src_hdl | A string representing the source handle. See details.
asset | A string or vector of strings indicating which asset(s) to source. See details.
trade | An integer or vector of integers, each either +1 or -1, indicating a long (+1) or short (-1) position. See details.
src_dir | A string representing the path of an existing directory where the csv-formatted files reside.
arg_supp | A list of supplementary arguments. See details.
is_built | Logical value indicating whether the factor being assembled was generated by the function bindr::build_derived_factor(). See details.
The parameter nm should preferably follow the R naming convention.
Note that the function internally enforces the R naming rules by calling
nm <- make.names(nm), which may produce a different name from the
user-supplied one. See the base::make.names documentation for details
about the R naming convention.
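As a brief illustration of how make.names() can alter a user-supplied name, the sketch below applies it to a few candidate factor names. The names themselves are made up for this example; only base::make.names is used.

```r
# Candidate factor names (hypothetical, for illustration only).
candidates <- c("value", "my factor", "1st_factor")

# make.names() leaves valid names untouched, replaces invalid characters
# with "." and prepends "X" when a name starts with a digit.
cleaned <- make.names(candidates)

print(data.frame(supplied = candidates, enforced = cleaned))
```

Choosing a syntactically valid name up front avoids any surprise when looking up the factor column in the returned object.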
The parameter src_hdl must be a valid internal catalog entry. It
controls the handle from which the time-series will be sourced. The function
factorr::catalog_do('show') generates the list of available handles
(see column hdl), along with a short description and the original
data source (e.g. Kenneth French Library, Federal Reserve Bank of St.
Louis). An error is generated if an invalid catalog entry is supplied.
Alternatively, if the parameter src_hdl points to a derived
factor, it must map to a valid derived catalog entry. As in the
case above, src_hdl controls the handle from which the time-series
will be sourced. The function bindr::derived_catalog_do('src_hdl')
displays a table of valid entries suitable for the parameter src_hdl.
An error is generated if an invalid derived catalog entry is
supplied.
It follows from the above remarks that the parameter src_hdl
can be checked internally against two different catalogs, contained either in
package factorr or in package bindr. The parameter
is_built activates an internal dispatch mechanism that routes the
src_hdl parameter to the appropriate catalog. Any derived factor
(i.e. produced by calling build_derived_factor()) must have
is_built == TRUE to be routed against the internal derived
catalog object. Failure to do so will generate an error.
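The routing described above can be pictured with a minimal sketch. It is not the package's implementation: the two character vectors are hypothetical stand-ins for the factorr and bindr catalogs, and check_src_hdl() is a helper invented for this illustration.

```r
# Hypothetical stand-ins for the two catalogs (not the real package objects).
main_catalog    <- c("FF_3F_US_M", "FF_OP_US_M")
derived_catalog <- c("INFLATION__naive__US_M")

# Illustrative helper: pick the catalog according to is_built,
# then fail if the handle is not listed there.
check_src_hdl <- function(src_hdl, is_built = FALSE) {
  catalog <- if (isTRUE(is_built)) derived_catalog else main_catalog
  if (!src_hdl %in% catalog) {
    stop("invalid catalog entry: ", src_hdl)
  }
  invisible(TRUE)
}

check_src_hdl("FF_3F_US_M")                               # passes
check_src_hdl("INFLATION__naive__US_M", is_built = TRUE)  # passes

# A derived handle checked without is_built = TRUE is rejected.
res <- tryCatch(check_src_hdl("INFLATION__naive__US_M"),
                error = function(e) "error")
```

In this sketch a derived handle is only ever looked up in the derived catalog, which is why omitting is_built = TRUE leads to an error.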
The parameter asset determines which variables will be selected from
the source file. The function
factorr::catalog_do('show_hdl_names', hdl = src_hdl), where src_hdl is a
valid catalog entry, generates a tibble object containing all the variable
names associated with a given src_hdl. An error is generated if asset does
not exist in the source file.
Alternatively, if the parameter src_hdl points to a derived factor, the parameter asset still determines which variables will be selected from the source file. However, there is no function to generate a tibble object containing all the variable names associated with a given src_hdl. The user must instead consult the associated audit file or inspect the corresponding csv-formatted file.
Factor time-series are assembled either from a single time-series or from a linear combination of time-series. The former case amounts to extracting asset from the existing source src_hdl and naming the resulting factor nm. The latter case generally involves taking two variables (asset is a string vector) from src_hdl and combining them into long and short positions. In this case, trade is an integer vector whose elements are either +1 or -1, representing a long or a short position, respectively. See examples below. Note that this package currently supports only linear combinations with trade parameters set to either +1 or -1.
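The linear combination can be sketched in base R as follows. The data values are made up, and the matrix product is only one plausible way to express the long/short combination; it is not necessarily how assemble_factor() computes it internally.

```r
# Toy source table standing in for a csv-formatted source file (made-up values).
src <- data.frame(
  Lo.10 = c(0.5, -1.0, 2.0),
  Hi.10 = c(1.5,  0.5, 1.0)
)

asset <- c("Lo.10", "Hi.10")
trade <- c(-1, 1)   # short the lowest decile, long the highest decile

# Each row of the factor is sum(trade * asset values), i.e. Hi.10 - Lo.10 here.
profit <- as.vector(as.matrix(src[, asset]) %*% trade)
print(profit)
```

With a single asset and trade = 1, the same expression reduces to extracting that one column unchanged.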
Note that an assembly request has no additional constraint besides the
existence of a file containing all the required inputs. This leaves
some latitude to build different versions of the same factor. For
instance, the 'Quality' factor (e.g. operating profitability) can be
built using deciles or can alternatively be constructed with quintiles. The
latitude in defining the factor assembly does not include cases where the
required series are located in different files. Such a case would necessitate
a dedicated function called by bindr::build_derived_factor(). See
below for additional details.
The latitude in designing factor expressions is afforded mostly for exploratory purposes. In particular, factor models are 'locked' to control their design and maintain their integrity. As a direct consequence, a user can't modify an existing factor model by toggling between different factor expressions. Instead, a user exploring the impact of variations in factor expression would have to get the factor model output (typically a tibble/table object) and affix the factor variant. However, the factor model audit file would still clearly document the original factor model and implicitly confirm any deviation in factor definition.
The parameter src_dir must be a valid and existing directory. An error
is generated if either one of these conditions is not satisfied. The
combination of src_dir and src_hdl identifies the source file
location and name. An error is generated if this combination points to a
non-existent file object. Note also that both parameters can't have multiple
instances, which implies that the assembly process must operate on a
single file to combine its required series. Should a factor require
inputs located in separate files, the function
bindr::build_derived_factor() should be used instead.
Additional variables (in list arg_supp) can be requested from the source file provided that they exist. The typical use involves year, month or date. An error is generated if any element of the list does not exist in the source file. Note that the returned tibble object puts arg_supp first, then nm. See examples below.
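The column ordering and the existence check can be illustrated with a small base R sketch. The table and its values are invented, a plain data.frame stands in for the tibble returned by the real function, and the selection logic is an assumption about the behaviour described above rather than the package's code.

```r
# Toy source table standing in for a csv-formatted source file (made-up values).
src <- data.frame(
  year  = c(2020, 2020),
  month = c(1, 2),
  hml   = c(0.4, -0.2),
  smb   = c(0.1, 0.3)
)

arg_supp <- list("year", "month")
asset    <- "hml"
nm       <- "value"

# Every supplementary variable must exist in the source table.
wanted       <- unlist(arg_supp)
missing_vars <- setdiff(wanted, names(src))
if (length(missing_vars) > 0) {
  stop("not found in source file: ", paste(missing_vars, collapse = ", "))
}

# Supplementary variables come first, followed by the factor renamed to nm.
out <- src[, c(wanted, asset)]
names(out)[ncol(out)] <- nm
print(names(out))
```

Requesting a variable that is absent from the table, such as 'date' here, would trigger the stop() branch.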
A tibble object comprised of arg_supp and nm time-series, in that order. See details.
## Not run:
# Value factor from Fama-French 3-Factor US:
# Long position in 'hml':
assemble_factor(nm = 'value',
                src_dir = '~/.../Factor Warehouse/Uncompressed',
                src_hdl = 'FF_3F_US_M', asset = 'hml',
                trade = 1, arg_supp = list('year', 'month'))
## End(Not run)

## Not run:
# Fama-French Operating Profitability US:
# Short position in the lowest decile and long position
# in the highest decile:
assemble_factor(nm = 'profit',
                src_dir = '~/.../Factor Warehouse/Uncompressed/',
                src_hdl = 'FF_OP_US_M', asset = c('Lo.10', 'Hi.10'),
                trade = c(-1, 1), arg_supp = list('year', 'month'))
## End(Not run)

## Not run:
# Inflation factor from econometric model (hence is_built = TRUE):
# Long position in 'shock':
assemble_factor(nm = 'inflation',
                src_dir = '~/.../Factor Warehouse/Uncompressed/',
                src_hdl = 'INFLATION__naive__US_M', asset = 'shock',
                trade = 1, arg_supp = list('year', 'month'), is_built = TRUE)
## End(Not run)