knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
Estimation of covariance matrix via factor models with application to financial data. Factor models decompose the asset returns into an exposure term to some factors and a residual idiosyncratic component. The resulting covariance matrix contains a low-rank term corresponding to the factors and another full-rank term corresponding to the residual component.
This package provides a function to separate the data into the factor component and residual component, as well as to estimate the corresponding covariance matrix. Different kind of factor models are considered, namely, macroeconomic factor models and statistical factor models. The estimation of the covariance matrix accepts different kinds of structure on the residual term: diagonal structure (implying that residual component is uncorrelated) and block diagonal structure (allowing correlation within sectors). The package includes a built-in database containing stock symbols and their sectors.
The package is based on the book: R. S. Tsay, Analysis of Financial Time Series. John Wiley & Sons, 2005.
# Installation from CRAN (not available yet) #install.packages("covFactorModel") # Installation from GitHub # install.packages("devtools") devtools::install_github("dppalomar/covFactorModel") # Getting help library(covFactorModel) help(package = "covFactorModel") package?covFactorModel ?factorModel ?covFactorModel ?getSectorInfo
For more detailed information, please check the vignette: GitHub-html-vignette and GitHub-pdf-vignette.
factorModel()
The function factorModel()
builds a factor model for the data, i.e., it decomposes the asset returns into a factor component and a residual component. The user can choose different types of factor models, namely, macroeconomic, BARRA, or statistical. We start by generating some synthetic data:
library(covFactorModel) library(xts) library(MASS) # generate synthetic data set.seed(234) N <- 3 # number of stocks T <- 5 # number of samples mu <- rep(0, N) Sigma <- diag(N)/1000 # generate asset returns TxN data matrix X <- xts(mvrnorm(T, mu, Sigma), order.by = as.Date('2017-04-15') + 1:T) colnames(X) <- c("A", "B", "C") # generate K=2 macroeconomic factors econ_fact <- xts(mvrnorm(T, c(0, 0), diag(2)/1000), order.by = index(X)) colnames(econ_fact) <- c("factor1", "factor2")
We first build a macroeconomic factor model, which fits the data to the given macroeconomic factors:
macro_econ_model <- factorModel(X, type = "Macro", econ_fact = econ_fact) # sanity check X_ <- with(macro_econ_model, matrix(alpha, T, N, byrow = TRUE) + factors %*% t(beta) + residual) norm(X - X_, "F")
Next, we build a BARRA industry factor model (assuming assets A and C belong to sector 1 and asset B to sector 2):
stock_sector_info <- c(1, 2, 1) barra_model <- factorModel(X, type = "Barra", stock_sector_info = stock_sector_info) # sanity check X_ <- with(barra_model, matrix(alpha, T, N, byrow = TRUE) + factors %*% t(beta) + residual) norm(X - X_, "F")
Finally, we build a statistical factor model, which is based on principal component analysis (PCA):
# set factor dimension as K=2 stat_model <- factorModel(X, K = 2) # sanity check X_ <- with(stat_model, matrix(alpha, T, N, byrow = TRUE) + factors %*% t(beta) + residual) norm(X - X_, "F")
covFactorModel()
The function covFactorModel()
estimates the covariance matrix of the data based on factor models. The user can choose not only the type of factor model (i.e., macroeconomic, BARRA, or statistical) but also the structure of the residual covariance matrix (i.e., scaled identity, diagonal, block diagonal, and full).
We start by preparing some synthetic data:
library(covFactorModel) library(xts) library(MASS) # generate synthetic data set.seed(234) K <- 1 # number of factors N <- 400 # number of stocks mu <- rep(0, N) beta <- mvrnorm(N, rep(1, K), diag(K)/10) Sigma <- beta %*% t(beta) + diag(N) print(eigen(Sigma)$values[1:10])
Then, we simply use function covFactorModel()
(by default it uses a statistical factor model and a diagonal structure for the residual covariance matrix). We show the average error w.r.t number of observations:
# estimate error by loop err_scm_vs_T <- err_statPCA_diag_vs_T <- c() index_T <- N*seq(5) for (T in index_T) { X <- xts(mvrnorm(T, mu, Sigma), order.by = as.Date('1995-03-15') + 1:T) # use statistical factor model cov_statPCA_diag <- covFactorModel(X, K = K, max_iter = 10) err_statPCA_diag_vs_T <- c(err_statPCA_diag_vs_T, norm(Sigma - cov_statPCA_diag, "F")^2) # use sample covariance matrix err_scm_vs_T <- c(err_scm_vs_T, norm(Sigma - cov(X), "F")^2) } res <- rbind(err_scm_vs_T, err_statPCA_diag_vs_T) rownames(res) <- c("SCM", "stat + diag") colnames(res) <- paste0("T/N=", index_T/N) print(res)
getSectorInfo()
The function getSectorInfo()
provides sector information for a given set of stock symbols. The usage is rather simple:
library(covFactorModel) mystocks <- c("AAPL", "ABBV", "AET", "AMD", "APD", "AA","CF", "A", "ADI", "IBM") getSectorInfo(mystocks)
The built-in sector database can be overidden by providing a stock-sector pairing:
my_stock_sector_database <- cbind(mystocks, c(rep("sector1", 3), rep("sector2", 4), rep("sector3", 3))) getSectorInfo(mystocks, my_stock_sector_database)
Package: GitHub.
README file: GitHub-readme.
Vignette: GitHub-html-vignette and GitHub-pdf-vignette.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.