knitr::opts_knit$set(
  root.dir = normalizePath('..')
)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  echo = TRUE
)

# Load libraries
library(xts)
library(plm)
# library(FactorAnalytics)
# Mentors, please feel free to add yourself to the authors' field if you wish. 

Introduction

Value intro

Data

Get data from: https://github.com/JustinMShea/ExpectedReturns/blob/master/inst/parsers/Devil-in-HML-Details.R

Models

Fama French HML Benchmark

Use the data above to run the standard FF model and save the results

HML Devil in the Details

In the following we briefly review @asness-frazzini-2013 (AF hereafter) and reproduce this work thanks to replication data sets publicly available. Data are updated and maintained by AQR (www.aqr.com). As will be clear from further discussions, for their analysis authors use data until 2010 or 2011, depending on the specific value measures being constructed, compared, and analyzed. Updated data sets go beyond the time frame originally covered in the paper, the most recent data are until May, 31, 2020. First observations dates vary by country, the United States have the longer data series in the sample with the first observation reported on July, 31, 1926.

Value measures

Introducing the value measures contemplated in this work, the first value measure studied is the standard practice pioneered by @fama-french-1992. It is the most widely used measure in the literature and as such it is also referred to as the "standard method". The Fama-French value measure is defined $$ bp_{t}^{a,l} = \ln(B/P_{fye}) $$ where $B$ is the book value per share and $P_{fye}$ indicates the price variable at the end of the fiscal year (rather than in June), both quantities are expressed in local currency. Note that superscripts $a$ and $l$ indicate respectively that this measure is annual (it is updated once a year) and lagged (prices are six to 18 months prior to date, not current prices). The $B/P$ ratio, at times denoted $BM$ is commonly known as the book-to-market ratio.

The second and third values measures, original of this work, are slight modifications of the standard method. As we will outline in further sections, these measures indeed resemble most of the authors research focus: should analysts or investors use lagged price in constructing valuation ratios? The measures, expressed in local currency, are $$ bp_{t}^{a,c} = \ln(B^/P_t) $$ and $$ bp_{t}^{m,c} = \ln(B^/P_t) $$ The superscripts have the following meaning, the former is updated annually and computed using current prices ($c$ superscript), whereas the latter is update monthly ($m$ superscript) and computed using most recent prices as per the June 30 rebalance date ($c$, for current, superscript). In both the two above measures $B^$ stands for the firms' book value per share (or market equity value) that has been adjusted for splits, dividends, and other corporate actions between fiscal year-end and portfolio formation dates. In other words, we can express this financial variable by means of an adjustment ratio: $$ B^ = B\frac{Adj_t}{Adj_{fye}} $$ with $Adj$ the cumulative adjustment factor, which adjusts between the fiscal year-end and the current date $t$.

All the three measures above are related to one another by the following relations, $$ bp_{t}^{a,c} = bp_{t}^{a,l} - r_{fye,t} $$ $$ bp_{t}^{m,c} = bp_{t}^{a,c} - r_{t,t+k} $$ where $$ r_{t,s} = \ln(1 + R_{t,s}) $$ is the total log return between date $t$ and $t < s$.

Portfolio construction

<!-- I have already re-elaborated and shared more or less all about Fama-French Three-factor definitions and construction from their website. We can: - simply retrieve those definitions here as well; - discuss them only in the usual Fama-French model in its own section above (preferred). Anyways, this is precisely why that method was used and time invested in doing so. NOTE: This holds for U.S. calculation based factors, not for the International sample. For the latter I'll add details and will be straightforward as well thereafter. (see here meanwhile: http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html#International)

The $HML^{devil}$ factor introduced by the authors is constructed from value-weighted portfolios with size/book-to-market breakpoints updated every calendar month. Portfolios are rebalanced every calendar month to maintain weights. The factor is defined as the average return on the two values portfolios minus the average return on the two growth portfolios, $$ HML^{devil} = \frac{1}{2}[(\textrm{Small Value} + \textrm{Big Value}) - (\textrm{Small Growth} + \textrm{Big Growth})] $$

What Proxies Best for the True Unobservable Book-to-market ratio?

The authors propose a less lagged version of the standard $BP_{t-1}^{a,l}$, which they define $$ BP_{t-1}^{a,c} = \frac{B_{t-1}}{P_{t,(June)}} $$

"We show that using a more-current price is superior to the standard method of using prices at fiscal year-end as a proxy for the true B/P ratio, and superior in five-factor model regressions." - Asness-Frazzini (2013, p. 49)

"Does the fallen price make this more likely, less likely, or have no effect on whether this should be considered a value stock? The answer depends on how much variation in B/P ratios is due to expected returns and how much is due to changes in future book values. Our findings show that true value stocks often show such price drops, and a measure that takes this fall into consideration, as our proposed method does, is superior to one that ignores it, as the standard method does." - Asness-Frazzini (2013, pp. 49-50)

Which Book-to-market ratio Form Better Value Portfolios?

Country portfolios are aggregated into international and global portfolios using the country's total market capitalization as of the prior month. Further details are provided by @asness-frazzini-2013 in Exhibit A1. In the construction of their original "International sample" (or "Global ex US"), @asness-frazzini-2013 included the following countries: Austria, Australia, Belgium, Canada, Switzerland, Germany, Denmark, Spain, Finland, France, Italy, Japan, Netherlands, Norway, New Zealand, Singapore, Sweden, and United Kingdom. Latest data provided by AQR, extends this list of countries with Ireland, Israel, and Portugal. We study the full panel of countries in the newest International sample, restricting attention to the original sample would certainly be possible but requires explicit weighting of country series. The "Global sample" includes the United States of America as well.

## Run Justin-Erol's parser for Asness-Frazzini (2013) replication data set
parser.path <- system.file('parsers',        
                           'Value--Devil-in-HMLs-Details.R',
                           package='ExpectedReturns')
source(parser.path)

# NOTE: In the paper sample period is 1950-2011 for US, 1983-2011 for Intl. Sample
# NOTE: MKT from AQR is already in excess of 1-month T-bill returns

# NOTE: Given data available STR is only possible to include with respect to the
# US sample. This factor is from French's data library and only contemplates the US.
# Authors omit to provide this factor altogether (see note 18).

### US factors data ###
# MKT.RF  <- HML_Devil.MKT$USA
# SMB     <- HML_Devil.SMB$USA
# HML     <- HML_Devil.HML_FF$USA
# HML.DEV <- HML_Devil.HML.DEV$USA
# UMD     <- HML_Devil.UMD$USA
# REV.ST <- ExpectedReturns::GetFactors('REV', freq='monthly', term='ST') 
# STR <- data.frame(DATE=index(REV.ST) - 1, STR=coredata(REV.ST)) # TODO: recheck "arbitrary" B-o-M 
# STR <- STR[which(STR$DATE %in% FACTORS.DATE), 'REV.ST']
# 
# max.tp <- min(length(FACTORS.DATE), length(REV.ST))
# factors.data <- data.frame(
#   DATE=FACTORS.DATE[1:max.tp],
#   MKT.RF=MKT.RF[1:max.tp],
#   SMB=SMB[1:max.tp],
#   HML=HML[1:max.tp],
#   UMD=UMD[1:max.tp],
#   STR=STR[1:max.tp]/100, # TODO: STR here is in decimal unit 
#   HML.DEV=HML.DEV[1:max.tp]
# )

### International factors data ###
# NOTE: Unfortunately, authors do not distribute any such "Intl. STR".
# MKT.RF.INTL  <- HML_Devil.MKT$`Global Ex USA`
# SMB.INTL     <- HML_Devil.SMB$`Global Ex USA`
# HML.INTL     <- HML_Devil.HML_FF$`Global Ex USA`
# HML.DEV.INTL <- HML_Devil.HML.DEV$`Global Ex USA`
# UMD.INTL     <- HML_Devil.UMD$`Global Ex USA`
# min.tp.intl <- min(
#   sapply(
#     list(MKT.RF.INTL, SMB.INTL, HML.INTL, HML.DEV.INTL, UMD.INTL), 
#     function(x) {
#       min(which(complete.cases(x)))
#     }
#   )
# )
# max.tp.intl <- min(
#   sapply(
#     list(MKT.RF.INTL, SMB.INTL, HML.INTL, HML.DEV.INTL, UMD.INTL), 
#     length
#   )
# )
# factors.data.intl <- data.frame(
#   DATE=FACTORS.DATE[min.tp.intl:max.tp.intl],
#   MKT.RF.INTL=MKT.RF.INTL[min.tp.intl:max.tp.intl],
#   SMB.INTL=SMB.INTL[min.tp.intl:max.tp.intl],
#   HML.INTL=HML.INTL[min.tp.intl:max.tp.intl],
#   HML.DEV.INTL=HML.DEV.INTL[min.tp.intl:max.tp.intl],
#   UMD.INTL=UMD.INTL[min.tp.intl:max.tp.intl]
# )

# Extract AQR Factors Data
##############################################################################################
##############################################################################################
 # Bryan - experimenting with HML.DEV data to work through the error of no DATE column, non matching column names, respectively.
##############################################################################################
##############################################################################################
# factors_date <- data.frame(DATE=index(HML_Devil.HML.DEV), coredata(HML_Devil.HML.DEV))
# dataxts <- xts(factors_date, order.by = factors_date[,1])  
  # as.data.table(HML_Devil.HML.DEV)
# HML_Devil.HML.DEV <- data.frame(DATE=index(HML_Devil.HML.DEV), coredata(HML_Devil.HML.DEV))
# HML_Devil.HML.DEV <- xts(HML_Devil.HML.DEV, order.by = HML_Devil.HML.DEV[,1])
# colnames(HML_Devil.HML.DEV)[27] <- "Global Ex USA"

# HML_Devil.UMD <- data.frame(DATE=index(HML_Devil.UMD), coredata(HML_Devil.UMD))
# HML_Devil.UMD <- xts(HML_Devil.UMD, order.by = HML_Devil.UMD[,1])
# colnames(HML_Devil.UMD)[27] <- "Global Ex USA"
# NEED TO ADD DATE COLUMN SAME WAY I DID FOR HML.DEV ON LINE 290
# HML_
# HML_Devil.HML.DEV$DATE <- as.Date(HML_Devil.HML.DEV$DATE, format = "%Y/%m/%d") # may not work with xts zoo objects - come back to try and fix line 286 WARNING message. 
#############################################################################################
#############################################################################################


FACTORS.DATE <- index(HML_Devil.HML.DEV)
regions.aqr <- c('USA', 'Global Ex USA', 'Global')
regions <- c('US', 'INTL', 'GLOBAL')
factors.vars <- c('MKT.RF', 'SMB', 'HML', 'HML.DEV', 'UMD')
nf <- length(factors.vars)
nr <- length(regions)
MKT.RF <- SMB <- HML <- HML.DEV <- UMD <- matrix(
 NA, length(FACTORS.DATE), length(regions)
)
MKT.RF  <- HML_Devil.MKT[, regions.aqr]
SMB     <- HML_Devil.SMB[, regions.aqr]
HML     <- HML_Devil.HML_FF[, regions.aqr]
HML.DEV <- HML_Devil.HML.DEV[, regions.aqr]
# UMD obs start at later date. 
start.umd.idx <- min(which(FACTORS.DATE %in% index(HML_Devil.UMD)))
UMD[start.umd.idx:length(FACTORS.DATE), ] <- as.matrix(HML_Devil.UMD[, regions.aqr])
UMD <- as.data.frame(UMD)
colnames(UMD) <- regions.aqr

#Convert UMD to an xts object and add a date index

UMD <- xts(UMD, order.by = FACTORS.DATE)



# Merge Factors Data
factors <- list(MKT.RF=MKT.RF, SMB=SMB, HML=HML, HML.DEV=HML.DEV, UMD=UMD)
factors.data <- list()
factors.by.region <- matrix(NA, nf, nr)
for (f in 1:nf) {
  factors.by.region[f, ] <- paste(factors.vars[f], regions, sep='.')
  colnames(factors[[f]]) <- factors.by.region[f, ]
  factors[[f]] <- cbind(factors[[f]], 'DATE.ID'=dimnames(factors[[f]])[[1]])
  factors.data[[f]] <- factors[[f]]
}
names(factors.data) <- factors.vars
factors.data <- Reduce(function(...) {
  merge(..., by='DATE.ID', all=TRUE)
}, factors.data)
factors.data <- data.frame(
  DATE=FACTORS.DATE,
  factors.data
)

factors.data = subset(factors.data, select = -c(by, by.1, by.2, by.3))
# NOTE: We only have stocks data for the US available. Also, is monthly data.
#       So we can only attempt to replicate parts of the paper within these settings for now.
# NOTE: Authors have both a large assets universe and long time-series.
#       I picked "stocksCRSPscoresSPGMIraw" because, although it has less stocks 
#       than "Stock.df" (300 assets), it spans a longer period of time (1993-01-31 to 2015-12-31)
#       within that of interest.
# data("stocksCRSPscoresSPGMIraw")
# vars.keep <- c('Date', 'Ticker', 'Size', 'MPrc', 'MRetx') # assuming "MRetx" is stock excess returns
# assets.data <- as.data.frame(stocksCRSPscoresSPGMIraw[, ..vars.keep])
# colnames(assets.data) <- toupper(vars.keep)
# US and INTL sample observations dates in AF paper
# US.SAMPLE.DATE.PAPER <- seq.Date(as.Date('1950-02-01'), as.Date('2012-01-01'), by='month') - 1
# INTL.SAMPLE.DATE.PAPER <- seq.Date(as.Date('1984-02-01'), as.Date('2012-01-01'), by='month') - 1
# Dates respecting time constraints
# dates.us <- as.Date(intersect(factors.data$DATE, US.SAMPLE.DATE.PAPER))
# dates.intl <- as.Date(intersect(factors.data.intl$DATE, INTL.SAMPLE.DATE.PAPER))
# exposure.vars <- c('MKT.RF', 'SMB', 'HML', 'UMD', 'STR', 'HML.DEV')
# fit.test.fa <- fitFfm(data=data, asset.var="TICKER", ret.var="MRETX", date.var="DATE", 
#                       exposure.vars=exposure.vars, addIntercept=TRUE)
# TODO: all 'MRETX' missing are for VLO during 1993--1997. Was it delisted?
# data[which(is.na(data[, 'MRETX'])), ]

The only $STR$ data available is with respect to the US. We were, and are, obtaining this data from Prof. Kenneth R. French's data library and thus it only contemplates the US. Admittedly, authors omit to provide this factor data series altogether in their replication data sets (see note 18 for a motivation). This is, after all, consistent with their analysis, there isn't anything "wrong" per se. However, if we are to compare results among US, International, and Global sample, it would - of course - be unfeasible to make such a comparison while including the $STR$ for the US only.

# Build time-series regressions formulas
regrds <- matrix(
  c('HML.US', 'HML.DEV.US', 
    'HML.INTL', 'HML.DEV.INTL', 
    'HML.GLOBAL', 'HML.DEV.GLOBAL'), 
  ncol = 2, byrow = TRUE
)
hml.regrs <- hml.dev.regrs <- matrix(NA, nr, 4)
for (j in 1:nr) {
  factors.combn <- combn(factors.by.region[, j], 4)
  i <- j
  # Without "HML.*"
  hml.regrs[i, ] <- factors.combn[, 
    which(!is.na(match(colSums(factors.combn == regrds[i, 1]), 0)))
  ]
  # Without "HML.DEV.*"
  hml.dev.regrs[i, ] <- factors.combn[, 
    which(!is.na(match(colSums(factors.combn == regrds[i, 2]), 0)))
  ]
}
n.hml <- nrow(hml.regrs)
n.hml.d <- nrow(hml.dev.regrs)

# Run HML^{a,c} time-series regressions
ts.regs.hml <- lapply(1:n.hml, function(i) {
  data <- factors.data[, c(regrds[i, 1], hml.regrs[i, ], 'DATE')]
  y <- colnames(data)[1]
  X <- paste(hml.regrs[i, ], collapse='+')
  model.formula <- formula(paste(y, X, sep='~'))
  plm::plm(
    model.formula, data=data,
    model='pooling', index='DATE'
  )
})
## Beta estimates
lapply(ts.regs.hml, summary)
betas.hml <- lapply(ts.regs.hml, coef)
betas.hml <- Reduce(rbind, betas.hml)
colnames(betas.hml) <- c(
  'Alpha', paste('BETA', factors.vars[factors.vars!="HML"], sep='.')
)
row.names(betas.hml) <- regrds[, 1]
knitr::kable(t(betas.hml))

# Run HML^{Devil}, i.e. HML^{m,c}, time-series regressions
ts.regs.hml.dev <- lapply(1:n.hml.d, function(i) {
  data <- factors.data[, c(regrds[i, 2], hml.dev.regrs[i, ], 'DATE')]
  y <- colnames(data)[1]
  X <- paste(hml.dev.regrs[i, ], collapse='+')
  model.formula <- formula(paste(y, X, sep='~'))
  plm::plm(
    model.formula, data=data,
    model='pooling', index='DATE'
  )
})
names(ts.regs.hml.dev) <- paste(tolower(regrds[, 2]), 'reg', sep='.')
hml.dev.summaries <- lapply(ts.regs.hml.dev, summary)
## Beta estimates
betas.hml.dev <- lapply(ts.regs.hml.dev, coef)
names(betas.hml.dev) <- paste(tolower(regrds[, 2]), 'reg.coef', sep='.')
betas.hml.dev <- Reduce(rbind, betas.hml.dev)
colnames(betas.hml.dev) <- c(
  'Alpha', paste('BETA', factors.vars[factors.vars!="HML.DEV"], sep='.')
)
row.names(betas.hml.dev) <- regrds[, 2]
knitr::kable(t(betas.hml.dev))

Value and momentum interaction

The paper also highlights important aspects on the dynamics between value and momentum strategies. In section 3 authors add that:

"...in the presence of momentum, and for logical reasons having to do with the overlap of our value measure and the period used to form momentum, our more timely value measure also outperforms the more standard lagged measure." - Asness-Frazzini (2013, p. 56)

Additional value factors

Inspired by the recent Cliff Assness discussion and AQR paper by the same title, Let's model the value factors he lists. We have some of these factors already in FactorAnaltyics data sets.

https://www.aqr.com/Insights/Perspectives/Is-Systematic-Value-Investing-Dead

Presenting results

Print resulting model estimates in a table, comparing estimates and fit. See Engle for inspiration.

References



JustinMShea/ExpectedReturns documentation built on Sept. 9, 2023, 9:41 p.m.