knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  echo = FALSE,
  cache = TRUE
)
require(quantstrat)
require(knitr)
require(pander)
panderOptions("digits", 2)

source('tiingoapi.R')
# this file contains only one line:
# setDefaults(getSymbols.tiingo, api.key='MYAPIKEY')

require(doParallel) 
registerDoParallel() 

options("getSymbols.warning4.0"=FALSE)
options("getSymbols.yahoo.warning"=FALSE)

Introduction

Who is this Guy? | Proprietary/Principal Trading { .smaller }

Brian Peterson:

Proprietary Trading:

Backtesting, art or science?

 

 

Back-testing. I hate it - it's just optimizing over history. You never see a bad back-test. Ever. In any strategy. - Josh Diedesch (2014), CalSTRS

 

 

Every trading system is in some form an optimization. - Emilio Tomasini [-@Tomasini2009]

Moving Beyond Assumptions

Many system developers consider

"*I hypothesize that this strategy idea will make money*"

to be adequate.

Instead, strive to:

Constraints and Objectives { .smaller }

Constraints

Benchmarks

Objectives

Building a Hypothesis {.columns-2 .smaller }

  knitr::include_graphics("hypothesis_process.png") 

 

To create a testable idea (a hypothesis):

 

Good, complete hypothesis statements include:

Tools

R in Finance trade simulation toolchain

  knitr::include_graphics("toolchain.png") 

Building Blocks {.columns-2 .smaller }

  knitr::include_graphics("building_blocks.png") 

Filters

Indicators

Signals

Rules

Installing blotter and quantstrat

install.packages('devtools') # if you don't have it installed
install.packages('PerformanceAnalytics')
install.packages('FinancialInstrument')

devtools::install_github('braverock/blotter')
devtools::install_github('braverock/quantstrat')

Our test strategy - MACD {.columns-2 .smaller}

stock.str <- 'EEM'

currency('USD')
stock(stock.str,currency='USD',multiplier=1)

startDate='2003-12-31'
initEq=100000
portfolio.st='macd'
account.st='macd'

initPortf(portfolio.st,symbols=stock.str)
initAcct(account.st,portfolios=portfolio.st,initEq = initEq)
initOrders(portfolio=portfolio.st)

strategy.st<-portfolio.st
# define the strategy
strategy(strategy.st, store=TRUE)
## get data 
getSymbols(stock.str,
           from=startDate,
           adjust=TRUE,
           src='tiingo')

 

Evaluating the Strategy

Test the System in Pieces | How to Screw Up Less

Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise. - John Tukey [-@Tukey1962] p. 13

 

Fail quickly, think deeply, or both?

 

No matter how beautiful your theory, no matter how clever you are or what your name is, if it disagrees with experiment, it’s wrong. - Richard P. Feynman [-@Feynman1965]

Add the indicator { .columns-2 .smaller }

#MA parameters for MACD
fastMA = 12 
slowMA = 26 
signalMA = 9
maType="EMA"

#one indicator
add.indicator(strategy.st, name = "MACD", 
                  arguments = list(x=quote(Cl(mktdata)),
                                   nFast=fastMA, 
                                   nSlow=slowMA),
                  label='_' 
)

 

 

 

MACD is a two-moving-average crossover system that seeks to measure:

Classical technical analysis, shown as an example only; it is not widely deployed in production.
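
The calculation behind the indicator can be sketched in base R. This is a minimal illustration, not the TTR implementation: `ema()` and `macd_sketch()` are hypothetical helpers, the EMA is seeded with the first observation, and the MACD line is the plain difference of the two averages (TTR's `MACD()` may report a percentage difference by default). The prices are simulated.

```r
# Minimal base-R MACD sketch (illustrative; not the TTR implementation).
# ema() is a hypothetical helper seeded with the first observation.
ema <- function(x, n) {
  alpha <- 2 / (n + 1)              # conventional EMA smoothing factor
  out <- numeric(length(x))
  out[1] <- x[1]
  for (i in seq_along(x)[-1]) {
    out[i] <- alpha * x[i] + (1 - alpha) * out[i - 1]
  }
  out
}

macd_sketch <- function(price, nFast = 12, nSlow = 26, nSig = 9) {
  macd   <- ema(price, nFast) - ema(price, nSlow)  # fast average minus slow average
  signal <- ema(macd, nSig)                        # smoothed MACD line
  data.frame(macd = macd, signal = signal)
}

set.seed(42)
price <- 100 + cumsum(rnorm(250))   # simulated prices, not market data
m <- macd_sketch(price)
tail(m, 3)
```

In a steadily rising series the fast average sits above the slow one, so the MACD line is positive; in a flat series both lines are zero.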

Measuring Indicators { .smaller }

What do you think you're measuring? A good indicator measures something in the market:

Make sure the indicator is testable:

*If your indicator doesn't have testable information content, throw it out and start over.*

Specifying tests for Indicators

General Diagnostics for Indicators

Add the Signals { .columns-2 .smaller }

#two signals
add.signal(strategy.st,
           name="sigThreshold",
           arguments = list(column="signal._",
                            relationship="gt",
                            threshold=0,
                            cross=TRUE),
           label="signal.gt.zero"
)

add.signal(strategy.st,
           name="sigThreshold",
           arguments = list(column="signal._",
                            relationship="lt",
                            threshold=0,
                            cross=TRUE),
           label="signal.lt.zero"
)
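
What a threshold signal with `cross=TRUE` is meant to capture can be sketched in base R: the signal fires only on the bar where the series moves across the threshold, not on every bar where it sits beyond it. `cross_threshold()` is a hypothetical helper written for this illustration and is not quantstrat's `sigThreshold` code.

```r
# Sketch of a "cross above/below threshold" detector in base R.
# cross_threshold() is a hypothetical helper, not a quantstrat function.
cross_threshold <- function(x, threshold = 0, relationship = c("gt", "lt")) {
  relationship <- match.arg(relationship)
  state <- if (relationship == "gt") x > threshold else x < threshold
  prev  <- c(state[1], head(state, -1))  # state on the previous bar; first bar never fires
  state & !prev                          # TRUE only where the state switches on
}

x <- c(-1, -0.5, 0.2, 0.4, -0.1, -0.3, 0.6)
which(cross_threshold(x, 0, "gt"))  # bars where x crosses above zero: 3 and 7
which(cross_threshold(x, 0, "lt"))  # bar where x crosses below zero: 5
```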

Combining Signals

Signals are often combined:

"$A$ *&* $B$" should both be true.

This is a composite signal, and serves to reduce the dimensionality of the decision space.

A lower-dimensional space is easier to measure, but is at higher risk of overfitting.

Avoid overfitting while combining signals by making sure that your process has a strong economic or theoretical basis before writing code or running tests.
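
A small base-R sketch of the "$A$ & $B$" case, using two simulated signals (the firing rates are arbitrary assumptions). It shows one practical side of the trade-off: the composite can never fire more often than either component, so every added condition leaves fewer observations to measure.

```r
# Sketch: combining two signals into one composite signal (A & B).
set.seed(1)
n <- 500
A <- runif(n) < 0.30   # hypothetical signal A
B <- runif(n) < 0.40   # hypothetical signal B
composite <- A & B     # fires only when both A and B are TRUE

# Count how often each signal fires
c(A = sum(A), B = sum(B), composite = sum(composite))
```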

Measuring Signals

Signals make predictions, so all the literature on forecasting is applicable:

add.distribution(strategy.st,
                 paramset.label = 'signal_analysis',
                 component.type = 'indicator',
                 component.label = '_', 
                 variable = list(n = fastMA),
                 label = 'nFAST'
)

add.distribution(strategy.st,
                 paramset.label = 'signal_analysis',
                 component.type = 'indicator',
                 component.label = '_', 
                 variable = list(n = slowMA),
                 label = 'nSLOW'
)

Run Signal Analysis Study {.smaller .columns-2}

sa_buy <- apply.paramset.signal.analysis(
            strategy.st, 
            paramset.label='signal_analysis', 
            portfolio.st=portfolio.st, 
            sigcol = 'signal.gt.zero',
            sigval = 1,
            on=NULL,
            forward.days=50,
            cum.sum=TRUE,
            include.day.of.signal=FALSE,
            obj.fun=signal.obj.slope,
            decreasing=TRUE,
            verbose=TRUE)
sa_sell <- apply.paramset.signal.analysis(
             strategy.st, 
             paramset.label='signal_analysis', 
             portfolio.st=portfolio.st, 
             sigcol = 'signal.lt.zero',
             sigval = 1,
             on=NULL,
             forward.days=10,
             cum.sum=TRUE,
             include.day.of.signal=FALSE,
             obj.fun=signal.obj.slope,
             decreasing=TRUE,
             verbose=TRUE)
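
The idea behind a forward-return study can be sketched in base R: for each bar where the signal fires, collect the cumulative return over the following `forward.days` bars, excluding the signal bar itself. `forward_returns()` is a hypothetical helper and the data are simulated; this is not the code inside `apply.paramset.signal.analysis()`.

```r
# Sketch of a forward-return study in base R (simulated data).
# forward_returns() is a hypothetical helper, not part of quantstrat.
forward_returns <- function(ret, sig, forward.days = 10) {
  idx <- which(sig)
  idx <- idx[idx + forward.days <= length(ret)]  # keep signals with a full window
  paths <- vapply(idx,
                  function(i) cumsum(ret[(i + 1):(i + forward.days)]),
                  numeric(forward.days))
  matrix(paths, nrow = forward.days)  # rows: days after signal, columns: signals
}

set.seed(7)
ret <- rnorm(1000, mean = 0, sd = 0.01)  # simulated daily returns
sig <- seq_along(ret) %% 50 == 0         # a signal every 50th bar, for illustration
fr  <- forward_returns(ret, sig, forward.days = 10)
dim(fr)        # forward.days rows, one column per usable signal
rowMeans(fr)   # average cumulative return path after a signal
```

With random returns and an arbitrary signal, the average path should show no consistent slope, which is the baseline a real signal has to beat.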

Look at Buy Signal

signal.plot(sa_buy$sigret.by.asset$EEM)

Look at Buy Signal (cont.)

beanplot.signals(sa_buy$sigret.by.paramset$paramset.12.26)

Look at Buy Signal (cont.)

distributional.boxplot(sa_buy$sigret.by.paramset$paramset.12.26$EEM)

Look at Sell Signal

signal.plot(sa_sell$sigret.by.asset$EEM)

Add the Rules { .columns-2 .smaller }

# entry
add.rule(strategy.st,
         name='ruleSignal', 
         arguments = list(sigcol="signal.gt.zero",
                          sigval=TRUE, 
                          orderqty=100, 
                          ordertype='market', 
                          orderside='long', 
                          threshold=NULL),
         type='enter',
         label='enter',
         storefun=FALSE
)
# exit
add.rule(strategy.st,name='ruleSignal', 
         arguments = list(sigcol="signal.lt.zero",
                          sigval=TRUE, 
                          orderqty='all', 
                          ordertype='market', 
                          orderside='long', 
                          threshold=NULL,
                          orderset='exit2'),
         type='exit',
         label='exit'
)

Measuring Rules { .smaller }

If your signal process doesn't have predictive power, stop now.

 

Beware of Rule Burden:

Run the Strategy

start_t<-Sys.time()
out<-applyStrategy(strategy.st , 
                   portfolios=portfolio.st,
                   parameters=list(nFast=fastMA, 
                                   nSlow=slowMA,
                                   nSig=signalMA,
                                   maType=maType),
                   verbose=TRUE)
end_t<-Sys.time()

start_pt<-Sys.time()
updatePortf(Portfolio=portfolio.st)
end_pt<-Sys.time()
print("Running the backtest (applyStrategy):")
print(end_t-start_t)

print("trade blotter portfolio update (updatePortf):")
print(end_pt-start_pt)

Initial Results

chart.Posn(Portfolio=portfolio.st,Symbol=stock.str)
plot(add_MACD(fast=fastMA, slow=slowMA, signal=signalMA,maType="EMA"))

Parameter Optimization

Parameter Optimization

*Every trading system is in some form an optimization.* - @Tomasini2009

 

What are good parameters?

Limiting the number of parameters

Too Many Free Parameters

Moving from Free to Non-Free Parameters

Robust Parameters

quantstrat Parameters

quantstrat::add.distribution() { .smaller }

.FastMA = (3:15)
.SlowMA = (20:60)
# .nsamples = 200 
#for random parameter sampling, 
# less important if you're using doParallel or doMC

### MA paramset
add.distribution(strategy.st,
                 paramset.label = 'MA',
                 component.type = 'indicator',
                 component.label = '_', #this is the label given to the indicator in the strat
                 variable = list(n = .FastMA),
                 label = 'nFAST'
)

add.distribution(strategy.st,
                 paramset.label = 'MA',
                 component.type = 'indicator',
                 component.label = '_', #this is the label given to the indicator in the strat
                 variable = list(n = .SlowMA),
                 label = 'nSLOW'
)

add.distribution.constraint(strategy.st,
                            paramset.label = 'MA',
                            distribution.label.1 = 'nFAST',
                            distribution.label.2 = 'nSLOW',
                            operator = '<',
                            label = 'MA'
)
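
The size of the search implied by these two distributions and the constraint can be checked in base R before running anything. This sketch only enumerates the grid; it does not call quantstrat.

```r
# Sketch: enumerate the parameter grid implied by the two distributions
# and the nFAST < nSLOW constraint.
grid <- expand.grid(nFAST = 3:15, nSLOW = 20:60)
grid <- grid[grid$nFAST < grid$nSLOW, ]
nrow(grid)   # 13 * 41 = 533 parameter combinations
```

Because the two ranges do not overlap, the constraint removes nothing here; it would start to matter if the fast range were widened past 20.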

quantstrat::apply.paramset() {.smaller }

.paramaudit <- new.env()
ps_start <- Sys.time()
paramset.results  <- apply.paramset(strategy.st, 
                           paramset.label='MA', 
                           portfolio.st=portfolio.st, 
                           account.st=account.st, 
#                          nsamples=.nsamples,
                           audit=.paramaudit,
                           store=TRUE,
                           verbose=FALSE)
ps_end   <- Sys.time()

paramset results {.smaller }

cat("Running the parameter search (apply.paramset): \n ")
print(ps_end-ps_start)
cat("Total trials:",.strategy$macd$trials,"\n")
plot(paramset.results$cumPL[-1,], major.ticks = 'years', grid.ticks.on = 'years')

Search Process

Parameter Regions

Parameter distributions - Profit to Max Drawdown

z <- tapply(X=paramset.results$tradeStats$Profit.To.Max.Draw,
            INDEX=list(paramset.results$tradeStats$nFAST,
                       paramset.results$tradeStats$nSLOW),
            FUN=median)
x <- as.numeric(rownames(z))
y <- as.numeric(colnames(z))
filled.contour(x=x,y=y,z=z,color = heat.colors,
               xlab="Fast MA",ylab="Slow MA")
title("Return to MaxDrawdown")

Overfitting

Things to Watch Out For, or, Types of Overfitting

Look Ahead Bias

Data Mining Bias

Data Snooping

NOTE: We just did all three of these things by optimizing over the entire series

Degrees of Freedom

Pardo [-@Pardo2008, pp. 130-131] describes the degrees of freedom of a strategy as:

In parameter optimization, we should consider the sum of observations used by all different parameter combinations.

The goal should be to have 95% or more free parameters, or 'free observations', even after the parameter search.
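
A rough arithmetic sketch of that 95% target in base R. The accounting here is an assumption made for illustration only: each parameter combination is taken to "use" as many observations as the sum of its lookbacks, and the number of bars is hypothetical. It is not a statement of how Pardo or `degrees.of.freedom()` counts.

```r
# Illustrative degrees-of-freedom arithmetic in base R.
# ASSUMPTION: each parameter combination "uses" as many observations as the
# sum of its lookbacks; a simplification for illustration only.
n_obs <- 3500   # hypothetical number of daily bars
nSig  <- 9      # signal lookback, held fixed

# A single parameter combination (12, 26, 9)
used_single     <- 12 + 26 + nSig
pct_free_single <- 100 * (n_obs - used_single) / n_obs

# A full search: sum the observations used across every combination tried
grid         <- expand.grid(nFAST = 3:15, nSLOW = 20:60)
used_sum     <- sum(grid$nFAST + grid$nSLOW + nSig)
pct_free_sum <- 100 * (n_obs - used_sum) / n_obs

c(single = pct_free_single, search = pct_free_sum)
```

Under these assumed numbers a single combination leaves well over 95% of the observations free, while summing over the whole grid uses more observations than the series contains.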

Applying Degrees of Freedom calculation

degrees.of.freedom(strategy = 'macd', portfolios = 'macd', paramset.method='trial')
degrees.of.freedom(strategy = 'macd', portfolios = 'macd', paramset.method='sum')

Implications for Torture and Training sets

Multiple testing bias

Investment theory, not computational power, should motivate what experiments are worth conducting. [@Bailey2014deSharpe, p. 10]

 

Deflated Sharpe

@Bailey2014deSharpe describes a way of adjusting the observed Sharpe Ratio of a candidate strategy by taking the variance of the trials and the skewness and kurtosis into account.
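
A base-R sketch of the adjustment, following the formula as given by @Bailey2014deSharpe to the best of my reading. `deflated_sharpe()` is a hypothetical helper, the Sharpe ratio is per period (not annualized), and the inputs are made up; whether `SharpeRatio.deflated()` computes exactly this is not confirmed here.

```r
# Sketch of the Deflated Sharpe Ratio calculation in base R.
# deflated_sharpe() is a hypothetical helper; the inputs below are made up.
deflated_sharpe <- function(sr, sr_var, n_trials, n_obs, skew = 0, kurt = 3) {
  stopifnot(n_trials >= 2, n_obs >= 2)
  euler <- 0.5772156649   # Euler-Mascheroni constant
  # expected maximum Sharpe ratio across n_trials trials with no skill
  sr0 <- sqrt(sr_var) * ((1 - euler) * qnorm(1 - 1 / n_trials) +
                         euler * qnorm(1 - 1 / (n_trials * exp(1))))
  # test statistic, adjusted for skewness and kurtosis of returns
  z <- (sr - sr0) * sqrt(n_obs - 1) /
       sqrt(1 - skew * sr + (kurt - 1) / 4 * sr^2)
  pnorm(z)   # probability that the true Sharpe ratio exceeds sr0
}

# Same observed Sharpe ratio, different numbers of trials
few  <- deflated_sharpe(sr = 0.1, sr_var = 0.0025, n_trials = 10,  n_obs = 1250)
many <- deflated_sharpe(sr = 0.1, sr_var = 0.0025, n_trials = 533, n_obs = 1250)
c(few = few, many = many)
```

The more trials were run to find the candidate, the higher the hurdle `sr0` and the lower the deflated value for the same observed Sharpe ratio.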

Applying Deflated Sharpe Ratio

dsr <- SharpeRatio.deflated(portfolios='macd',strategy='macd', audit=.paramaudit)
kable(dsr)

Haircut Sharpe Ratio

Applying the Sharpe Ratio Haircut {.smaller}

hsr <- SharpeRatio.haircut(portfolios='macd',strategy='macd',audit=.paramaudit)
print(hsr)

Monte Carlo and the bootstrap | Sampling from limited information

History of Monte Carlo and bootstrap simulation

Simulation from the equity curve using daily P&L { .tiny }

Sampling Without replacement:

Sampling With Replacement:

Disadvantages of Sampling from portfolio P&L:

Empirical Example, with replacement {.columns-2}

rsim <- mcsim(  Portfolio = "macd"
               , Account = "macd"
               , n=1000
               , replacement=TRUE
               , l=1, gap=10)
rblocksim <-  mcsim(  Portfolio = "macd"
               , Account = "macd"
               , n=1000
               , replacement=TRUE
               , l=10, gap=10)
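
The difference between `l=1` and `l=10` is the block length: sampling single days destroys any autocorrelation in the P&L, while sampling blocks preserves it within each block. A base-R sketch of block resampling with replacement follows; `block_resample()` and `max_drawdown()` are hypothetical helpers, the P&L is simulated, and this is not the `mcsim()` implementation.

```r
# Sketch of resampling daily P&L in blocks, with replacement (base R).
# block_resample() is a hypothetical helper, not the mcsim() implementation.
block_resample <- function(pl, l = 10) {
  n <- length(pl)
  stopifnot(n >= l)
  n_blocks <- ceiling(n / l)
  starts <- sample.int(n - l + 1, n_blocks, replace = TRUE)  # random block starts
  idx <- unlist(lapply(starts, function(s) s:(s + l - 1)))
  pl[idx[seq_len(n)]]                                        # trim to original length
}

max_drawdown <- function(pl) {
  equity <- c(0, cumsum(pl))
  max(cummax(equity) - equity)   # largest peak-to-trough decline
}

set.seed(123)
pl   <- rnorm(500, mean = 5, sd = 100)   # simulated daily P&L, not backtest output
sims <- replicate(200, block_resample(pl, l = 10))
dim(sims)                                # one column per replicate
quantile(colSums(sims))                  # distribution of total P&L
quantile(apply(sims, 2, max_drawdown))   # distribution of max drawdown
```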

P&L Quantiles:

pander(quantile(rsim))
pander(quantile(rblocksim))

Empirical Example, With replacement, cont. {.columns-2}

plot(rsim)
lines (cumsum(dailyEqPL('macd.241',Symbols = stock.str, envir=.paramaudit)),on = 0,col = 'blue')
hist(rsim, cex=.5, methods='maxDD')
plot(rblocksim)
hist(rblocksim, methods='maxDD')

INSERT CSCV/PBO HERE

rm(.paramaudit, paramset.results, ps_start, ps_end)

Other bootstrapping methods

Simulation with round turn trades

Dis/advantages of bootstrapping trades

Disadvantages:

Advantages:

Outline of Trade Resampling Process { .smaller }

Extract Stylized Facts from the observed series:

For each replicate:

For the collections of start/qty/duration:

Empirical Example

# nrtxsim <- txnsim( Portfolio = "macd"
#                  , n=250
#                  , replacement=FALSE
#                  , tradeDef = 'increased.to.reduced')

wrtxsim <- txnsim( Portfolio = "macd"
                 , n=250
                 , replacement=TRUE
                 , tradeDef = 'increased.to.reduced')

Comments:

Empirical Example, With replacement {.columns-2}

wrtxsim.pl <- plot(wrtxsim)

P&L Quantiles:

pander(quantile(wrtxsim))
wrplvector<-as.numeric(last(wrtxsim.pl))
btpl<-wrplvector[1]
hist(wrplvector, main='Histogram of Cumulative P&L', breaks=25)
abline(v = btpl, col = "red", lty = 2)
b.label = ("Backtest P&L")
h = rep(0.2 * par("usr")[3] + 1 * par("usr")[4], length(wrplvector))
text(btpl, h, b.label, offset = 0.2, pos = 2, cex = 0.8, srt = 90)
abline(v=median(wrplvector), col = "darkgray", lty = 2)
c.label = ("Sample Median")
text(median(wrplvector), h, c.label, offset = 0.2, pos = 2, cex = 0.8, srt = 90)

Overfitting summary

Walk Forward

Walk Forward { .columns-2 }


Applying Walk Forward

quantstrat::walk.forward { .columns-2 .smaller }

wfportfolio <- "wf.macd"
initPortf(wfportfolio,symbols=stock.str)
initOrders(portfolio=wfportfolio)
wf_start <- Sys.time()
wfaresults <- walk.forward(strategy.st, 
                           paramset.label='MA', 
                           portfolio.st=wfportfolio, 
                           account.st=account.st, 
#                           nsamples=100,
                           period='months',
                           k.training = 48,
                           k.testing = 12,
                           verbose = FALSE,
                           anchored = FALSE,
                           audit.prefix = NULL,
                           savewf = FALSE,
                           include.insamples = TRUE,
                           psgc=TRUE
                          )
wf_end <-Sys.time()
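
How `k.training = 48`, `k.testing = 12` and `anchored = FALSE` carve up the history can be sketched in base R. `wf_windows()` is a hypothetical helper that works on period counts only; it assumes whole testing windows and may differ from how `walk.forward()` handles the final, partial window.

```r
# Sketch of rolling (non-anchored) walk-forward windows in base R.
# wf_windows() is a hypothetical helper, not quantstrat's walk.forward().
wf_windows <- function(n_periods, k.training = 48, k.testing = 12,
                       anchored = FALSE) {
  stopifnot(n_periods >= k.training + k.testing)
  starts <- seq(1, n_periods - k.training - k.testing + 1, by = k.testing)
  data.frame(
    train.start = if (anchored) 1 else starts,  # anchored windows always start at 1
    train.end   = starts + k.training - 1,
    test.start  = starts + k.training,
    test.end    = starts + k.training + k.testing - 1
  )
}

wf_windows(n_periods = 120)   # e.g. ten years of monthly periods: 6 windows
```

Each training window is followed immediately by its out-of-sample testing window, and the whole pair then rolls forward by one testing period.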

Walk Forward Results

cat("\n Running the walk forward search: \n ")
print(wf_end-wf_start)
cat(" Total trials:",.strategy$macd$trials,"\n")
kable(wfaresults$tradeStats)

Walk Forward Results (cont.)

chart.forward(wfaresults)

ADD WFA OOS STATS HERE

kable(wfaresults$testing.parameters[c(3,1,2)])

Risk of Ruin

Strong hypotheses guard against risk of ruin.

I hypothesize that this strategy idea will make money.

Specifying hypotheses at the beginning reduces the urge to modify them later and:

- adjust expectations while testing
- revise the objectives
- construct ad hoc hypotheses

Seek to answer what and why before going too far.

Future Work

*We always have more work than time, so please talk to us if you want to work on these.*

Conclusion

{ .smaller }

*Thank You for Your Attention*

 

Thanks to all the contributors to quantstrat and blotter, especially Ross Bennett, Peter Carl, Jasen Mackie, Joshua Ulrich, my team, and my family, who make it possible.

©2018 Brian G. Peterson brian\@braverock.com

This work is licensed under a Creative Commons Attribution 4.0 International License

The rmarkdown [@Rmarkdown] source code for this document may be found on GitHub

si<-sessionInfo()
cat( 'prepared using blotter:',si$otherPkgs$blotter$Version
    ,' and quantstrat:', si$otherPkgs$quantstrat$Version,'\n')

All views expressed in this presentation are those of Brian Peterson, and do not necessarily reflect the opinions, policies, or practices of Brian's employers.

All remaining errors or omissions should be attributed to the author.

References

References {.smaller}



braverock/quantstrat documentation built on Sept. 15, 2023, 11:32 a.m.