Metrics are a useful tool in numerically comparing strategies. This document lays out a set of metrics useful for comprehensive strategy evaluation, as well as a framework for statistical evaluation. The approach will be primarily time-series based, which is a departure from traditional analysis involving per-trade metrics. An equity curve time-series analysis provides richer and more generalizable inference.
Let $P_t$ denote a random variable describing the price of an asset at time $t$. The collection ${P_t:t\in \mathcal{T}} \equiv {P}_\mathcal{T}$ forms a random time series over some time domain $\mathcal{T}$.
Consider a strategy variable which converts a price time series into a series of signals or states, which may depend on other parameters including other indicator series, stop losses, etc. It is common to map long signals to positive one, short signals to negative one, and all others to zero. Let this mapped variable be given by $S_t$.
Let $R_t$ denote a (daily) return random variable where $R_t = \frac{P_{t}-P_{t-1}}{P_{t-1}}$. Similarly, let $LR_t = \log(\frac{P_t}{P_{t-1}})$ denote a log-return random variable.
Let $SR_t$ denote a strategy return where $SR_t = R_t\times S_t$, which denotes the returns made in a period using a given strategy. Similarly give the log variant $LSR_t = LR_t\times S_t$
Let $E_t$ denote an equity variable where $E_t = \prod_{s \le t}(1+SR_s)$. Similarly, denote a log-equity variable $LE_t = \sum_{s \le t}LSR_s$. In settings where $E_t$ is used, $\exp(LE_t)$ may be substituted.
Many metrics are enhanced when properly tested. Specifically, it is valuable to test metrics against a strategy which does not depend on price or indicator information, a so-called random strategy, but otherwise behaves similarly in terms of signal frequency. A test is performed using the following steps:
This process helps isolate the predictiveness of a strategy from other factors such as market conditions. Additionally, if simultaneously assessing multiple strategies and isolating the best performing one, this procedure can be replicated to accommodate for testing maximums.
These metrics assess general profitability of a strategy and are closely related to "per-trade" metrics, though they retain more use as testable quantities.
$$ {\tt E}(SR_t) \approx \frac{1}{n}\sum sr_t $$ Comparison to zero is insufficient, but necessary. Testing this quantity is highly suggested as results may be misleading otherwise.
$$ \exp({\tt E}(LSR_t)) \approx \exp\left(\frac{1}{n}\sum lsr_t \right) = \left(\prod\exp(lsr_t)\right)^\frac{1}{n} $$ This will not be the same as the previously computed average return. This quantity preserves desirable symmetry results and may be more reliable when handling compound strategies over large time frames.
If $T$ denotes the last observed time, the total return is $E_T$. On its own, this value is highly unreliable, however testing will reveal underlying significance. Additionally, using log equity may be more reliable.
$$ {\tt E}^+(SR_t) \approx \frac{1}{n^+}\sum sr_t^+ $$ This represents the expected upside given positive return. The notation denotes the average of only the positive returns and the log return equivalent is easily derived. The expected negative is similar:
$$ {\tt E}^-(SR_t) \approx \frac{1}{n^-}\sum sr_t^- $$
Comparing magnitudes of these quantities gives some sense of skewness in returns.
These metrics quantify relative strategy properties, forming a more complete picture of strategy performance.
$$ \frac{{\tt Pr}(SR_t > 0)}{{\tt Pr}(SR_t < 0)} \approx \frac{\sum I(sr_t > 0)}{\sum I(sr_t < 0)} $$ Measures number of wins per loss. $I(\cdot)$ denotes an indicator function. Can be used in conjunction with ratio of partial expected returns. Note the inequality strictness.
$$ {\tt Pr}(SR_t < 0 | R_t > 0) \approx \frac{\sum I(sr_t < 0, r_t > 0)}{\sum I(r_t > 0)};\quad {\tt Pr}(SR_t < 0 | R_t < 0) \approx \frac{\sum I(sr_t < 0, r_t < 0)}{\sum I(r_t < 0)} $$
Measures the percent of incorrect strategy periods, both positive and negative. In both cases the result is a negative strategy return.
$$ {\tt Pr}(SR_t > 0 | R_t > 0) \approx \frac{\sum I(sr_t > 0, r_t > 0)}{\sum I(r_t > 0)};\quad {\tt Pr}(SR_t > 0 | R_t < 0) \approx \frac{\sum I(sr_t > 0, r_t < 0)}{\sum I(r_t < 0)} $$
Measures percent of correct strategy periods, both positive and negative. In both cases the result is a positive strategy return.
$$ {\tt Pr}(SR_t = 0 | R_t > 0) \approx \frac{\sum I(sr_t = 0, r_t > 0)}{\sum I(r_t > 0)};\quad {\tt Pr}(SR_t = 0 | R_t < 0) \approx \frac{\sum I(sr_t = 0, r_t < 0)}{\sum I(r_t < 0)} $$
Measures the percent of upside and downside movements missed through not entering a trade.
These metrics attempt to capture extreme equity curve behavior. Testing these quantities can be done, but interpretation must be done carefully.
$$ {\tt SD}(SR_t) \approx \sqrt{\frac{1}{n-1}\sum(sr_t - \bar{sr})^2} $$ Measures per-period volatility in price. Can be used to determine levels of risk and reward in a single period.
$$ {\tt VaR}_{0.05}(SR_t) \equiv 0.05 = {\tt Pr}(SR_t < {\tt VaR}) \approx \frac{1}{n}\sum I(sr_t < {\tt VaR}) $$
Value defined by an extreme percentile and represents a threshold for extremity.
$$ {\tt TVar}_{0.05}(SR_t) = {\tt E}(SR_t|SR_t < {\tt VaR}) \approx \frac{\sum sr_t\times I(sr_t < {\tt VaR})}{\sum I(sr_t < {\tt VaR})} $$
An extension of the value at risk. Measures expected behavior given the extremity threshold is breached. Note that the inequality reverses when handling reward measures.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.