library(ragtop)
library(futile.logger)
library(ggplot2)
library(reshape2)
library(stringr)
library(MASS)
flog.threshold(ERROR)
flog.threshold(ERROR, name='ragtop.implicit.timestep.construct_tridiagonals')
flog.threshold(ERROR, name='ragtop.calibration.implied_volatility.lowprice')
flog.threshold(ERROR, name='ragtop.calibration.implied_volatility_with_term_struct')
flog.threshold(ERROR, name='ragtop.implicit.setup.width')

knitr::opts_chunk$set(fig.width=6.5, fig.height=4, fig.path='Figs/',
                      echo=FALSE, warning=FALSE, message=FALSE, comment=FALSE)
# PALETTES: see http://www.r-bloggers.com/the-paul-tol-21-color-salute/
# FOCUS PALETTES
# Red as highlight
redfocus = c("#CB181D", "#252525", "#525252", "#737373", "#969696", "#BDBDBD", "#D9D9D9", "#F0F0F0")

# Green as highlight
greenfocus = c("#41AB5D", "#252525", "#525252", "#737373", "#969696", "#BDBDBD", "#D9D9D9", "#F0F0F0")

# Blue as highlight
bluefocus = c("#0033FF", "#252525", "#525252", "#737373", "#969696", "#BDBDBD", "#D9D9D9", "#F0F0F0")

# EQUAL WEIGHT
# Generated with rainbow(12, s = 0.6, v = 0.75)
rainbow12equal = c("#BF4D4D", "#BF864D", "#BFBF4D", "#86BF4D", "#4DBF4D", "#4DBF86", "#4DBFBF", "#4D86BF", "#4D4DBF", "#864DBF", "#BF4DBF", "#BF4D86")
rainbow10equal = c("#BF4D4D", "#BF914D", "#A8BF4D", "#63BF4D", "#4DBF7A", "#4DBFBF", "#4D7ABF", "#634DBF", "#A84DBF", "#BF4D91")
rainbow8equal = c("#BF4D4D", "#BFA34D", "#86BF4D", "#4DBF69", "#4DBFBF", "#4D69BF", "#864DBF", "#BF4DA3")
rainbow6equal = c("#BF4D4D", "#BFBF4D", "#4DBF4D", "#4DBFBF", "#4D4DBF", "#BF4DBF")

# Generated with package "gplots" function rich.colors(12)
rich12equal = c("#000040", "#000093", "#0020E9", "#0076FF", "#00B8C2", "#04E466", "#49FB25", "#E7FD09", "#FEEA02", "#FFC200", "#FF8500", "#FF3300")
rich10equal = c("#000041", "#0000A9", "#0049FF", "#00A4DE", "#03E070", "#5DFC21", "#F6F905", "#FFD701", "#FF9500", "#FF3300")
rich8equal = c("#000041", "#0000CB", "#0081FF", "#02DA81", "#80FE1A", "#FDEE02", "#FFAB00", "#FF3300")
rich6equal = c("#000043", "#0033FF", "#01CCA4", "#BAFF12", "#FFCC00", "#FF3300")

# Generated with package "fields" function tim.colors(12), which is said to emulate the default matlab colorset
tim12equal = c("#00008F", "#0000EA", "#0047FF", "#00A2FF", "#00FEFF", "#5AFFA5", "#B5FF4A", "#FFED00", "#FF9200", "#FF3700", "#DB0000", "#800000")
tim10equal = c("#00008F", "#0000FF", "#0070FF", "#00DFFF", "#50FFAF", "#BFFF40", "#FFCF00", "#FF6000", "#EF0000", "#800000")
tim8equal = c("#00008F", "#0020FF", "#00AFFF", "#40FFBF", "#CFFF30", "#FF9F00", "#FF1000", "#800000")
tim6equal = c("#00008F", "#005AFF", "#23FFDC", "#ECFF13", "#FF4A00", "#800000")

# Generated with sort(brewer.pal(8,"Dark2")) #Dark2, Set2
dark8equal = c("#1B9E77", "#666666", "#66A61E", "#7570B3", "#A6761D", "#D95F02", "#E6AB02", "#E7298A")
dark6equal = c("#1B9E77", "#66A61E", "#7570B3", "#D95F02", "#E6AB02", "#E7298A")
set8equal = c("#66C2A5", "#8DA0CB", "#A6D854", "#B3B3B3", "#E5C494", "#E78AC3", "#FC8D62", "#FFD92F")
set6equal = c("#66C2A5", "#8DA0CB", "#A6D854", "#E78AC3", "#FC8D62", "#FFD92F")

Introduction

Financial derivatives generally fall into one of just a few classes: fixed income, equities, foreign exchange, commodities, and energy. Pricing models for these derivatives usually involve standard approaches distinct to each of these asset classes. For example, equity derivatives are priced on variants of the famous Black-Scholes model, while tranche protection on mortgage-backed securities uses copulas with marginal Poisson distributions. Convertible bonds are among the few derivative securities that straddle asset classes, and their valuation must be linked to reasonable models of multiple asset types, involving equity valuations, fixed income and sometimes foreign exchange.

A convertible bond is similar to a corporate bond^[A standard corporate bond is often called a straight bond.], promising coupons and notional payments at some known set of future dates, but with a twist. The bond holder, who has effectively lent money to the issuer, can choose to convert the bond into equity (subject to some restrictions), in a varying amount known as the conversion value. The bond value therefore depends on three major processes:

* the issuer's equity price,
* issuer default, and
* interest rates.

Of these processes, the changes in equity value are most important, followed closely by issuer default^[One might wonder how a fixed-income security could be relatively insensitive to stochastic interest rates. Most convertibles are issued in countries with stable economies, so the rates are generally far less variable and far smaller than the credit spreads of the bond issuers.]. We discuss derivative pricing and calibration based on simply linked models of equities and corporate defaults. Though convertible bonds are our most important case, we also consider mandatories, compound options, and portfolios of equity options and credit derivatives on the same underlying company.

Stochastic Model

Our basic stochastic model^[More explicitly we could write $$ \frac{dS_t}{S_t}=(r(t)+h(S_t,t)-q(t)) dt + \sigma(t) dZ - dJ(h(S_t,t)) $$] links equity values $S_t$ with hazard rate or default intensity $h$ $$ \frac{dS_t}{S_t}=(r+h-q) dt + \sigma dZ - dJ $$ where $r$ and $q$ play their usual roles, $h$ is a deterministic function of stock price and time, and $J$ is a Poisson jump process [see @Lin06; @And04; and @WuC06] adapted to the default intensity or hazard rate $h$. This model is a jump-diffusion extension of Black-Scholes, with the jump process $J$ representing default, compensated [@Mer76] by extra drift^[The extra compensatory drift can provide some surprising pricing effects when $h$ is large.] in the equity at rate $h$.

In addition to these "normal" stochastic dynamics, we will assume that on a known future set of dividend dates $T_k$, the stock will pay dividends in the amount $\kappa_k+\pi_k S_{T_k}$, a combination of fixed and proportional values^[In most cases, corporate dividends are "sticky" and likely to be held constant over short time horizons. As time increases, the amount is ever more uncertain, making proportional assignments more reasonable.].

As an important parsimonious example we may take the default intensity $h$ to be given by a shifted power law [@Tak01] $$ h(S,t) = h_c + \alpha(t) S^p $$ where $\alpha(\cdot)$ is a deterministic function of time and the power $p$ is a constant generally between -10 and -1. In practice, the volatility $\sigma$, borrow cost component of $q$ and (less frequently) even the short rate $r$ can often be set to constants, though we do not make that assumption here.

Note that our stochastic model is nominally two-dimensional, but since the jump process is discrete we can nearly always collapse analyses into a single dimension, at considerable computational savings.

Arguments of Intuition

A model is only useful in practice if we use it to tell us something about the world. In order to do so with this model, we require some way of choosing its parameters.

Only some of our parameters are likely to be well-understood and readily specified by practitioners:

* the short rate $r$,
* the holding cost $q$, comprising stock borrow costs and dividend yield, and
* the schedule of discrete dividends $\kappa_k + \pi_k S_{T_k}$.

Our remaining model parameters are unobservable and require more careful consideration. Generally speaking, we require some way of inferring them from observable current or historical market data familiar to practitioners.

Survival Probabilities and Independent Default Intensity $\tilde{h}$

Equity default observations are almost infinitely sparser than equity price observations^[Equity defaults are also significantly sparser than major corporate events such as takeovers, spin-offs and special dividends. Modelers do not generally treat corporate events because their details and effects are a complete modeling quagmire.]. Default models and specifications are correspondingly crude in comparison to Black-Scholes. We think of default intensity (independent of stock price) as being based on a sequence $\{H_i, T_i\}$ such that default time $T_d$ satisfies $$ \text{Prob}( T_d > T_i ) = \exp\left({-H_i T_i}\right). $$ Where possible, the $H_i$ are observed credit default swap (CDS) spreads. Taking $T_0=0$ we then find a step function $\tilde{h}(t)$ with left-discontinuities at $\{T_i\}$ such that $$ H_i T_i = \sum_{j \leq i} \tilde{h}(T_j) (T_j-T_{j-1}). $$

The survival probability to any time $t$ is then given by the Poisson survival function $$ \text{Prob}( T_d > t ) = \exp{\left(-\int_0^t \tilde{h}(s) ds \right)}. $$

In most cases this is trivial, since only one spread observation $\{H_1, T_1=5\,\text{yr}\}$ is available. When CDS spreads are unavailable, it is common practice to approximate $H$ with an historical average from a ratings agency lookup table.
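As a rough illustration, the bootstrap above can be written in a few lines of base R. This is a minimal sketch under the assumptions of the text (piecewise-constant $\tilde{h}$, cumulative intensities $H_i$ quoted at times $T_i$); the function name `bootstrap_hazard` and the example numbers are hypothetical and not part of ragtop.

``` {r hazard_bootstrap_sketch, eval=FALSE}
# Minimal sketch: piecewise-constant hazard rates h_tilde from cumulative
# intensities H_i observed at times T_i, using
#   H_i T_i = sum_{j <= i} h_tilde(T_j) (T_j - T_{j-1})
bootstrap_hazard = function(H, T_obs) {
  cumulative = H * T_obs               # H_i T_i
  increments = diff(c(0, cumulative))  # hazard accumulated on each interval
  widths = diff(c(0, T_obs))           # interval lengths T_j - T_{j-1}
  increments / widths                  # per-interval hazard rates
}
# Hypothetical example: cumulative intensities of 1%, 1.5% and 2% at 1, 3 and 5 years
bootstrap_hazard(c(0.010, 0.015, 0.020), c(1, 3, 5))
```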

For shifted power laws we can match a wide variety of survival probability functions by choosing various intensity modifiers $\alpha(t)$. A particularly important case is the constant-intensity target with rate $h_c$ $$ \text{Prob}( T_d > t ) = e^{-h_c t}. $$ This specification is compatible with a simple extension to the standard Black-Scholes model and the associated Black-Scholes formula for European-exercise options.

Stock Volatility $\sigma$

We generally think of stock volatility in terms of equity option markets. For that reason, we expect to "imply" an appropriate value for $\sigma^{\text{impl}}$ from equity options where possible. However, it is important to keep in mind that $\sigma^{\text{impl}}$ is inevitably different from the default-free $\sigma^{\text{impl}}_{\text{BS}}$ in the Black-Scholes model. The two will be closely related, but since some of the price variability is represented by the jump process $dJ$, we will have $\sigma^{\text{impl}} < \sigma^{\text{impl}}_{\text{BS}}$.

h = seq(0,17,l=100)
s = sapply(h, function(x){
    equivalent_jump_vola_to_bs(
      bs_vola=0.44, 
      time=1.5, 
      const_short_rate=0.02,
      const_default_intensity=x/100)
    })
plot(h, s,
     ylim=c(min(s, na.rm=T), 0.47),
     ylab='Default-free Vola', xlab='Default Intensity (%)', type='l',
     main='Defaultable Volatility Equivalent to\n44% Black-Scholes Volatility')
abline(h=0.44, col='lightgray', lty='dashed')
points(0,0.44,pch=19)

It is well-known that $\sigma^{\text{impl}}_{\text{BS}}$ exhibits skew, in the sense that, empirically, its value is not constant across all options of the same tenor, particularly as we vary the option strike. This can be interpreted as evidence that the Black-Scholes stochastic model is "wrong", which is of course quite reasonable from our point of view, where we favor a different stochastic model.

vrx_stock = 33.38
libor = 0.53
trading_days = 44; calendar_years = 2/12
def_probs=c(0,0.01, 0.02, 0.03, 0.05, 0.07, 0.1, 0.15, 0.2)
vrx_put_bids = c(0.02, 0.06, 0.11, 0.20, 0.34, 0.52, 0.84, 1.30, 1.80, 2.46, 3.25, 4.20, 5.45, 6.60, 8.25, 9.85)
vrx_put_asks = c(0.04, 0.09, 0.19, 0.21, 0.41, 0.63, 0.95, 1.39, 1.86, 2.50, 3.40, 4.35, 5.60, 7.30, 8.95, 10.90)
vrx_put_strikes = seq(2.5,40,by=2.5)
T = trading_days / 252; r = -log(exp(-(libor)*(calendar_years)))/T;
bs_intensity_skew = function(Y){sapply(1:length(vrx_put_asks),
                                       function(X){
                                         implied_volatility(vrx_put_asks[[X]], PUT, vrx_stock, vrx_put_strikes[[X]],
                                                            r, T, default_intensity=Y)
                                         }
                                       )}

unlinked_skews = data.frame(default_intensity=rep(def_probs, each=length(vrx_put_asks)),
                            strike=rep(vrx_put_strikes, times=length(def_probs)),
                            bid=rep(vrx_put_bids, times=length(def_probs)),
                            ask=rep(vrx_put_asks, times=length(def_probs)))
unlinked_skews$bid_volatility = sapply(1:(length(vrx_put_bids)*length(def_probs)),
                                       function(X){
                                         100*implied_volatility(unlinked_skews$bid[[X]], PUT, vrx_stock,
                                                            unlinked_skews$strike[[X]], r, T,
                                                            unlinked_skews$default_intensity[[X]],
                                                            max_vola = 1000)
                                         })
unlinked_skews$ask_volatility = sapply(1:(length(vrx_put_asks)*length(def_probs)),
                                       function(X){
                                         100*implied_volatility(unlinked_skews$ask[[X]], PUT, vrx_stock,
                                                            unlinked_skews$strike[[X]], r, T,
                                                            unlinked_skews$default_intensity[[X]],
                                                            max_vola = 1000)
                                       })
unlinked_skews$default_intensity = unlinked_skews$default_intensity * 100
unlinked_skews$mid_volatility = 0.5*(unlinked_skews$bid_volatility+unlinked_skews$ask_volatility)


ggplot(data=unlinked_skews, aes(y=mid_volatility, x=strike, color=as.factor(default_intensity))) +
  geom_line() +
  labs(title="Valeant Default-Free Volatility Skew\nJune 2016 Options As Of April 18 2016",
       x='Strike', y='Mid Volatility (%)', color="Def. Intensity (%)") +
  geom_vline(xintercept=vrx_stock, color='darkgray') +
  annotate("text", x=vrx_stock-0.75, y=180, label='VRX=33.38',
           angle=90, size=4, color='darkgray')

If we price options of various strikes $K$ in our model using a constant $\sigma$, and then compute the corresponding volatility skew curve of $\sigma^{\text{impl}}_{\text{BS}}(K)$, we can see how much of the empirical volatility skew can be explained by equity-linked jump to default^[We do not argue here that equity-linked default intensity is truly the most important contributor to volatility skew. Skew at short tenors is associated with other kinds of price jumps, and at long tenors is associated with stochastic volatility.]. To the extent the curves are similar, we regard our model as pricing all derivatives in a manner logically consistent with empirical volatility skew.

  skew_for_power = function(p,h=0.05,strikes=c(50,100,200)) {
    library(ragtop)
    library(limSolve)
    library(stringr)
    library(MASS)
    pct2 = function(T,t){exp(-0.02*(T-t))}
    instruments = lapply(strikes,
                         function(x) {
                           EuropeanOption(maturity=3.53, strike=x, callput=PUT,
                                          discount_factor_fcn=pct2,
                                          name=paste0("P", str_pad(x,3,pad="0")))
                         }
    )
    names(instruments) = lapply(instruments, function(x){x$name})
    default_intensity_fcn= function(t, S, ...){0.95*h+0.05*h*(100/S)^p}
    pvs = find_present_value(S0=100, num_time_steps=250, instruments=instruments,
                             const_volatility=0.5, discount_factor_fcn=pct2,
                             default_intensity_fcn=default_intensity_fcn ,
                             structure_constant=2.0,
                             std_devs_width=3.0)
    vols = implied_volatilities(pvs, -1, 100, strikes, 0.02, 3.53)
    vols
  }

  powers = c(0, 1, 2, 3, 5, 7, 10)
  strikes=c(seq(10,49,by=1),seq(50,102,by=2),seq(105,155,by=5),seq(160,300,by=20))

  cl = parallel::makeCluster(max(1, parallel::detectCores()-1))
  parallel::clusterExport(cl,'skew_for_power')
  parallel::clusterExport(cl,'strikes')
  skews = as.data.frame(parallel::parSapply(cl,powers, function(p){skew_for_power(p,strikes=strikes)}))
  parallel::stopCluster(cl)

  names(skews) = as.character(MASS::fractions(powers))
  skews$strike = strikes
  long_skews = melt(skews,id.vars='strike', variable.name='Power', value.name='Volatility')
  pct_skews = long_skews
  pct_skews$Volatility =100*pct_skews$Volatility

  pct_skews$Power = as.factor(pct_skews$Power)
  ggplot(pct_skews, aes(x=strike,y=Volatility, color=Power)) +
    geom_line() + geom_vline(xintercept=100, color='darkgray') +
    labs(title="Volatility Skews By Default Intensity Power\n5% Initial Default Intensity",
         color="Def. Int. Power",
         x="Strike", y="Volatility (%)")

Convertible bonds often have tenors far exceeding those found in liquid option markets. This makes historical volatility estimates^[Historical volatility estimates are usually formed by computing the standard deviation of $\log\left(\frac{S^H_{i}}{S^H_{i-1}}\right)$ for a time series of historical prices $S^H$.][@Bur90] more desirable. Since historical prices are by definition default-free, they are compatible with $\sigma$ and not with $\sigma_{\text{BS}}$. This property of historical volatility estimation for risky entities is generally under-appreciated.
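For reference, the footnote's estimator is just the annualized standard deviation of log returns; a minimal base-R sketch follows. The function name `historical_volatility` and the 252-day annualization are assumptions of this sketch, not package functionality.

``` {r historical_vol_sketch, eval=FALSE}
# Minimal sketch: annualized historical volatility from a daily price series,
# i.e. the standard deviation of log(S_i / S_{i-1}) scaled to one year.
historical_volatility = function(S_hist, periods_per_year = 252) {
  log_returns = diff(log(S_hist))
  sd(log_returns) * sqrt(periods_per_year)
}
```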

Fundamental Derivative Properties

Early Exercise

Some instruments of interest, such as CDS and vanilla options, are amenable to valuation by directly estimating and discounting terminal distributions. However, both convertible bonds and American-exercise equity options give early exercise rights to their holders. In addition, corporate bonds (including converts) often have embedded call options, granting exercise decisions to the issuer. Embedded put options also appear with some frequency.

Note that exercise of a call option is often accompanied by coupon acceleration according to a contractually determined rate curve. This feature is trivial to handle as part of the exercise condition in a model. Call options may also be provisional^[Provisional calls are also known as conditional calls or soft calls.], being exercisable only when the historical stock price path has met certain criteria. This latter condition is substantially more difficult to treat.

Pricing derivatives in the presence of these exercise rights requires dynamic programming in order to solve the corresponding optimal stopping-time problem. This is best performed by forming the companion partial differential equation (PDE), and solving the PDE by discretization methods as presented below.

Tenor $T$

All the derivatives we contemplate here have a tenor or termination date, on which the derivative value will be known exactly. For options and CDS, this is the contract expiration date. For straight and convertible bonds, it is the effective maturity date^[Bonds have an official maturity date that, for tax reasons, is sometimes made irrelevant to bond structure by including calls and puts guaranteed to be exercised by either the issuer or the holder. We also use the term effective maturity date to describe upcoming dates when a call or put is considered so certain that modeling is moot, or when a takeover has been announced.].

Recovery Value $\delta$

In default, equity holders usually receive a negligible award after court proceedings. However, debt holders often receive substantial payments, whose anticipated size must affect our view of the present value of that debt and any derivatives on it (such as CDS). Therefore, we specify a function $\delta$, typically a constant, to represent post-default value. As with historical defaults, historical observations of $\delta$ are distressingly rare, so it is usually chosen according to some long-term average.

Representation As A Partial Differential Equation

Recalling that short rate $r(t)$, holding cost $q(t)$ and volatility $\sigma(t)$ are deterministic, we can apply the usual arguments [@Kar91; @And04] and the Feynman-Kac theorem to our SDE to obtain a PDE satisfied by any derivative of $S$ that lacks cashflows $$ \frac{\partial V}{\partial t} -rV + h(\delta-V) + \left(r-q+h\right)S\frac{\partial V}{\partial S}+\frac12 \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}=0. $$

This backward parabolic PDE is amenable to numerical integration, which we specify in the final section below. For now, let us take it as given that for any derivative with known boundary conditions we are able to form solutions $v^{(m)}_n$ on a grid of times $t^{(m)}, m=0,\dots,M$ and stock prices $S_n, n=-N,\dots,N$. The present value of our derivative is represented by the entry $v^{(M)}_0$.

Boundary Conditions

As discussed above, every derivative provides us with terminal values at expiration or maturity date. This provides us with a vector of values $\vec{v}^{(0)}$ to use as an initial boundary condition. For boundary conditions at the edge of our solution space, where $S \rightarrow 0$ or $S \rightarrow \infty$ we have a choice of boundary conditions, some wiser than others.

Coupons and Dividends

In practice, both bond prices and stock prices are expected to exhibit discontinuities. Bonds yield occasional cashflows in the form of coupons. In order to preserve continuity of value across coupon dates, we simply assume that coupons remain part of the contract, accumulating value at the short rate $r$ and enjoying full recovery in case of default.

Stock dividends are somewhat trickier to handle. Derivative prices should not themselves jump across the expected discontinuity in stock price from time $T_d^-$ just before the dividend to time $T_d^+$ just after it. We need to adjust the grid's relationship between underlying prices and derivative price accordingly, so we use the classic technique of rebasing the grid[@Tav00]. That is to say, we take an interpolation function $\mathcal{I}$ and set $$ V\left(\vec{S}, T^{-}_{d} \right) = \mathcal{I}\left( \vec{S} - \kappa - \pi \vec{S} ; \, \, \vec{S}, \vec{V}\left(\vec{S}, T_d^+\right)\right). $$
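A minimal sketch of this rebasing step, using base R's `approx` for the interpolation $\mathcal{I}$, might look like the following; the function name and its arguments (`S_grid`, `V_after`, `kappa`, `pi_prop`) are hypothetical, and ragtop's own grid code handles this internally.

``` {r dividend_rebase_sketch, eval=FALSE}
# Minimal sketch: rebase derivative values across a dividend date,
#   V(S, T_d^-) = I(S - kappa - pi*S ; S, V(S, T_d^+)),
# using linear interpolation with flat extrapolation at the grid edges.
rebase_for_dividend = function(S_grid, V_after, kappa, pi_prop) {
  S_exdiv = S_grid - kappa - pi_prop * S_grid
  approx(x = S_grid, y = V_after, xout = S_exdiv, rule = 2)$y
}
```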

Dividend Protection

Dividend protection terms, which alter the conversion ratio in proportion to any dividends exceeding some contractually specified threshold, are fairly common. It is easy to demonstrate that dividend-protected bonds can be priced accurately using our usual approach, with dividends respecified never to exceed the protection level. Note, however, that this trick would affect all securities priced on the grid, even ones without dividend protection, so we must be sure to restrict our application of the ceiling to protected instruments only.

Full Calibration

We are now ready to discuss the entire calibration-and-pricing cycle.

Dividends, Rates And Borrow Costs

As discussed above, we use exogenous information sources to decide on these rates. It is worth noting that we can attain extra computational efficiency at negligible cost by assuming dividends eventually take on the form of a continuous yield rather than discrete payments, say after 3 years.

Volatility

If we are using option prices to estimate volatility, a reasonable approach is to begin with a degenerate version of our model in which $h=\tilde{h}$ is a function only of time. Given an at-the-money option of sufficiently long maturity, we then use a root-finder and our PDE solver to estimate $\sigma$.

Local Default Intensity

Begin by choosing a reasonable simple functional form for local hazard rates $h(S,t)$ with one free parameter $\mu(t)$. Take a grid $T^{(m)}$ of future times for which we wish to find parameter values $\mu^{(m)}$. Zero coupon bonds at these maturities have prices $$ Z(T^{(m)})=B(0,T^{(m)})\exp{\left(-\int_0^{T^{(m)}} \tilde{h}(s) ds \right)}. $$ Assume we have already worked out $\mu^{(\ell)}$ for all $\ell<m$. Then (again using a root-finder) we choose $\mu^{(m)}$ such that our PDE properly prices $Z(T^{(m)})$.
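Schematically, the bootstrap loop looks like the sketch below. Here `price_zcb_with_mu` is a hypothetical stand-in for a call to the PDE pricer with the candidate parameter on the current interval and the already-calibrated values on earlier intervals; it is not a ragtop function.

``` {r intensity_bootstrap_sketch, eval=FALSE}
# Schematic sketch: bootstrap mu^{(m)} so the PDE reprices each Z(T^{(m)}).
# price_zcb_with_mu(m, x, mu) is assumed to return the model price of the
# m-th zero coupon bond when the m-th parameter is x and earlier entries of
# mu are held fixed; targets holds the market-implied Z(T^{(m)}) values.
calibrate_mu = function(T_grid, targets, price_zcb_with_mu) {
  mu = numeric(length(T_grid))
  for (m in seq_along(T_grid)) {
    # Root-find on the m-th parameter (assumes a sign change on [0, 5])
    mu[m] = uniroot(function(x) price_zcb_with_mu(m, x, mu) - targets[m],
                    lower = 0, upper = 5)$root
  }
  mu
}
```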

Pricing

We have now fully specified all parameters of our SDE, and are therefore ready to use the PDE solver to find prices and hedge ratios of portfolios of various derivative securities, including convertible bonds, CDS, options and so on.

p=6.0; h=0.10
pct4 = function(T,t=0) {
  exp(-0.04*(T-t))
}
cb = ConvertibleBond(conversion_ratio=3.5, maturity=1.5, notional=100,
                     discount_factor_fcn=pct4, name='Convertible')
S0 = 10
default_intensity_fcn = function(t, S, ...){0.9*h+0.1*h*(S0/S)^p}
if (TRUE) {
  cbgrid = as.data.frame(form_present_value_grid(S0=S0, grid_center=S0,
                                                 instruments=list(Convertible=cb),
                                                 num_time_steps=250,
                                                 default_intensity_fcn=default_intensity_fcn,
                                                 const_volatility = 0.4, discount_factor_fcn=pct4,
                                                 std_devs_width=5))
  unlinkedgrid = as.data.frame(form_present_value_grid(S0=S0, grid_center=S0,
                                                 instruments=list(Unlinked=cb),
                                                 num_time_steps=250,
                                                 const_default_intensity = h,
                                                 const_volatility = 0.4, discount_factor_fcn=pct4,
                                                 std_devs_width=5))
}
present_value = spline(x=cbgrid[,"Underlying"],
                       y=cbgrid[,"Convertible"],
                       xout=S0)$y
straight_bond_value_no_link = cb$notional * exp(-h*cb$maturity) * cb$discount_factor_fcn(cb$maturity,0)

golden_ratio = 0.5*(1+sqrt(5))
target_aspect_ratio = (3.5*S0)/(golden_ratio*cb$notional*1.5)
p = (
  ggplot(cbgrid,
         aes(x=Underlying,y=Convertible)) +
    geom_line(mapping=aes(x=Underlying, y=Unlinked), data=unlinkedgrid, color="green") +
    coord_fixed(ratio=target_aspect_ratio) +
    geom_line() +
    scale_y_continuous(limits=c(0,cb$notional*1.5)) +
    scale_x_continuous(limits=c(0,3.5*S0)) +
    geom_abline(slope=cb$conversion_ratio, color="darkgray", linetype='dotted') +
    annotate("text", x=S0, y=S0*cb$conversion_ratio-5, label="Conversion Value",
             angle=180*atan(cb$conversion_ratio*target_aspect_ratio)/pi,
             size=3, color="darkgray")+
    geom_hline(yintercept=straight_bond_value_no_link,
               color="darkgray", linetype='dashed') +
    annotate("text", x=S0*3, y=straight_bond_value_no_link-5, label="Straight Bond Unlinked",
             size=3, color="darkgray")  +
    geom_point(aes(x=S0,y=present_value), color="red") +
    labs(title="Convertible Bond Value, Equity-Linked Default Model")
)

print(p)

Layers

Though mathematically we express our pricing algorithm in terms of grid values for a single security, in practice we configure our grid so as to price multiple instruments simultaneously. This saves time on computations common to all of the instruments, for example calibration of the local default intensity parameters $\mu^{(m)}$, while also providing simultaneous access to all the instrument prices.

Contract Differencing

Similarly to the way in which control variates are used to improve the constant factor in Monte Carlo estimation error, we can use contract differencing to improve price accuracy for options priced on grids. Consider, for example, pricing an American option and its European equivalent on the grid, obtaining present values $V_A$ and $V_E$. We also enjoy a closed-form variation of the Black-Scholes formula $BS()$ for pricing European options, so we know the magnitude and direction of the error our grid made for the European option to be $$ \text{Err}_E = V_E - BS(E,\dots). $$ Assuming $\text{Err}_A \approx \text{Err}_E$ we can then form a newer, more accurate estimate of American option value as $$ \hat{V}_A = V_A - \text{Err}_E. $$
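As a minimal sketch, assuming the grid values $V_A$, $V_E$ and a closed-form European price (e.g. from `ragtop::blackscholes`) are already in hand; the function name `corrected_american` is hypothetical:

``` {r contract_differencing_sketch, eval=FALSE}
# Minimal sketch of contract differencing: subtract the signed error the grid
# made on the European contract from the grid's American value.
corrected_american = function(V_A, V_E, bs_price) {
  err_E = V_E - bs_price   # grid error on the European option
  V_A - err_E              # assume the American error is approximately the same
}
```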

Exotics

Layers also provide us the ability to price certain exotics, including weakly path-dependent options and compound options. Though the latter only rarely arise as equity exotics, they occur rather frequently as "greenshoe" or over-allotment options in the terms and conditions of convertible bond issues.

To price a greenshoe, we simply use one layer for the bond itself, and then compute greenshoe value by assuming exercise into the convertible at the greenshoe tenor, when advantageous.

Solving The PDE

Change of Variables

For greater stability and superior computational tractability, we change variables $$ \begin{aligned} \tau &= T-t \\ z &= \log(S/K) - \left(\bar{r}-\bar{q}-\tfrac12 \bar{\sigma}^2 \right) \tau \\ f(z,\tau) &= e^{-\bar{r}\, t(\tau)}V\left( S(z),t(\tau) \right) \end{aligned} $$ using an arbitrary constant $K$ on a scale roughly similar to stock price $S$ (often either a contracted strike or the current stock price $S_0$).

Here the bar operator indicates time averages, as in $$ \bar{r} = \frac1t \int_0^{t(\tau)}r(s)ds $$ and $$ \bar{\sigma}^2 = \frac1t \int_0^{t(\tau)}\sigma^2(s)ds . $$

This transforms our PDE into the more tractable equivalent $$ \frac{\partial f}{\partial \tau} + h\frac{\partial f}{\partial z}+\frac12 \sigma^2 \frac{\partial^2 f}{\partial z^2}=0. $$

We elect to enforce Neumann boundary conditions, so we take $$ \lim_{z \rightarrow \pm \infty} \frac{\partial^2 f}{\partial z^2} = 0. $$

Note that the Neumann boundary conditions would be more or less exact at high-$S$ boundaries for a solver operating in $S$ space. For our solver operating in $z$ space, the Neumann boundary conditions fail to capture the (exponential) curvature there. Nevertheless, for grids of reasonable size they work very well.

Finite Differencing the PDE

Assume we have $\vec{f}^{m-1}$, a set of solutions to the PDE for some $\tau=(m-1)\,dt$ at a given set of grid points $z_n = z_0+n\,dz, n=-N,\dots,N$. For convenience we may write this in alternate forms $$ \vec{f}^{m-1} = f\left(\left\{z_n\right\}, \tau_{m-1}\right) = f(\vec{z}, \tau_{m-1}). $$

We wish to find approximate values for $\vec{f}^{m}$. Assume for now we are not subject to exercise boundaries or cashflows. An implicit finite difference approximation to our PDE will then provide $\vec{f}^{m}$ as the solution to the linear set of equations $$ \vec{f}^{m-1} = C \cdot \vec{f}^{m} \otimes \vec{p} $$ for some matrix $C$, where $\vec{p}$ is the vector of survival probabilities $$ p_n = \exp\left( -h(S_n, \tau_m) dt \right). $$ and $\otimes$ indicates the Hadamard product.

Our Neumann boundary conditions ensure $C$ is tridiagonal. Except at boundaries, its diagonal elements take the form $$ c_{n,n} = 1 + \sigma^2 \frac{dt}{dz^2}, $$ the superdiagonal elements are $$ c_{n-1,n} = \frac12 \left( +h \frac{dt}{dz} - \sigma^2 \frac{dt}{dz^2}\right), $$ and the subdiagonals are $$ c_{n,n-1} = \frac12 \left(-h \frac{dt}{dz} - \sigma^2 \frac{dt}{dz^2}\right). $$

At the boundaries, we can ignore the second derivative but the convection terms remain, yielding $$ c_{\pm N,\pm N} = h \frac{dt}{dz} $$ and $$ c_{ -N, -N+1} = c_{ N-1,N} = -h \frac{dt}{dz}. $$

Having found all these elements, we can obtain our updated grid values using the tridiagonal algorithm.
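To make the timestep concrete, here is a minimal base-R sketch of one implicit step under constant $h$ and $\sigma$, using the interior coefficients above; a dense `solve` stands in for the specialized tridiagonal routine, boundary rows are not treated specially, and the function name is hypothetical rather than ragtop's internal code.

``` {r implicit_step_sketch, eval=FALSE}
# Minimal sketch: one implicit timestep f^{m-1} = C (f^m * p), solved for f^m.
# Constant h and sigma; boundary rows reuse the interior coefficients.
implicit_timestep = function(f_prev, h, sigma, dt, dz) {
  n = length(f_prev)
  diag_c  = 1 + sigma^2 * dt / dz^2
  upper_c = 0.5 * ( h * dt / dz - sigma^2 * dt / dz^2)   # c_{n-1,n}
  lower_c = 0.5 * (-h * dt / dz - sigma^2 * dt / dz^2)   # c_{n,n-1}
  C = diag(diag_c, n)
  C[cbind(1:(n - 1), 2:n)] = upper_c   # superdiagonal
  C[cbind(2:n, 1:(n - 1))] = lower_c   # subdiagonal
  p = exp(-h * dt)                     # survival probability over the step
  solve(C, f_prev) / p                 # dense stand-in for the tridiagonal solve
}
```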

Exercise Boundaries

When put and/or call conditions are active, we need to know their exercise boundaries, i.e. stock prices and associated derivative values above or below which exercise is optimal for the option holder. This issue of determining the exercise boundary by dynamic programming is indeed the reason we are using a grid in the first place.

Say exercise values are known on the grid to be $\epsilon_n^m = \epsilon(S_n, t(\tau^{(m)}))$. In theory, we should be solving the linear complementarity problem $$ \left( \vec{f}^{m-1} - C \cdot \vec{f}^{m} \otimes \vec{p}\right) \cdot\left( \vec{f}^{m-1} - \vec{\epsilon}^m\right) = 0 $$ using Gauss-Seidel or policy iteration.

In practice, however, we introduce negligible errors by simply assuming the exercise barrier can be determined directly from $\vec{\epsilon}^m$ and $C^{-1} \cdot \vec{f}^{m-1} \otimes \vec{p}^{-1}$, defining our timestep solution as the element-wise maximum $$ \vec{f}^{m} = \max\left( \vec{\epsilon}^m, \quad C^{-1} \cdot \vec{f}^{m-1} \otimes \vec{p}^{-1} \right). $$
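Continuing the sketch above, the practical approximation is just an element-wise maximum against the exercise values; again this is a hypothetical helper rather than ragtop's internal code.

``` {r exercise_step_sketch, eval=FALSE}
# Minimal sketch: take the continuation values from the implicit step and
# impose the exercise condition element-wise.
implicit_timestep_with_exercise = function(f_prev, epsilon, h, sigma, dt, dz) {
  continuation = implicit_timestep(f_prev, h, sigma, dt, dz)
  pmax(epsilon, continuation)
}
```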

Choosing Grid Parameters

Our implicit scheme has truncation errors leading to estimation errors of size $O(dt) + O(dz^2)$. It is unconditionally stable [@Smi86], but in practice begins to perform poorly when convection terms become large compared to the diagonal, which is to say when $$ h \frac{dt}{dz} \sim 1 + \sigma^2 \frac{dt}{dz^2} $$ or when dominance of the diagonal grows weak $$ \sigma^2 \frac{dt}{dz^2} \gg 1 . $$

For practical reasons, we want to fully specify our grid without being terribly concerned about the functional form of these constraints. We choose a grid width $w$ in terms of standard deviations of $S$, usually about 2 or 3, and a structure constant $y = \frac{dt}{dz^2}$ between $\frac14$ and $4$, and then typically leave these parameters alone.

This leaves us with just one free parameter, namely the timestep count $M$. From this parameter we easily extract $dt$ and $dz$ from the structure constant $y$ and the relation $dt = T/M$. Though we can't quite form a closed-form estimate of the $w$-standard-deviation increase in stock price, we can reasonably estimate it with $$ S_{\text{max}} = Q_{\Phi(w)}(S_T) = \frac{S_0}{B(0,T)} e^{\tilde{h}T} e^{w \sigma \sqrt{T}}. $$ Using our change of variables above we can then find $z_{\text{max}} = z\left(S_{\text{max}}\right)$ and then divide by $dz$ to find $N$.
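A minimal sketch of this bookkeeping, assuming constant $\sigma$ and $\tilde{h}$, taking $K = S_0$ and ignoring the deterministic drift shift in $z$ for brevity; the function name and arguments are hypothetical, with `B0T` denoting the discount factor $B(0,T)$:

``` {r grid_parameter_sketch, eval=FALSE}
# Minimal sketch: derive dt, dz, S_max and the half-width N from the timestep
# count M, the structure constant y = dt/dz^2 and the width w.
grid_parameters = function(M, T_max, sigma, h_tilde, S0, B0T, y = 2.0, w = 3.0) {
  dt = T_max / M
  dz = sqrt(dt / y)
  S_max = (S0 / B0T) * exp(h_tilde * T_max) * exp(w * sigma * sqrt(T_max))
  z_max = log(S_max / S0)   # K = S0, drift shift in z ignored
  list(dt = dt, dz = dz, S_max = S_max, N = ceiling(z_max / dz))
}
```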

Anchoring Timesteps

Coupons, put dates, call dates and dividends occur on dates known in advance. These are sufficiently sparse that it is easier to force "extra" timesteps into our $\tau^{(m)}$ grid than it is to discount them to timestep boundaries in a manner consistent with the rest of the model. Once we have set our dividend date at a timestep boundary, we deal with dividends on the grid using the interpolation scheme described above.

Calibration

Although in principle we could guess at a reasonable form for equity-linked default intensity, it makes more sense to allow the derivatives markets themselves to give us a hint. In particular, at any given time there are several hundred actively quoted options on TSLA. We take a snapshot of these, along with a risk-free yield curve, and attempt to calibrate our model.

Fitting Continuous Process Volatilities

To begin with, we probably want our model to provide a very tight match to at-the-money (ATM) volatilities. Let's say we have decided that we must precisely replicate $K$ such volatilities, or equivalently the prices of $K$ options $O_k, k=1,\dots,K$, in order of (strictly increasing) tenor $T_k$. Assume we have chosen a default intensity function $h(S_t,t)$. We can set the volatility $\sigma(t)$ of our continuous process to be constant $\sigma_1$ from time zero to $T_1$ by applying a root-finder to the pricing algorithm.

Once we have established continuous process volatilities to tenor $T_k$, we can extend to $T_{k+1}$ by applying the root finder to the pricing algorithm versus the constant volatility $\sigma_{k+1}$ between $T_k$ and $T_{k+1}$. Overall, we are able to bootstrap our way into a full variance accumulation function for the continuous process whose associated volatilities are piecewise constant.
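The bootstrap has the same shape as the intensity calibration sketched earlier. Schematically, with `price_option_with_vol` a hypothetical stand-in for repricing the $k$-th option on the PDE grid with the candidate volatility on the newest interval:

``` {r vol_bootstrap_sketch, eval=FALSE}
# Schematic sketch: bootstrap piecewise-constant volatilities so each of the
# K calibration options is repriced exactly, in increasing tenor order.
bootstrap_vols = function(targets, price_option_with_vol) {
  vols = numeric(length(targets))
  for (k in seq_along(targets)) {
    vols[k] = uniroot(function(x) price_option_with_vol(k, x, vols) - targets[k],
                      lower = 0.01, upper = 3.0)$root
  }
  vols
}
```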

Fitting Linked Default Intensity

Now let us choose a functional form for the default intensity function $h(S_t,t)$, with $M$ parameters $\mu_m,m=1,\dots,M$. We run a nonlinear optimizer on them, with an objective function that first fits the variance accumulation function for the continuous process given our particular choice of the $\mu_m$, and then prices all the available market options via our pricing algorithm. Once we have those prices, it is a simple matter to construct a weighted average pricing error as our objective function value.

For best stability and fit quality, it is best to work in volatility space. That is, we convert every price our algorithm generates into a straight Black-Scholes implied volatility. This helps the optimization algorithm deal with what would otherwise be tricky variations of scale in the objective function components. Note that the conversion to Black-Scholes volatilities is, for these purposes, simply an anonymous nonlinear transform of the prices into a space more amenable to fitting.

Calibration Results

Our optimizer yields a set of parameters with the best available weighted average price error, along with a variance accumulation function that precisely replicates the ATM volatilities. We can then use it for analyzing other instruments, or for finding interesting apparent anomalies in the set of options used for calibration.

Example

Consider the problem of pricing a convertible bond de novo. For concreteness, we take the case of Tesla Motors (TSLA), which issued a 0.25% convertible bond maturing in 2019, and assume we would like to price it. We have a set of option prices, including

knitr::kable(TSLAMarket$options[c(200,300,400,500,600, 800),], digits=3)

Running a crude optimization algorithm yields a term structure of continuous process volatilities

cont_proc_vols = data.frame(tenor=c(0.126598173515982, 0.375913242009132, 0.625228310502283, 
                                    0.721118721461187, 1.71837899543379), 
                            volatility=c(0.475587493104041, 0.445656338069887, 
                                         0.441157031667267, 0.430077201499387, 0.413223140495868))
knitr::kable(cont_proc_vols, digits=3)

along with a default intensity function

$$ h(S_t,t) = 0.05 \left( 0.077 +(1-0.077) \left( \frac{S_0}{S} \right)^{1.02} \right) $$

We can then price our convertible bond, finding that it has present value 94.87 percent of par, which is remarkably close to the contemporaneous market quotes.

Performance

Analyzing performance of our basic PDE solver in simple cases shows that convergence is slightly superlinear in timestep size for the simplest default intensity processes

``` {r plot_step_size_convergence, cache=TRUE, fig.width=6, fig.height=4}
euro_zero_strike_call = EuropeanOption(maturity=3.53, strike=0, callput=1, name="CallStrike0")

eqcall = function(n) {
  find_present_value(S0=100, instruments=list(c=euro_zero_strike_call), num_time_steps=n)$c
}

s = c(3,5,15,25,50,100,150,200,300,400,500,600,750,1000)
cs = sapply(s, eqcall)/100
es = (cs - cs[[length(cs)]])[1:(length(s)-1)]
num_steps = s[1:length(es)]
conv.rate.model = lm(log(es)~log(num_steps))

plot(s, (cs - cs[[length(cs)]]), log='xy', col='orange', pch=19,
     xlab='Step Count (log scale)', ylab='Relative Error (log scale)',
     main='Polynomial Convergence In Step Size')
lines(num_steps, exp(predict(conv.rate.model, newdata=list(x=num_steps))), col="gray", lty=4)
```

Grid width is also important for most securities. Here, the simple case boasts superlinear convergence in width due to the narrow tails of the Gaussian distribution.

``` {r plot_width_convergence, cache=TRUE, fig.width=6, fig.height=4}
euro_zero_strike_call = EuropeanOption(maturity=3.53, strike=0, callput=1, name="CallStrike0")

eqwcall =  function(w) {
  find_present_value(S0=100,instruments=list(c=euro_zero_strike_call),
                     num_time_steps=200, std_devs_width=w)$c
}

w=c(0.25,0.5,0.75,1,1.5,2,2.5,3,3.5,4,4.5,5,6)
ws = sapply(w,eqwcall)/100
werr = (ws[[length(ws)]]-ws)[1:(length(w)-1)]
widths = w[1:length(werr)]

plot(widths, werr, log='y',
     col='darkgreen',pch=19,
     xlab='Width In Standard Deviations',
     ylab='Relative Error (log scale)',
     main='Convergence In Grid Width'
)
```

In the case of more complex equity-linked default, and a coupon-bearing convertible bond, we observe relative errors about twice as large, though we still enjoy superlinear convergence.

``` {r plot_cbond_convergence, cache=TRUE, fig.width=6, fig.height=4}
h = 0.1; p = 6
pct2 = function(T,t=0) { exp(-0.02*(T-t)) }
default_intensity_fcn = function(t, S, ...){0.9*h+0.1*h*(S0/S)^p}
coup_times = seq(0.4, 4.4, by=0.5)
maturity = coup_times[[length(coup_times)]]
cb = ConvertibleBond(conversion_ratio=2.5, maturity=maturity, notional=100,
                     coupons=data.frame(payment_time=coup_times,
                                        payment_size=1 + 0 * coup_times),
                     discount_factor_fcn=pct2, name='CB')
S0 = cb$notional / cb$conversion_ratio
cbprice = function(n) {
  find_present_value(S0 = S0, instruments = list(CB=cb), num_time_steps = n,
                     default_intensity_fcn = default_intensity_fcn,
                     discount_factor_fcn = pct2, std_devs_width=4)$CB
}
s = c(3, 5, 8, 15, 25, 50, 100, 150, 200, 300, 400, 500, 600, 750, 1000)
cs = sapply(s, cbprice)
es = ((cs - cs[[length(cs)]])[1:(length(cs)-1)])/cs[[1]]
num_steps = s[1:length(es)]
conv.rate.model = lm(log(es)~log(num_steps))
plot(num_steps, es, log='xy', col='purple', pch=19,
     xlab='Step Count (log scale)', ylab='Relative Error (log scale)',
     main='Convertible Bond, Equity-Linked Default')
text(x=num_steps[[2]], y=es[[length(cs)-3]],
     labels=expression(h==0.09 + 0.01*frac(100^6,S^6)), pos=4)
lines(num_steps, exp(predict(conv.rate.model, newdata=list(x=num_steps))), col="gray", lty=4)
```

These are fairly typical convergence rates for derivatives grid pricing schemes.

References


