ycevo: Estimate yield function

ycevo R Documentation

Estimate yield function

Description

[Experimental]

Nonparametric estimation of discount functions and yield curves at given dates, time-to-maturities, and one additional covariate, usually interest rate.

Usage

ycevo(
  data,
  x,
  span_x = 60,
  hx = NULL,
  tau = NULL,
  ht = NULL,
  tau_p = tau,
  htp = NULL,
  cols = NULL,
  ...
)

estimate_yield(
  data,
  xgrid,
  hx,
  tau,
  ht,
  tau_p = tau,
  htp = ht,
  rgrid = NULL,
  hr = NULL,
  interest = NULL,
  cfp_slist = NULL
)

Arguments

data

Data frame; bond data to estimate discount curve from. See ycevo_data() for an example bond data structure. Minimum required columns are qdate, id, price, tupq, and pdint. The columns can be named differently: see cols.

x

Time grids at which the discount curve is evaluated. Should be specified using the same class of object as the quotation date (qdate) column in data.

span_x

Half of the window size, i.e. the distance from the centre x to the maximum (or the minimum) qdate with non-zero weight under the kernel function, measured as the number of regular intervals between two consecutive qdate values. Ignored if hx is specified. See Details.

hx

Numeric vector of the bandwidth parameter corresponding to each time point x.

tau

Numeric vector that represents time-to-maturities in years at which the discount function and yield curve are estimated for each time point x. See Details.

ht

Numeric vector of the bandwidth parameter corresponding to each time-to-maturities tau. See Details.

tau_p

Numeric vector that represents auxiliary time-to-maturities in years. See Details.

htp

Numeric vector of the bandwidth parameter corresponding to each auxiliary time-to-maturities tau_p. See Details.

cols

<tidy-select> A named list or vector of alternative names of required variables, following the new_name = old_name syntax of dplyr::rename(), where new_name takes one of the five column names required in data. This enables the user to provide data with columns named differently from those required.

...

Specification of an additional covariate, taking the form of var = list(grid, bandwidth), where var is the name of the covariate in data, grid is the values at which the yield curve is estimated, similar to x, and bandwidth is the bandwidth parameter corresponding to each of the grid values, similar to hx.

xgrid

Numeric vector of values between 0 and 1. Time grids over the entire time horizon (percentile) of the data at which the discount function is evaluated.

rgrid

(Optional) Numeric vector of interest rate grids in percentage at which the discount curve is evaluated, e.g. 4.03 means at interest rate of 4.03%.

hr

(Optional) Numeric vector of bandwidth parameter in percentage determining the size of the window in the kernel function that corresponds to each interest rate grid ('rgrid').

interest

(Optional) Numeric vector of daily short term interest rates. The length is the same as the number of quotation dates included in the data, i.e. one interest rate per day.

cfp_slist

(Internal) Experienced users only. A list of matrices, generated by the internal function 'get_cfp_slist'.

Details

Suppose that a bond i has a price p_i at time t with a set of cash payments, say c_1, c_2, \ldots, c_m with a set of corresponding discount values d_1, d_2, \ldots, d_m. In the bond pricing literature, the market price of a bond should reflect the discounted value of cash payments. Thus, we want to minimise

(p_i-\sum^m_{j=1}c_j\times d_j)^2.

For the estimation of d_k(k=1, \ldots, m), solving the first order condition yields

(p_i-\sum^m_{j=1}c_j \times d_j)c_k = 0,

and

\hat{d}_k = \frac{p_i c_k}{c_k^2} - \frac{\sum^m_{j=1, j\neq k}c_k c_j d_j}{c_k^2}.
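
As a quick numerical check of the expression for \hat{d}_k, the sketch below prices a hypothetical three-payment bond from assumed discount values and then recovers each d_k from the price and the remaining discount values. The first order condition follows from differentiating the squared pricing error with respect to d_k, which gives -2 c_k (p_i - \sum_j c_j d_j) = 0. The cash flows, discount values and the helper solve_dk() are illustrative assumptions, not part of the package.

```r
# Hypothetical bond: two coupons of 2.5 and a final payment of 102.5
cash <- c(2.5, 2.5, 102.5)
# Assumed (illustrative) discount values for the three payment dates
disc <- c(0.99, 0.97, 0.95)
# Price implied by exact discounting of the cash payments
price <- sum(cash * disc)

# Recover d_k from the price and the other discount values (j != k)
solve_dk <- function(k, price, cash, disc) {
  others <- sum(cash[k] * cash[-k] * disc[-k])
  price * cash[k] / cash[k]^2 - others / cash[k]^2
}

d_hat <- vapply(seq_along(cash), solve_dk, numeric(1),
                price = price, cash = cash, disc = disc)

# TRUE when every recovered value matches the assumed discount value
print(all.equal(d_hat, disc))
```

The check also makes the first challenge below concrete: each d_k can only be recovered when all the other discount values of the same bond are known.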

There are several challenges. \hat{d}_k depends on the discount values of all the other cash payments of the bond. Our model contains random errors, and our interest lies in the expected value of d(.), where the expected value of the errors is zero. d(.) is an infinite-dimensional function, not a discrete finite-dimensional vector. Cash payments are generally made biannually, so they are sparse along the time-to-maturity axis. Moreover, cash payment schedules vary across bonds.

Let d(\tau, X_t) be the discount function at given covariates X_t (dates x and interest rates rgrid) and given time-to-maturities \tau (tau). Similarly, y(\tau, X_t) is the yield curve at given covariates X_t (dates x and interest rates rgrid) and given time-to-maturities \tau (tau).

We pursue the minimum of the following smoothed sample least squares objective function for any smooth function d(.):

Q(d) = \sum^T_{t=1}\sum^n_{i=1}\int\{p_{it}-\sum^{m_{it}}_{j=1}c_{it}(\tau_{ij})d(s_{ij}, x)\}^2 \sum^{m_{it}}_{k=1}\{K_h(s_{ik}-\tau_{ik})ds_{ik}\}K_h(x-X_t)dx,

where a bond i has a price p_{it} at time t with a set of cash payments c_1, c_2, \ldots, c_m and a set of corresponding discount values d_1, d_2, \ldots, d_m, and K_h(.) = K(./h) is the kernel function with a bandwidth parameter h. The first kernel function is the kernel in space, which gives weight to bonds whose maturities s_{ik} are close to the sequence \tau_{ik}. The second kernel function is the kernel in time and in interest rates, which gives weight to observations whose covariates X_t are close to x. This means that bonds with similar cash flows, traded on contiguous days on which the short term interest rates in the market are similar, are combined for the estimation of the discount function at a point in space, in time, and in "interest rates".
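
To illustrate how the two kernels combine, the sketch below computes the product weight for a few hypothetical cash flows. It assumes an Epanechnikov kernel, a common compactly supported choice. The kernel, the bandwidths, the helper names epan() and k_h(), and all values are illustrative assumptions and are not taken from the package source.

```r
# Epanechnikov kernel (an assumed choice for illustration): zero outside [-1, 1]
epan <- function(u) ifelse(abs(u) <= 1, 0.75 * (1 - u^2), 0)
# K_h(.) = K(./h), following the definition in the text
k_h <- function(u, h) epan(u / h)

# Hypothetical cash flows: time-to-maturity in years and quotation day index
s <- c(0.9, 1.0, 1.4, 3.0)    # maturities of individual cash flows
day <- c(100, 130, 100, 100)  # days on which the bonds were quoted

tau <- 1    # target time-to-maturity
x <- 100    # target day
ht <- 0.5   # bandwidth in maturity (years)
hx <- 20    # bandwidth in time (days)

# Product of the kernel in space (maturity) and the kernel in time
w <- k_h(s - tau, ht) * k_h(day - x, hx)
print(w)
```

Only the first and third cash flows receive positive weight: the second was quoted too far from the target day, and the fourth matures too far from the target time-to-maturity.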

The estimator for the discount function over time to maturity and time is

\hat{d}=\arg\min_d Q(d).

This function provides a data frame of the estimated yield and discount rate at each combination of the provided grids. The estimated yield is transformed from the estimated discount rate.
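
A minimal sketch of a transformation from discount values to yields, assuming continuous compounding, so that d(\tau) = exp(-\tau y(\tau)) and y(\tau) = -log d(\tau) / \tau. This convention and the numbers used are assumptions made for illustration; the package source and the reference are the authority on the exact transformation.

```r
# Hypothetical discount values at three times-to-maturity (in years)
tau <- c(1, 5, 10)
discount <- c(0.97, 0.85, 0.70)

# Continuously compounded yield implied by each discount value (assumed convention)
yield <- -log(discount) / tau

# Inverting the transformation recovers the discount values
discount_back <- exp(-tau * yield)

print(data.frame(tau, discount, yield))
```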

An alternative specification of bandwidth hx is span_x, which provides kernel coverage invariant to the length of data. span_x takes an absolute measure of time depending on the unit of x. The default value is 60. If the data is daily on trading days, i.e., the interval between every two consecutive qdate is one trading day, then the window of the kernel function allows the estimation at each point x to contain information from 60 trading days prior to and after the time point x.
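
The window implied by span_x can be sketched on an index of trading days. The example assumes 250 consecutive trading days and treats the window as inclusive of its end points; whether the boundary days themselves receive non-zero weight depends on the kernel, so the count below is illustrative only.

```r
# Hypothetical daily data: 250 consecutive trading days indexed 1..250
qdate_index <- 1:250
centre <- 125   # position of the evaluation point x
span_x <- 60    # default half-window, in number of trading days

# Days within span_x intervals of the centre on either side
in_window <- abs(qdate_index - centre) <= span_x

# First and last day index inside the window, and the number of days covered
window_range <- range(qdate_index[in_window])
n_days <- sum(in_window)
print(window_range)
print(n_days)
```

Because span_x counts intervals between consecutive qdate values rather than a fraction of the sample, the same 60-day half-window applies whether the data cover one year or ten.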

For more information on the estimation method, please refer to References.

Value

A tibble::tibble object of class ycevo with the following columns.

qdate

The time point that the user specified as x. If the user provides a data frame whose time index column is named differently from qdate (via the cols argument), this column takes the name of that time index column.

.est

A nested column of estimation results containing a tibble::tibble for each qdate. Each tibble contains three columns: tau for the time-to-maturity specified by the user in the tau argument, .discount for the estimated discount function at this time and this time-to-maturity, and .yield for the estimated yield curve.
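
The nested structure can be flattened with tidyr::unnest(). The sketch below builds a mock object that mimics the documented columns rather than calling ycevo(), so it runs without bond data. The values are made up for illustration, and the discount column is assumed to be named .discount.

```r
library(tibble)
library(tidyr)

# Mock object mimicking the documented return structure (values are made up)
res <- tibble(
  qdate = as.Date("2023-03-01"),
  .est = list(tibble(
    tau = c(1, 5, 10),
    .discount = c(0.97, 0.85, 0.70),
    .yield = c(0.030, 0.033, 0.036)
  ))
)

# Flatten the nested column into one row per qdate and tau
flat <- unnest(res, .est)
print(flat)
```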

Functions

  • estimate_yield(): Experienced users only. Yield estimation with interest rate and manually selected bandwidth parameters. Returns a data frame of the yield and discount rate at each combination of the provided grids.

    discount

    Estimated discount rate

    xgrid

    Same as input 'xgrid'

    tau

    Same as input 'tau'

    yield

    Estimated yield

References

Koo, B., La Vecchia, D., & Linton, O. (2021). Estimation of a nonparametric model for bond prices from cross-section and time series information. Journal of Econometrics, 220(2), 562-588.

See Also

augment.ycevo(), autoplot.ycevo()

Examples

# Simulating bond data
bonds <- ycevo_data(n = 10)

# Estimation can take up to 30 seconds
ycevo(bonds, x = lubridate::ymd("2023-03-01"))



FinYang/ycevo documentation built on April 10, 2024, 8:17 a.m.