vpin_measures: Estimation of Volume-Synchronized PIN model (vpin) and the...

vpin_measures    R Documentation

Estimation of Volume-Synchronized PIN model (vpin) and the improved volume-synchronized PIN model (ivpin)

Description

Estimates the Volume-Synchronized Probability of Informed Trading (VPIN) as developed in Easley et al. (2011) and Easley et al. (2012).
Estimates the improved Volume-Synchronized Probability of Informed Trading (IVPIN) as developed in Ke and Lin (2017).

Usage

vpin(
  data,
  timebarsize = 60,
  buckets = 50,
  samplength = 50,
  tradinghours = 24,
  verbose = TRUE
)

ivpin(
  data,
  timebarsize = 60,
  buckets = 50,
  samplength = 50,
  tradinghours = 24,
  grid_size = 5,
  verbose = TRUE
)

Arguments

data

A dataframe with 3 variables: {timestamp, price, volume}.

timebarsize

An integer referring to the size of timebars in seconds. The default value is 60.

buckets

An integer referring to the number of buckets into which the average daily volume is divided. The default value is 50.

samplength

An integer referring to the sample length or the window size used to calculate the VPIN vector. The default value is 50.

tradinghours

An integer referring to the length of daily trading sessions in hours. The default value is 24.

verbose

A logical variable that determines whether detailed information about the steps of the estimation of the VPIN (IVPIN) model is displayed. No output is produced when verbose is set to FALSE. The default value is TRUE.

grid_size

An integer between 1 and 20 representing the size of the grid used in the estimation of IVPIN. The default value is 5. See Details for more information.
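To make the roles of buckets and samplength concrete, here is a minimal sketch of the standard VPIN definition from Easley et al. (2012): the bucket size is the average daily volume divided by buckets, and each VPIN value is the sum of absolute buy/sell imbalances over the last samplength buckets, divided by samplength times the bucket size. All numbers and variable names below are made up for illustration, and the code is not taken from the package internals.

```r
# Hypothetical daily volumes; the bucket size is their mean divided by 'buckets'
daily_volume <- c(48000, 52000, 50000)
buckets <- 50
vbs <- mean(daily_volume) / buckets   # volume per bucket

# Made-up buy volumes for 6 consecutive buckets; sell volume is the remainder
buy  <- c(600, 450, 700, 500, 300, 550)
sell <- vbs - buy

# Rolling window of 'samplength' buckets (kept small here for readability)
samplength <- 3
imbalance <- abs(buy - sell)

# One-sided moving sum: entry i adds up buckets i - samplength + 1 to i
roll_sum <- stats::filter(imbalance, rep(1, samplength), sides = 1)
vpin_sketch <- as.numeric(roll_sum) / (samplength * vbs)

print(round(vpin_sketch, 4))
```

In this sketch the first samplength - 1 entries are NA because the window is not yet full.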

Details

The dataframe data should contain at least three variables. Only the first three variables are used, and they must appear in the following order: {timestamp, price, volume}.
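Because only the first three columns are read, and in a fixed order, it can help to reorder the columns explicitly before estimation. The sketch below uses a hypothetical data frame named trades with made-up values; only the column names timestamp, price, and volume come from this page.

```r
# Hypothetical raw data whose columns are in an arbitrary order
trades <- data.frame(
  ask       = c(15.02, 15.03, 15.03),
  volume    = c(100, 250, 80),
  price     = c(15.01, 15.02, 15.02),
  timestamp = c("2018-10-12 09:30:01",
                "2018-10-12 09:30:04",
                "2018-10-12 09:30:09")
)

# Keep the three required variables, in the order {timestamp, price, volume}
xdata <- trades[, c("timestamp", "price", "volume")]

str(xdata)
```

The resulting xdata could then be passed as the data argument.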

The argument timebarsize is expressed in seconds, which allows the user to work with intervals shorter than 1 minute. The default value is set to 1 minute (60 seconds) following Easley et al. (2011, 2012).
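As an illustration of what a time bar of a given size means, the following sketch groups a handful of made-up trades into bars and sums their volume per bar. The helper bar_volume() is hypothetical and written only for this example; it is not a function of the package and may differ from how the package builds its time bars.

```r
# Hypothetical helper: total traded volume per time bar of 'timebarsize' seconds
bar_volume <- function(ts, volume, timebarsize) {
  # Index of the bar each trade falls into, counted from the Unix epoch
  bar_id <- floor(as.numeric(ts) / timebarsize)
  as.numeric(tapply(volume, bar_id, sum))
}

# Made-up trades
ts <- as.POSIXct(c("2018-10-12 09:30:01", "2018-10-12 09:30:45",
                   "2018-10-12 09:31:10", "2018-10-12 09:33:59"), tz = "UTC")
volume <- c(100, 250, 80, 40)

# One-minute bars: the first two trades share a bar
vol_60 <- bar_volume(ts, volume, timebarsize = 60)

# Thirty-second bars: every trade lands in its own bar
vol_30 <- bar_volume(ts, volume, timebarsize = 30)

print(vol_60)
print(vol_30)
```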

The argument tradinghours is used to correct the duration per bucket if the market trading session does not cover a full day (24 hours). The duration of a given bucket is the difference between the timestamp of the last trade in the bucket (endtime) and the timestamp of the first trade in the bucket (stime). If the first and last trades in a bucket occur on different days, and the market trading session is shorter than 24 hours, the uncorrected duration is inflated by the time during which the market is closed. For example, if the daily trading session is 8 hours (tradinghours = 8), and the start time of a bucket is 2018-10-12 17:06:40 and its end time is 2018-10-13 09:36:00, the straightforward calculation gives a duration of 59,360 secs. However, this duration includes 16 hours when the market is closed. The corrected duration considers only the market activity time: duration = 59,360 - 16 * 3600 = 1,760 secs, approximately 30 minutes.
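The arithmetic of this example can be reproduced in a few lines of base R. The sketch below assumes that each calendar-day boundary crossed by a bucket contributes one closed period of (24 - tradinghours) hours; that generalization is an assumption made for illustration and is not taken from the package source.

```r
# Bucket boundaries from the example above (time zone fixed to UTC for clarity)
stime   <- as.POSIXct("2018-10-12 17:06:40", tz = "UTC")
endtime <- as.POSIXct("2018-10-13 09:36:00", tz = "UTC")
tradinghours <- 8

# Uncorrected duration in seconds
raw_duration <- as.numeric(difftime(endtime, stime, units = "secs"))

# Number of day boundaries crossed (assumption: one closed period per boundary)
days_spanned <- as.numeric(as.Date(format(endtime, "%Y-%m-%d")) -
                           as.Date(format(stime, "%Y-%m-%d")))

# Seconds during which the market is closed, then the corrected duration
closed_secs <- days_spanned * (24 - tradinghours) * 3600
corrected_duration <- raw_duration - closed_secs

print(raw_duration)        # 59360
print(corrected_duration)  # 1760
```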

The argument grid_size determines the size of the grid for the variables alpha and delta, used to generate the initial parameter sets that prime the maximum-likelihood estimation step of the algorithm by Ke and Lin (2017) for estimating IVPIN. If grid_size is set to a value m, the algorithm creates a sequence starting from 1 / (2m) and ending at 1 - 1 / (2m), with a step of 1 / m. The default value of 5 corresponds to the grid size used by Yan and Zhang (2012), where the sequence starts at 0.1 = 1 / (2 * 5) and ends at 0.9 = 1 - 1 / (2 * 5) with a step of 0.2 = 1 / 5. Increasing the value of grid_size increases the running time and may marginally improve the accuracy of the IVPIN estimates.
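The sequence described above is straightforward to generate in base R. The sketch below builds it for the default grid_size and then pairs every alpha value with every delta value; the Cartesian pairing is an assumption about how the initial parameter sets are formed, shown only for illustration.

```r
grid_size <- 5

# Sequence from 1/(2m) to 1 - 1/(2m) with a step of 1/m
grid <- seq(from = 1 / (2 * grid_size),
            to   = 1 - 1 / (2 * grid_size),
            by   = 1 / grid_size)
print(grid)  # 0.1 0.3 0.5 0.7 0.9

# Assumption: one initial parameter set per (alpha, delta) combination
initial_sets <- expand.grid(alpha = grid, delta = grid)
print(nrow(initial_sets))  # 25
```

Under this assumption the number of starting points grows with the square of grid_size, which is consistent with the longer running time noted for larger grids.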

Value

Returns an object of class estimate.vpin, which contains the following slots:

@improved

A logical variable that takes the value FALSE when the classical VPIN model is estimated (using vpin()), and TRUE when the improved VPIN model is estimated (using ivpin()).

@bucketdata

A data frame created as in Abad and Yagüe (2012).

@vpin

A vector of VPIN values.

@ivpin

A vector of IVPIN values, which remains empty when the function vpin() is called.

References

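Abad D, Yagüe J (2012). "From PIN to VPIN: An introduction to order flow toxicity." The Spanish Review of Financial Economics.

Easley D, López de Prado MM, O'Hara M (2011). "The microstructure of the 'flash crash': Flow toxicity, liquidity crashes, and the probability of informed trading." The Journal of Portfolio Management.

Easley D, López de Prado MM, O'Hara M (2012). "Flow toxicity and liquidity in a high-frequency world." The Review of Financial Studies.

Ke W, Lin HW (2017). "An improved version of the volume-synchronized probability of informed trading." Critical Finance Review.

Yan Y, Zhang S (2012). "An improved estimation method and empirical properties of the probability of informed trading." Journal of Banking & Finance.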

Examples

# The package includes a preloaded dataset called 'hfdata'.
# It is an artificially created high-frequency trading dataset
# containing 100,000 trades and five variables: 'timestamp', 'price',
# 'volume', 'bid', and 'ask'. For more information, type ?hfdata.

xdata <- hfdata

### Estimation of the VPIN model ###

# Estimate the VPIN model using the following parameters:
# - timebarsize: 5 minutes (300 seconds)
# - buckets: 50 buckets per average daily volume
# - samplength: 250 for the VPIN calculation

estimate <- vpin(xdata, timebarsize = 300, buckets = 50, samplength = 250)

# Display a description of the VPIN estimate

show(estimate)

# Display the parameters of the VPIN estimates

show(estimate@parameters)

# Display the summary statistics of the VPIN vector
summary(estimate@vpin)

# Store the computed data of the different buckets in a dataframe 'buckets'
# and display the first 10 rows of the dataframe.

buckets <- estimate@bucketdata
show(head(buckets, 10))

# Store the contents of the @dailyvpin slot in a dataframe 'dayvpin'
# and display the first 10 rows of the dataframe.
dayvpin <- estimate@dailyvpin
show(head(dayvpin, 10))


### Estimation of the IVPIN model ###

# Estimate the IVPIN model using the same parameters as above.
# The grid_size parameter is unspecified and will default to 5.

iestimate <- ivpin(xdata, timebarsize = 300, samplength = 250, verbose = FALSE)

# Display the summary statistics of the IVPIN vector
summary(iestimate@ivpin)

# The output of ivpin() also contains the VPIN vector in the @vpin slot.
# Plot the VPIN and IVPIN vectors in the same plot using the iestimate object.

# Define the range for the VPIN and IVPIN vectors, removing NAs.
vpin_range <- range(c(iestimate@vpin, iestimate@ivpin), na.rm = TRUE)

# Plot the VPIN vector in blue
plot(iestimate@vpin, type = "l", col = "blue", ylim = vpin_range,
     ylab = "Value", xlab = "Bucket", main = "Plot of VPIN and IVPIN")

# Add the IVPIN vector in red
lines(iestimate@ivpin, type = "l", col = "red")

# Add a legend to the plot
legend("topright", legend = c("VPIN", "IVPIN"), col = c("blue", "red"),
 lty = 1,
 cex = 0.6,  # Adjust the text size
 x.intersp = 1.2,  # Adjust the horizontal spacing
 y.intersp = 2,  # Adjust the vertical spacing
 inset = c(0.05, 0.05))  # Adjust the position slightly


monty-se/PINstimation documentation built on Oct. 22, 2024, 8:04 p.m.