vpin_measures | R Documentation |
Estimates the Volume-Synchronized Probability of Informed
Trading as developed in Easley et al. (2011)
and Easley et al. (2012).
Estimates the improved Volume-Synchronized Probability of Informed
Trading as developed in Ke and Lin (2017).
vpin(
  data,
  timebarsize = 60,
  buckets = 50,
  samplength = 50,
  tradinghours = 24,
  verbose = TRUE
)
ivpin(
  data,
  timebarsize = 60,
  buckets = 50,
  samplength = 50,
  tradinghours = 24,
  grid_size = 5,
  verbose = TRUE
)
data |
A dataframe with 3 variables: {timestamp, price, volume}.
|
timebarsize |
An integer referring to the size of timebars
in seconds. The default value is 60. |
buckets |
An integer referring to the number of buckets in a
daily average volume. The default value is 50. |
samplength |
An integer referring to the sample length
or the window size used to calculate the VPIN vector.
The default value is 50. |
tradinghours |
An integer referring to the length of daily
trading sessions in hours. The default value is 24. |
verbose |
A logical variable that determines whether detailed
information about the steps of the estimation of the VPIN (IVPIN) model is
displayed. No output is produced when verbose is set to FALSE.
The default value is TRUE. |
grid_size |
An integer setting the size of the grid for alpha and delta used in
the estimation of IVPIN (ivpin() only). The default value is 5. |
The dataframe data should contain at least three variables. Only the
first three variables will be considered, in the following order:
{timestamp, price, volume}.
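Because only the first three columns are read, and in a fixed order, it can
help to reorder a dataframe explicitly before passing it on. The snippet
below is a minimal sketch; the dataframe raw and its column bid are
hypothetical and stand in for any trade data whose columns are not yet in
the expected order.

```r
# Hypothetical raw trade data with columns in an arbitrary order.
raw <- data.frame(
  volume    = c(100, 250, 75),
  bid       = c(9.99, 10.00, 10.01),
  timestamp = c("2018-10-12 09:30:00", "2018-10-12 09:30:05",
                "2018-10-12 09:30:09"),
  price     = c(10.00, 10.02, 10.01),
  stringsAsFactors = FALSE
)

# Keep the three required variables, in the order {timestamp, price, volume}.
xdata <- raw[, c("timestamp", "price", "volume")]
names(xdata)
```

The resulting xdata can then be used as the data argument.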
The argument timebarsize is expressed in seconds, which lets the user work
with intervals shorter than 1 minute. The default value is set to 1 minute
(60 seconds) following Easley et al. (2011, 2012).
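To illustrate what a sub-minute timebar means, the sketch below groups a
few trades into 30-second bars and sums their volume. It is only an
illustration under stated assumptions: the trades dataframe is made up, and
the grouping shown here is not taken from the package's internal code.

```r
timebarsize <- 30  # seconds; shorter than the 60-second default

# Hypothetical trades, for illustration only.
trades <- data.frame(
  timestamp = as.POSIXct(c("2018-10-12 09:30:05", "2018-10-12 09:30:20",
                           "2018-10-12 09:30:40", "2018-10-12 09:31:10"),
                         tz = "UTC"),
  price  = c(10.00, 10.02, 10.01, 10.03),
  volume = c(100, 50, 200, 75)
)

# Index of the timebar each trade falls into (whole bars since the epoch).
bar_id <- floor(as.numeric(trades$timestamp) / timebarsize)

# Total traded volume per timebar.
bar_volume <- tapply(trades$volume, bar_id, sum)
as.numeric(bar_volume)
```

The first two trades share a bar, while the last two each fall into a bar
of their own.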
The argument tradinghours is used to correct the duration per bucket if the
market trading session does not cover a full day (24 hours). The duration
of a given bucket is the difference between the timestamp of the last trade
endtime and the timestamp of the first trade stime in the bucket. If the
first and last trades in a bucket occur on different days, and the market
trading session is shorter than 24 hours, the bucket's duration will be
inflated. For example, if the daily trading session is 8 hours
(tradinghours = 8), and the start time of a bucket is 2018-10-12 17:06:40
and its end time is 2018-10-13 09:36:00, the straightforward calculation
gives a duration of 59,360 secs. However, this duration includes 16 hours
when the market is closed. The corrected duration considers only the market
activity time: duration = 59,360 - 16 * 3600 = 1,760 secs, approximately
30 minutes.
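The arithmetic of this example can be reproduced in a few lines of base R.
This is a sketch of the correction as described above, not the package's
internal code; it assumes one market closure of (24 - tradinghours) hours
for each calendar-day boundary crossed by the bucket.

```r
tradinghours <- 8

stime   <- as.POSIXct("2018-10-12 17:06:40", tz = "UTC")  # first trade
endtime <- as.POSIXct("2018-10-13 09:36:00", tz = "UTC")  # last trade

# Straightforward duration in seconds, including the overnight closure.
raw_duration <- as.numeric(difftime(endtime, stime, units = "secs"))

# Number of calendar-day boundaries crossed by the bucket (assumption:
# each boundary corresponds to one overnight closure).
days_apart <- as.numeric(as.Date(endtime) - as.Date(stime))

# Remove the hours during which the market is closed.
closed_secs <- days_apart * (24 - tradinghours) * 3600
duration    <- raw_duration - closed_secs

c(raw = raw_duration, corrected = duration)
```

With these inputs the raw duration is 59,360 seconds and the corrected
duration is 1,760 seconds, matching the figures in the text.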
The argument grid_size determines the size of the grid for the variables
alpha and delta, used to generate the initial parameter sets that prime the
maximum-likelihood estimation step of the algorithm by Ke and Lin (2017)
for estimating IVPIN. If grid_size is set to a value m, the algorithm
creates a sequence starting from 1 / (2m) and ending at 1 - 1 / (2m), with
a step of 1 / m. The default value of 5 corresponds to the grid size used
by Yan and Zhang (2012), where the sequence starts at 0.1 = 1 / (2 * 5) and
ends at 0.9 = 1 - 1 / (2 * 5) with a step of 0.2 = 1 / 5. Increasing the
value of grid_size increases the running time and may marginally improve
the accuracy of the IVPIN estimates.
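The sequence described above is easy to reproduce with seq(). The sketch
below builds it for m = 5 and, as an assumption made for illustration only,
pairs every alpha value with every delta value using expand.grid(); the
exact way the package combines the two grids into initial parameter sets
may differ.

```r
m <- 5  # grid_size

# Sequence from 1 / (2m) to 1 - 1 / (2m) with a step of 1 / m.
grid <- seq(from = 1 / (2 * m), to = 1 - 1 / (2 * m), by = 1 / m)
grid

# Assumption: candidate starting points are all (alpha, delta) pairs.
initial_sets <- expand.grid(alpha = grid, delta = grid)
nrow(initial_sets)
```

Under that assumption the number of starting points grows with the square
of grid_size, which is consistent with the longer running time noted above.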
Returns an object of class estimate.vpin, which contains the following
slots:
@improved
A logical variable that takes the value FALSE when the classical VPIN model
is estimated (using vpin()), and TRUE when the improved VPIN model is
estimated (using ivpin()).
@bucketdata
A data frame created as in Abad and Yagüe (2012).
@vpin
A vector of VPIN values.
@ivpin
A vector of IVPIN values, which remains empty when the function vpin() is
called.
# The package includes a preloaded dataset called 'hfdata'.
# It is an artificially created high-frequency trading dataset
# containing 100,000 trades and five variables: 'timestamp', 'price',
# 'volume', 'bid', and 'ask'. For more information, type ?hfdata.
xdata <- hfdata
### Estimation of the VPIN model ###
# Estimate the VPIN model using the following parameters:
# - timebarsize: 5 minutes (300 seconds)
# - buckets: 50 buckets per average daily volume
# - samplength: 250 for the VPIN calculation
estimate <- vpin(xdata, timebarsize = 300, buckets = 50, samplength = 250)
# Display a description of the VPIN estimate
show(estimate)
# Display the parameters of the VPIN estimates
show(estimate@parameters)
# Display the summary statistics of the VPIN vector
summary(estimate@vpin)
# Store the computed data of the different buckets in a dataframe 'buckets'
# and display the first 10 rows of the dataframe.
buckets <- estimate@bucketdata
show(head(buckets, 10))
# Display the first 10 rows of the dataframe 'dayvpin'.
dayvpin <- estimate@dailyvpin
show(head(dayvpin, 10))
### Estimation of the IVPIN model ###
# Estimate the IVPIN model using the same parameters as above.
# The grid_size parameter is unspecified and will default to 5.
iestimate <- ivpin(xdata, timebarsize = 300, samplength = 250, verbose = FALSE)
# Display the summary statistics of the IVPIN vector
summary(iestimate@ivpin)
# The output of ivpin() also contains the VPIN vector in the @vpin slot.
# Plot the VPIN and IVPIN vectors in the same plot using the iestimate object.
# Define the range for the VPIN and IVPIN vectors, removing NAs.
vpin_range <- range(c(iestimate@vpin, iestimate@ivpin), na.rm = TRUE)
# Plot the VPIN vector in blue
plot(iestimate@vpin, type = "l", col = "blue", ylim = vpin_range,
     ylab = "Value", xlab = "Bucket", main = "Plot of VPIN and IVPIN")
# Add the IVPIN vector in red
lines(iestimate@ivpin, type = "l", col = "red")
# Add a legend to the plot
legend("topright", legend = c("VPIN", "IVPIN"), col = c("blue", "red"),
       lty = 1,
       cex = 0.6,             # Adjust the text size
       x.intersp = 1.2,       # Adjust the horizontal spacing
       y.intersp = 2,         # Adjust the vertical spacing
       inset = c(0.05, 0.05)) # Adjust the position slightly