sp500: S&P 500 Daily Log Returns and Corresponding Dates

sp500 R Documentation

S&P 500 Daily Log Returns and Corresponding Dates

Description

This dataset contains daily log returns for 186 stocks in the S&P 500 index from February 6, 2004, to March 2, 2016. The daily log returns are calculated using the adjusted daily closing prices. The dataset also contains the corresponding dates for each log return.
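
As a minimal sketch of how a daily log return follows from adjusted closing prices, the snippet below applies the standard definition r_t = log(P_t) - log(P_{t-1}) to made-up prices. The vector adj_close is hypothetical and not part of the dataset, and the snippet does not claim to reproduce the exact preprocessing used to build sp500.

```r
# Hypothetical adjusted closing prices for one stock over five trading days
adj_close <- c(100, 102, 101, 103, 103)

# Daily log return: r_t = log(P_t) - log(P_{t-1})
log_return <- diff(log(adj_close))

# There is one fewer return than prices, because the first day has no predecessor
length(log_return)
```

Log returns are additive over time, so summing consecutive daily values gives the log return over the whole period.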

Usage

data(sp500)

Format

A list with two elements:

sp500$log_daily_return

A matrix with dimensions 3037 (rows, trading days) by 186 (columns, stocks).

sp500$date

A vector of length 3037, containing the dates for each trading day.

Details

The dataset is provided as an .RData file containing:

  • sp500$log_daily_return: A matrix of daily log returns with 3037 rows (trading days) and 186 columns (stocks).

  • sp500$date: A vector of length 3037 containing the dates for each daily log return.
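
The layout described above can be illustrated with a small stand-in object. This is only a sketch: mock is a hypothetical list with made-up values and tiny dimensions, and it uses consecutive calendar days rather than actual trading days. It shows how row i of the return matrix lines up with element i of the date vector.

```r
# Stand-in with the same layout as sp500, but with tiny, made-up dimensions
set.seed(1)
n_days <- 6
n_stocks <- 3
mock <- list(
  log_daily_return = matrix(rnorm(n_days * n_stocks, sd = 0.01),
                            nrow = n_days, ncol = n_stocks),
  date = seq(as.Date("2004-02-06"), by = "day", length.out = n_days)
)

# Rows are days and columns are stocks, so row i belongs to date[i]
stopifnot(nrow(mock$log_daily_return) == length(mock$date))

# Returns of all stocks on the third day, and that day's date
mock$log_daily_return[3, ]
mock$date[3]
```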

Source

Data from the S&P 500 stock index (2004-2016).

See Also

VAR_cpDetect_Online, get_cps

Examples

# Example usage: applying change point detection to S&P 500 data
# This example applies the change point detection method
# (the VAR_cpDetect_Online function) to the daily log returns
# stored in the sp500 dataset. The code below computes a 22-day rolling
# standard deviation of each stock's returns, averages it across stocks,
# runs the change point detection algorithm, and plots the result.
# Alarms are drawn as semi-transparent red vertical lines and the
# estimated start of each alarm cluster as a black vertical line.

# Load the dataset
data(sp500)

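# Load the plotting package, fix the random seed, and record the dimensions
library(ggplot2)
set.seed(2024)
n_sp <- nrow(sp500$log_daily_return)  # number of trading days
p_sp <- ncol(sp500$log_daily_return)  # number of stocks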

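# Average rolling volatility: for each stock, take the standard deviation
# of its returns over every 22-day window (rows row to row + 21), then
# average the resulting series across all stocks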
volatility_sum <- rep(0, (n_sp - 21))
for(col in 1:p_sp){
  temp <- as.numeric(sp500$log_daily_return[, col])
  temp1 <- rep(0, (n_sp - 21))
  for(row in 1:(n_sp - 21)){
    temp1[row] <- sd(temp[(row):(row + 21)])
  }
  volatility_sum <- volatility_sum + temp1
}
volatility_ave <- volatility_sum / p_sp

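# Apply the change point detection method. The return matrix is transposed,
# so it is passed with stocks in rows and trading days in columns.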
n0 <- 200
w <- 22
alpha <- 1 / 5000

res <- VAR_cpDetect_Online(t(sp500$log_daily_return), n0, w, alpha, 1, FALSE, TRUE, 5 / w, TRUE)
res_sp <- res$alarm_locations + n0
res_sp_cps <- res$cp_locations + n0
# Get the estimated starting points of each alarm cluster
cps_est_sp <- unique(res_sp_cps[which(res_sp %in% get_cps(res_sp, w))])

# Prepare data for plotting
y_values <- c(volatility_ave)
x_values <- sp500$date[1:(n_sp - 21)]
df <- data.frame(y_values, x_values)
plot_sp <- ggplot(df, aes(y = y_values, x = x_values)) +
  geom_line() +
  theme(legend.position = "none") +
  labs(title = "", x = "", y = "") +
  scale_x_date(date_breaks = "1 year", date_labels = "%Y") +
  geom_vline(xintercept = sp500$date[res_sp], linetype = "solid", color = "red", alpha = .1) +
  geom_vline(xintercept = sp500$date[cps_est_sp], linetype = "solid", color = "black")

# Print the dates on which each alarm cluster starts
sp500$date[cps_est_sp]
# Display the plot
plot_sp
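
The nested loop used above for the average rolling volatility can also be written with a small helper. The following is a sketch under stated assumptions, not part of the package: rolling_sd is a hypothetical function, and the returns matrix holds made-up values standing in for sp500$log_daily_return.

```r
# Rolling standard deviation over windows of `width` consecutive values.
# `rolling_sd` is a hypothetical helper, not part of VARcpDetectOnline.
rolling_sd <- function(x, width) {
  n_windows <- length(x) - width + 1
  vapply(seq_len(n_windows),
         function(i) sd(x[i:(i + width - 1)]),
         numeric(1))
}

# Made-up returns: 30 days by 4 stocks (stand-in for sp500$log_daily_return)
set.seed(2024)
returns <- matrix(rnorm(30 * 4, sd = 0.01), nrow = 30, ncol = 4)

# One column of rolling volatilities per stock, then the cross-stock average
vol_by_stock <- apply(returns, 2, rolling_sd, width = 22)
vol_ave <- rowMeans(vol_by_stock)
length(vol_ave)
```

With width = 22 the helper yields n - 21 windows for a series of length n, which matches the length of volatility_ave in the example above.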


VARcpDetectOnline documentation built on April 12, 2025, 1:44 a.m.