library(learnr)
library(tidyverse) # loads dplyr, ggplot2, and others
library(tidyquant)
library(fpp2)
library(ggpmisc)
library(tsfe)
tutorial_options(exercise.timelimit = 60, exercise.eval = TRUE)
knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE)
data("ftse350")
tickers <- c("GRG", "BT.A", "ULVR")
ftse350 %>% filter(ticker %in% tickers) -> port
port_ret <- port %>%
  filter(variable == "Price") %>%
  arrange(ticker, date) %>% # ensure data is ordered by stock and then chronologically
  group_by(ticker) %>% # ensures lag() operates only within each ticker
  mutate(log_return = log(value) - log(lag(value)))
port_ret_ew <- port_ret %>%
  group_by(date) %>%
  summarise(ret = mean(log_return))
port_w <- port %>%
  select(ticker, date, variable, value) %>%
  spread(variable, value)
port_ret_vw <- port_w %>%
  group_by(date) %>%
  mutate(weight = `Market Value` / sum(`Market Value`))
port_ret_vw <- port_ret_vw %>%
  arrange(ticker, date) %>%
  group_by(ticker) %>%
  mutate(log_return = log(Price) - log(lag(Price))) %>%
  group_by(date) %>%
  summarise(ret = sum(weight * log_return))
port_ret_both <- port_ret_vw %>%
  rename(vw_ret = ret) %>%
  left_join(port_ret_ew %>% rename(ew_ret = ret), by = "date")
ni_hsales_ts <- ts(ni_hsales$`Total Verified Sales`, start = c(2005, 1), frequency = 4)
usuk_rate_ts <- ts(usuk_rate$price)
In this tutorial, you will learn how to summarise a table of data, including:
filter()
I've preloaded the packages for this tutorial with
library(tidyverse) # loads dplyr, ggplot2, and others
library(tidyquant)
library(ggpmisc)
library(fpp2)
In the tsfe package, open the ftse350 data and use the glimpse() function to explore it. Describe what you see. Click Run Code to see the data.
tsfe::ftse350 %>% glimpse
Which are the top 100 stocks in terms of market capitalisation? Challenge: filter the data to rank stocks on market size. Hint: some code to start:
ftse350 %>% filter(variable=="Market Value") %>% group_by(date) %>%
Hint: Use rank = min_rank(desc(value)) to rank value (for example) so that the largest market cap is ranked first.
ftse350 %>%
  filter(variable == "Market Value") %>%
  group_by(date) %>%
  mutate(rank = min_rank(desc(value))) %>%
  filter(rank %in% c(1:100))
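To see how the ranking step behaves, here is a minimal sketch on made-up data. The tickers AAA, BBB and CCC and their values are hypothetical and are not taken from ftse350; the example keeps the top 2 rather than the top 100 so the output is easy to check by eye.

```r
library(dplyr)

# Hypothetical toy data: three made-up stocks observed on two dates
toy <- tibble(
  date   = as.Date(c("2020-01-01", "2020-01-01", "2020-01-01",
                     "2020-01-02", "2020-01-02", "2020-01-02")),
  ticker = c("AAA", "BBB", "CCC", "AAA", "BBB", "CCC"),
  value  = c(100, 300, 200, 250, 150, 50)
)

top2 <- toy %>%
  group_by(date) %>%                         # rank within each date
  mutate(rank = min_rank(desc(value))) %>%   # rank 1 = largest value on that date
  filter(rank <= 2) %>%
  arrange(date, rank) %>%
  ungroup()

top2
```

Because the data are grouped by date first, each date gets its own ranking, so a stock can enter or leave the top group from one day to the next.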
Extract the prices and market values for the following three stocks. Name the resulting data frame port.
tickers<-c("GRG","BT.A","ULVR") ## these are the tickers of the above stocks
tickers <- c("GRG", "BT.A", "ULVR")
ftse350 %>% filter(ticker %in% tickers) -> port
The port data which you created is in tidy form, which is one observation per row. For the purposes of portfolio analytics we will need to filter on one variable to create returns.
Create log returns for each daily price series using the mutate() function in the dplyr package. Recall the formula
$$r_t=\ln(P_t)-\ln(P_{t-1})$$
Hint: use lag() inside mutate() to obtain $P_{t-1}$.
port_ret <- port %>%
  filter(variable == "Price") %>%
  arrange(ticker, date) %>% # ensure data is ordered by stock and then chronologically
  group_by(ticker) %>% # ensures lag() operates only within each ticker
  mutate(log_return = log(value) - log(lag(value)))
port_ret %>% head() # a sanity check to ensure that log returns are calculated correctly
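The grouping step matters because lag() would otherwise carry the last price of one stock into the first row of the next. A minimal sketch on hypothetical prices (the tickers AAA and BBB and their values are invented for illustration) shows the expected pattern: one missing return at the start of each ticker, and returns that match log(P_t / P_{t-1}).

```r
library(dplyr)

# Hypothetical prices for two made-up tickers over three days
toy_prices <- tibble(
  ticker = c("AAA", "AAA", "AAA", "BBB", "BBB", "BBB"),
  date   = as.Date(rep(c("2020-01-01", "2020-01-02", "2020-01-03"), 2)),
  value  = c(100, 110, 99, 50, 50, 55)
)

toy_ret <- toy_prices %>%
  arrange(ticker, date) %>%
  group_by(ticker) %>%                                  # lag() restarts for each ticker
  mutate(log_return = log(value) - log(lag(value))) %>%
  ungroup()

toy_ret
```

The first row of each ticker has an NA return because there is no earlier price to compare with; an unchanged price (BBB on the second day) gives a return of exactly zero.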
Plot each log returns series in such a way as to compare their volatility over time using the aesthetic colour. Hint: try using facet_wrap() in ggplot2 to create separate plots.
port_ret %>%
  ggplot(aes(x = date, y = log_return, colour = ticker)) +
  geom_line()
Plot each log returns series in such a way as to compare their volatility over time. Hint: try using facet_wrap() in ggplot2 to create separate plots.
port_ret %>%
  ggplot(aes(x = date, y = log_return, colour = ticker)) +
  geom_line() +
  facet_wrap(~ticker)
Another way to compare these series is to consider the extreme (or outlying) returns. Conventionally, you might want to consider values beyond the 95^th^ percentile by using stat_peaks() and stat_valleys() from the ggpmisc package. This package may have to be installed on your machine.
# install.packages("ggpmisc")
library(ggpmisc)
port_ret %>%
  ggplot(aes(x = date, y = log_return, colour = ticker)) +
  geom_line() +
  stat_peaks(colour = "red", ignore_threshold = 0.95) +
  stat_valleys(colour = "blue", ignore_threshold = 0.95) +
  facet_wrap(~ticker)
From the plot we can see that Greggs has two days where the returns are above the 95th percentile of peak, while BT and ULVR have one. Furthermore, Greggs and ULVR have two days where returns are below the 95th percentile of trough. When doing your project, how would you deal with outlying observations?
In our experience, the unexpected is usually not an "outlier", or an aberrant point, but rather a systematic pattern in some part of the data - Gelman et al. (2020)
In this topic you will calculate two daily portfolio return series for a portfolio containing the three stocks, using log returns and market value.
Create an equally weighted returns series for the three stocks, then plot the resultant return series. Hint: the mean is an equally weighted statistic.
Then use the portfolio return formula
$$r_{p,t} \approx \sum_{i=1}^{N}w_ir_{it}$$
port_ret_ew <- port_ret %>%
  group_by(date) %>%
  summarise(ret = mean(log_return))
port_ret_ew %>%
  ggplot(aes(x = date, y = ret)) +
  geom_line()
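The hint that the mean is an equally weighted statistic can be checked directly. The sketch below uses three hypothetical log returns for a single day (the numbers are invented) and compares the weighted sum with weights of 1/N against mean(). It also computes the exact log return of the equally weighted portfolio, to illustrate why the formula is written with an approximation sign.

```r
# Hypothetical log returns for three stocks on one day
r <- c(0.010, -0.020, 0.030)
w <- rep(1 / length(r), length(r))   # equal weights, 1/N each

ew_sum  <- sum(w * r)   # portfolio return formula with equal weights
ew_mean <- mean(r)      # the shortcut used inside summarise()

# Exact log return of the portfolio: log of the weighted average gross return
exact <- log(sum(w * exp(r)))

c(approx = ew_sum, exact = exact)
```

For daily-sized returns like these, the two numbers differ only in the fourth decimal place; the gap widens as the individual returns move further from zero.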
Create a value-weighted returns series for the three stocks, then plot the resultant return series. Hint: use the data with both price and market value.
$$w_{it}= \frac{V_{it}}{\sum_{j=1}^{N} V_{jt}} \text{ where } V_{it}=\text{Quantity}_{it} \times P_{it}$$
Then use the portfolio return formula
$$r_{p,t} \approx \sum_{i=1}^{N}w_ir_{it}$$
port_w <- port %>%
  select(ticker, date, variable, value) %>%
  spread(variable, value)
port_ret_vw <- port_w %>%
  group_by(date) %>%
  mutate(weight = `Market Value` / sum(`Market Value`))
# Sanity check
port_ret_vw %>%
  summarise(tw = sum(weight)) %>%
  filter(tw > 1.0001)
# market value weighted returns
port_ret_vw <- port_ret_vw %>%
  arrange(ticker, date) %>%
  group_by(ticker) %>%
  mutate(log_return = log(Price) - log(lag(Price))) %>%
  group_by(date) %>%
  summarise(ret = sum(weight * log_return))
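The weighting step can be sketched in isolation on hypothetical market values (the tickers and figures are made up). Grouping by date before dividing means each stock's market value is scaled by the total for that day, so the weights on every date should sum to one.

```r
library(dplyr)

# Hypothetical market values for three made-up tickers on two dates
toy_w <- tibble(
  date           = as.Date(rep(c("2020-01-01", "2020-01-02"), each = 3)),
  ticker         = rep(c("AAA", "BBB", "CCC"), 2),
  `Market Value` = c(100, 300, 600, 200, 200, 400)
)

toy_weights <- toy_w %>%
  group_by(date) %>%                                           # totals are per date
  mutate(weight = `Market Value` / sum(`Market Value`)) %>%
  ungroup()

# Sanity check: total weight per date
check <- toy_weights %>%
  group_by(date) %>%
  summarise(tw = sum(weight))

check
```

A total that differs noticeably from one on any date would point to a grouping mistake or to missing market values for that day.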
Combine the portfolio returns using left_join(). Hint: you need to choose a merging variable which uniquely identifies each observation in the portfolio returns time series. Name the new object port_ret_both.
port_ret_both <- port_ret_vw %>%
  rename(vw_ret = ret) %>%
  left_join(port_ret_ew %>% rename(ew_ret = ret), by = "date")
Plot and visually compare the value-weighted returns to the equally weighted returns. Provide some rationale for the differences. Hint: use cumsum() to plot the wealth creation in the daily returns series.
port_ret_both %>%
  ungroup() %>%
  drop_na() %>%
  mutate(wealth_vw = cumsum(vw_ret),
         wealth_ew = cumsum(ew_ret)) %>%
  select(date, wealth_vw, wealth_ew) %>%
  gather(port, wealth, -date) %>%
  ggplot(aes(x = date, y = wealth, colour = port)) +
  geom_line()
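The reason cumsum() works here is that log returns add up over time: the running sum of log returns equals the log of the price relative to its starting value. A minimal sketch on a hypothetical price path (the four prices are invented) makes the link explicit, and shows that exponentiating the running sum recovers the growth of one unit invested.

```r
# Hypothetical price path for a single asset
prices <- c(100, 105, 102, 110)

log_ret <- diff(log(prices))   # daily log returns
cum_log <- cumsum(log_ret)     # cumulative log return, as plotted in the exercise
wealth  <- exp(cum_log)        # value of 1 unit invested at the first price

data.frame(cum_log = cum_log, wealth = wealth)
```

The plotted series in the exercise is therefore on a log scale; applying exp() to it would give wealth in levels.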
We have introduced the following graphics functions:
gglagplot
ggAcf
Explore the following time series from the tsfe package using these functions. Can you spot any seasonality, cyclicity and trend? What do you learn about the series?
ni_hsales_ts
usuk_rate_ts
ftse_m_ts
First, create time series objects of the data from the tsfe package. Note that ftse_m_ts is already in the package data.
ni_hsales_ts <- ts(ni_hsales$`Total Verified Sales`, start = c(2005, 1), frequency = 4)
usuk_rate_ts <- ts(usuk_rate$price)
Plot and visually compare the above time series objects. Then use some visual aids to identify autocorrelation. What do you find? Hint: remember the autocorrelation plots from the lecture.
autoplot(ni_hsales_ts)
autoplot(usuk_rate_ts)
autoplot(ftse_m_ts)
gglagplot(ni_hsales_ts)
gglagplot(usuk_rate_ts)
gglagplot(ftse_m_ts)
ggAcf(ni_hsales_ts)
ggAcf(usuk_rate_ts)
ggAcf(ftse_m_ts)
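When reading the autocorrelation plots, it can help to have two reference cases in mind. The sketch below uses simulated data rather than the tsfe series, and base R's acf() with plot = FALSE instead of ggAcf(), purely so the numbers can be inspected: white noise should show autocorrelations near zero, while a random walk (a trending, price-like series) should show autocorrelations close to one that decay slowly.

```r
set.seed(123)

noise <- rnorm(500)      # simulated white noise: no memory
walk  <- cumsum(noise)   # simulated random walk: strong persistence

# Sample autocorrelations at lags 1 to 5 (element 1 is lag 0, so drop it)
acf_noise <- acf(noise, lag.max = 5, plot = FALSE)$acf[-1]
acf_walk  <- acf(walk,  lag.max = 5, plot = FALSE)$acf[-1]

round(rbind(noise = acf_noise, walk = acf_walk), 2)
```

A series in levels that behaves like the random walk case often looks much closer to the white noise case once it is converted to returns or differences.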
question("Explain when it is **not** appropriate to use the above approximation for portfolio returns",
  answer("When the returns are far from zero, for example when the time interval is long, for instance quarterly or annual returns which are likely to be far from zero", correct = TRUE),
  answer("When the returns are close to zero, for example when the time interval is long, for instance quarterly or annual returns which are likely to be far from zero."),
  answer("When the returns are far from zero, for example when the time interval is short, for instance daily."),
  answer("When the returns are far from 1"),
  allow_retry = TRUE
)
question("Why would an analyst prefer to use total returns rather than price returns when assessing the performance of a portfolio investment?",
  answer("Total returns capture the income gains from holding a stock"),
  answer("Total returns capture the income and capital gains from holding a stock", correct = TRUE),
  answer("Total returns capture the capital and price gains from holding a stock"),
  answer("Total returns capture capital income and CEO turnover activity"),
  allow_retry = TRUE
)
question("What investment assumption does the total return calculation make?",
  answer("In the long term dividend income becomes an increasingly important part of the total return opportunity for investing.", correct = TRUE),
  answer("In the long term dividend income becomes a decreasingly important part of the total return opportunity for investing."),
  answer("In the long term dividend income is not an important part of the total return opportunity for investing."),
  answer("Total returns capture capital income and merger activity"),
  allow_retry = TRUE
)