library(ati)
library(PortfolioAnalytics)
library(matlab)
library(corrplot)
library(tidyverse)
library(RColorBrewer)
library(skimr)
library(learnr)
library(fontawesome)
tutorial_options(exercise.timelimit = 60)
tutorial_options(exercise.eval = TRUE)
knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE)
In finance, empirical covariance matrices are often numerically ill-conditioned because a small number of independent observations is used to estimate a large number of parameters. Working with these matrices directly, without treatment, is not recommended.
Even if the covariance matrix is non-singular, and therefore invertible, its small determinant all but guarantees that estimation errors will be greatly magnified by the inversion process.

{width="50%" align="center"}
The practical implication is that these estimation errors cause misallocation of assets and substantial transaction costs due to unnecessary rebalancing. Furthermore, denoising the matrix $\mathbf{XX'}$ before inverting it should help reduce the variance of regression estimates and improve the power of statistical hypothesis tests. For the same reason, covariance matrices derived from regressed factors (also known as factor-based covariance matrices) also require denoising and should not be used without numerical treatment.
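To see why inversion is so fragile, a minimal sketch (not from the tutorial; variable names are illustrative) compares the condition number of a sample covariance matrix as the number of assets N approaches the number of observations T. The larger the condition number, the more the inversion magnifies estimation error.

```r
# Condition number of a sample covariance matrix as N approaches T.
# Uses only base R: rnorm() to simulate i.i.d. returns, cov(), kappa().
set.seed(1)
T_obs <- 60                               # number of observations
for (N in c(5, 30, 55)) {
  X <- matrix(rnorm(T_obs * N), nrow = T_obs, ncol = N)
  S <- cov(X)                             # sample covariance matrix
  cat("N =", N, " condition number =", round(kappa(S), 1), "\n")
}
```

As N grows towards T the condition number explodes, even though every matrix here is technically invertible.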
If you don't want to use this instance, use the RStudio IDE on your local machine, but it is your responsibility to keep it up to date. For local set-up, see the set-up workshop here.
Engage your Yoda growth mindset
{width="30%"}
In this workshop you can learn:
I have preloaded the packages for this tutorial with
library(tidyverse) # loads dplyr, ggplot2, and others
library(PortfolioAnalytics)
library(matlab)
library(fontawesome)
library(corrplot)
library(RColorBrewer)
library(skimr)
library(ati)
Register for an account on GitHub (https://github.com/). We recommend using a username that incorporates your name (e.g. barryquinn1, ckelly66).
If you haven't already, click on this invite https://classroom.github.com/a/GCR_J0yx to clone the repository for workshop 1.
{width="50%"}
Terminal console, then repeat step 4:

```{bash, eval=FALSE}
git config --global user.email "<your email>"
git config --global user.name "<your name>"
```
## Simulating fake data

In quantitative finance we do not have a laboratory where we can securely experiment in a controlled environment. Most financial research is carried out on *Real* or **Big World** data, which is complex, misbehaves, and is uncontrollable. Experimentation in finance is achieved by simulating **Small World** data with known statistical properties which can be controlled. Portfolio data from the **Big World** is usually insufficient to produce meaningful results; this insufficiency can be illustrated by creating some **Small World** random data.

### Ex 1: Fake portfolio data

> Create a portfolio of independently and identically distributed *fake* stock returns. Click `Run Code` to see a fake portfolio created:

```r
stocks <- 20
trading_days <- 40
fake_port <- array(
  rnorm(trading_days * stocks, mean = 0.01, sd = 0.01),
  dim = c(trading_days, stocks)) %>%
  as.tibble()
fake_port %>% skim()
```
Describe the data.
The data is a sample of independent and identically distributed stock returns for 20 stocks over 40 trading days. The sample is drawn from a random normal distribution with mean 0.01 and standard deviation 0.01. This is the assumed data generating process of daily stock returns that the analyst has postulated.
question("What do you expect the correlation matrix of this portfolio to look like if the returns are drawn to be independent and identically distributed?",
  answer("I expect there to be no pairwise correlation as the data is random"),
  answer("I expect there to be some real pairwise correlation as the data is random"),
  answer("I expect there to be some spurious pairwise correlation as the data is random", correct = TRUE),
  answer("I expect there to be some real pairwise correlation as the data is nonrandom"),
  allow_retry = TRUE
)
Firstly, I will introduce the process of piping code in R. The point of the pipe is to help you write code in a way that is easier to read and understand. To see why the pipe is so useful, we're going to explore a number of ways of writing the same code. The pipe operator in R is `%>%` from the **magrittr** package. For more details see Hadley Wickham (2020), "R for Data Science", Chapter 18.
leave_house(
  get_dressed(
    get_out_of_bed(
      wake_up(me, time = "6:30"),
      side = "left"),
    trousers = TRUE, shirt = TRUE),
  car = FALSE, bike = TRUE, pandemic = FALSE)
me %>%
  wake_up(time = "6:30") %>%
  get_out_of_bed(side = "left") %>%
  get_dressed(trousers = TRUE, shirt = TRUE) %>%
  leave_house(car = FALSE, bike = TRUE, pandemic = FALSE)
So the piping operator allows the code to be more readable and logical.
Rearrange this code using piping
## Recode this using piping
summarise(group_by(mutate(fake_port, Type = "Fake"), Type), meanV1 = mean(V1))
## Recode this using piping
fake_port %>%
  mutate(Type = "Fake")
## Recode this using piping
fake_port %>%
  mutate(Type = "Fake") %>%
  group_by(Type)
## Recode this using piping
fake_port %>%
  mutate(Type = "Fake") %>%
  group_by(Type) %>%
  summarise(meanV1 = mean(V1))
Given the fake portfolio was created by drawing independent and identically distributed random normal observations, by definition there should be no true correlation between the fake stock returns.
Write some code to evaluate and visualise the correlation of the fake portfolio returns, which can be accessed in the object `fake_port`.
cor(fake_port)
cor(fake_port) %>% corrplot()
cor(fake_port) %>% corrplot(type="upper", method = "number", order="hclust", col=brewer.pal(n=8, name="RdYlBu"))
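Beyond visualising the matrix, the size of the spurious correlations can be quantified. A minimal sketch (it regenerates a fake portfolio with the same dimensions as Ex 1, so the exact numbers differ from yours): for i.i.d. returns the off-diagonal sample correlations are pure noise with typical magnitude of order $1/\sqrt{T}$, not zero.

```r
# Spurious correlation in i.i.d. data: the off-diagonal entries of the
# sample correlation matrix are noise on the scale of 1/sqrt(T).
set.seed(42)
fake_port <- matrix(rnorm(40 * 20, mean = 0.01, sd = 0.01), 40, 20)
C <- cor(fake_port)
off_diag <- C[lower.tri(C)]        # the 190 distinct pairwise correlations
mean(abs(off_diag))                # typical spurious correlation magnitude
1 / sqrt(nrow(fake_port))          # ~0.16, the theoretical noise scale
```

This is why a correlation plot of i.i.d. data is speckled with small but visible pairwise correlations: they are estimation noise, not signal.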
## `r fa("r-project")` Functions

R, at its heart, is a high-level functional programming (FP) language. This means that it provides many tools for the creation and manipulation of functions.
Write a function to add two numbers together then test the function with numbers 1 and 2
add_numbers <- function(a, b) { }
# Write a function to add two numbers together
add_numbers <- function(a, b) {
  a + b
}
add_numbers(1, 2)
Create a function in R for Marcenko-Pastur distribution estimates
The Marcenko-Pastur distribution can be defined as:
$$\rho\left(\lambda\right) = \begin{cases} \frac{T}{N}\frac{\sqrt{\left(\lambda_{+} - \lambda\right)\left(\lambda - \lambda_{-}\right)}}{2\pi\lambda\sigma^{2}}, & \text{if } \lambda \in [\lambda_{-},\lambda_{+}] \\ 0, & \text{if } \lambda \notin [\lambda_{-},\lambda_{+}] \end{cases}$$
where the maximum expected eigenvalue is $\lambda_{+}=\sigma^2(1+\sqrt{N/T})^2$ and the minimum expected eigenvalue is $\lambda_{-}=\sigma^2(1-\sqrt{N/T})^2$
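Plugging in the fake portfolio's dimensions gives concrete bounds. With $T = 40$ trading days, $N = 20$ stocks, and $\sigma^2 = 1$, we have $q = T/N = 2$ (so $N/T = 1/q$), and the bounds can be checked numerically:

```r
# Numeric check of the Marcenko-Pastur eigenvalue bounds for q = T/N = 2
var <- 1
q <- 40 / 20
lambda_minus <- var * (1 - sqrt(1 / q))^2   # minimum expected eigenvalue
lambda_plus  <- var * (1 + sqrt(1 / q))^2   # maximum expected eigenvalue
c(lambda_minus, lambda_plus)                # approx. 0.086 and 2.914
```

Any eigenvalue of the empirical correlation matrix falling materially above $\lambda_{+} \approx 2.91$ would indicate signal rather than noise.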
The following translates the above maths into R code.
mp_pdf <- function(var, t, m, pts) {
  q <- t / m                           # q = T/N
  eMin <- var * (1 - (1 / q)^.5)^2     # minimum expected eigenvalue
  eMax <- var * (1 + (1 / q)^.5)^2     # maximum expected eigenvalue
  eVal <- linspace(eMin, eMax, pts)    # matlab::linspace
  pd <- q / (2 * pi * var * eVal) * ((eMax - eVal) * (eVal - eMin))^.5
  pdf <- tibble(pd = pd, e = eVal)
  return(pdf)
}
Test the function by creating the Marcenko-Pastur distribution for the fake portfolio when the variance = 1.
mp_pdf <- function(var, t, m, pts) {
  q <- t / m                           # q = T/N
  eMin <- var * (1 - (1 / q)^.5)^2     # minimum expected eigenvalue
  eMax <- var * (1 + (1 / q)^.5)^2     # maximum expected eigenvalue
  eVal <- linspace(eMin, eMax, pts)    # matlab::linspace
  pd <- q / (2 * pi * var * eVal) * ((eMax - eVal) * (eVal - eMin))^.5
  pdf <- tibble(pd = pd, e = eVal)
  return(pdf)
}
mp <- mp_pdf(1, trading_days, stocks, stocks)
Research how the **ggplot2** package works and then attempt to plot the distribution created earlier.
mp %>% ggplot(aes(x=e,y=pd)) + geom_line()
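The payoff of this exercise is comparing the theoretical density with the data. A minimal sketch (it regenerates a fake portfolio with the Ex 1 dimensions in base R, so it is self-contained and your eigenvalues will differ slightly) checks the empirical eigenvalues against the Marcenko-Pastur support:

```r
# Compare empirical eigenvalues of the fake portfolio's correlation
# matrix with the Marcenko-Pastur upper bound lambda_plus.
set.seed(42)
fake_port <- matrix(rnorm(40 * 20, mean = 0.01, sd = 0.01), 40, 20)
e_vals <- eigen(cor(fake_port), symmetric = TRUE)$values
lambda_plus <- (1 + sqrt(20 / 40))^2   # ~2.91 for q = T/N = 2
range(e_vals)   # for pure noise, eigenvalues should sit near [lambda_-, lambda_+]
sum(e_vals)     # trace of a correlation matrix: exactly N = 20
```

For purely random data the eigenvalues cluster inside the Marcenko-Pastur support; in real portfolio data, eigenvalues well above $\lambda_{+}$ flag genuine common factors, which is the basis of denoising.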