knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
This vignette documents the use of the Morlet wavelet decomposition/recomposition functions from myUtils
for future reference. I wrote functions for performing wavelet analysis based on the Morlet wavelet because I found no package that allowed to easily filter out unwanted scales and reconstruct timeseries corresponding to scales of interest only.
The code was written based on the paper Torrence and Campo, 1998 and its implementation proposed in a Stack Overflow post.
This vignette estimates time series of amplitudes and timings of periodicities of interest using wavelet decomposition and reconstruction.
library(myUtils)
First, let's first define an toy periodic time series.
# example of monthly time series with biyearly, yearly and biennial periodicities y <- 0.3*cos(2*pi/6*1:100) + # biyearly 0.5*cos(pi/4+2*pi/12*1:100) + # yearly 0.2*cos(pi/4+2*pi/24*1:100) # biennial # add some noise y <- y + 0.1*rnorm(100) # define corresponding time vector t <- seq.Date(from = as.Date('2000-01-01'), by = "1 month", length.out = length(y)) # visualize plot(x = t,y = y,type='l')
The wavelet transform (Equation 2 in Torrence and Campo, 1998) is defined in the function myUtils::wt
. It outputs a matrix, where the rows correspond to time points and the columns correspond to periodicities. If the vector of time points vect_time
and the time resolution dt
are provided as arguments to the function, corresponding values of time and periodicities are assigned to column and row names in the output matrix.
mat_wt <- wt(y = y,vect_time = t,dt=1/12) str(mat_wt)
In the output, each element corresponds to the wavelet coefficient for the time point vect_time[i]
, and the scale s defined as dt*2*2^(j*.125)
.
When vect_time
and dt
are provided as arguments to the function wt
, plot_re
makes it easy to visualize the real part of the wavelet decomposition as a function of time and scale.
library(ggplot2) print(plot_re(mat_wt) + geom_hline(aes(yintercept=c(0.4,0.6, 0.8,1.2, 1.7,2.3))) + scale_y_continuous(trans='log10', breaks = c(0.5,1,2), limits = c(2/12,3)))
Positive values (in red) indicate that the time series and the Morlet wavelet are in phase at the time and scale of interest, negative values (in blue) rather indicate that they are in phase opposition.
Here, the wavelet decomposition correctly identifies the 3 main periodicities in the time series, as well as the corresponding strengths.
df_pow <- mat_2_df(calc_power(mat_wt)) head(df_pow) print(ggplot() + geom_tile(data = df_pow, mapping = aes(x=date,y=period,fill=value)) + geom_hline(aes(yintercept=c(0.4,0.6, 0.8,1.2, 1.7,2.3))) + scale_y_continuous(trans='log10', breaks = c(0.5,1,2), limits = c(0.2,4)) + scale_fill_distiller(palette = "Spectral") + theme_bw() + ggtitle('power spectrum'))
Before filtering out specific periodicities from the original time series, we first want to make sure that the time series reconstructed using all frequencies is equal to the original time series, so that the deconstruction reconstruction combine to the identity function.
The inverse wavelet transform is defined in the function wt_inverse
. wt_inverse
provides the option to specify the mean of the original time series y.m
: this mean value will be added to the output of the centered wavelet reconstruction.
rec_wt <- wt_inverse(mat_wt = mat_wt,y.m = mean(y)) plot(x = t,y = rec_wt,type='l',ylab='y') lines(x = t,y = y,col='red',lty=2) legend('topright', legend=c("original", "reconstructed"), col=c("black", "red"), lty=1:2)
The original time series (in black) and the reconstructed time series (in red) align perfectly.
For example, let's consider that we are interested in the studying the time series with a 1-year periodicity.
The function wt_extract
extracts wavelet coefficients within range of scales provided by range_scale
. The output is a matrix with the same dimension as provided in the argument mat_wt
, but where wavelet coefficients outside of the provided scale range are set to 0.
Here, we extract coefficients in the period range from 0.8 to 1.2 years, which contains the 1-year periodicity signal according to the Figure showing wavelet coefficients.
The one-year periodicity reconstruction is obtained by applying the inverse wavelet transform function to the matrix containing selected coefficients.
mat_wt_1y <- wt_extract(mat_wt = mat_wt, range_scale = c(0.8,1.2)) rec_wt_1y <- wt_inverse(mat_wt = mat_wt_1y,y.m = mean(y)) # plot reconstructed time series plot(x = t,y = rec_wt,type='l',ylab='y') lines(x = t,y = y,col='red',lty=2) lines(x = t,y = rec_wt_1y,col='green') legend('topright', legend=c("original", "reconstructed","1 year"), col=c("black", "red","green"), lty=c(1,2,1))
rec_wt_6m <- wt_inverse(mat_wt = wt_extract(mat_wt = mat_wt, range_scale = c(0.4,0.6)), y.m = mean(y)) rec_wt_1y <- wt_inverse(mat_wt = wt_extract(mat_wt = mat_wt, range_scale = c(0.8,1.2)), y.m = mean(y)) rec_wt_2y <- wt_inverse(mat_wt = wt_extract(mat_wt = mat_wt, range_scale = c(1.7,2.3)), y.m = mean(y)) df_all <- rbind(data.frame(t=t,y=y,periodicity='original'), data.frame(t=t,y=rec_wt_6m,periodicity='6m'), data.frame(t=t,y=rec_wt_1y,periodicity='1y'), data.frame(t=t,y=rec_wt_2y,periodicity='2y')) ggplot(data = df_all,aes(x=t,y=y,col=periodicity)) + geom_line() + theme_bw()
The reconstruction captures correctly that the main component in the data is the yearly component, then the biyearly and the biennal.
The final goal of the study is to extract amplitudes and timings for specific periodicities and for different times in the study period.
date_start <- as.Date('2000-01-01') date_end <- as.Date('2008-01-01') # biyearly decomposition ts_pow_6m <- ts_power(mat_wt = mat_wt, dates_seq = seq.Date(from = date_start, to = date_end, by = "6 months"), range_scale = c(0.4,0.6)) # yearly decomposition ts_pow_1y <- ts_power(mat_wt = mat_wt, dates_seq = seq.Date(from = date_start, to = date_end, by = "1 year"), range_scale = c(0.8,1.2)) # biennial decomposition ts_pow_2y <- ts_power(mat_wt = mat_wt, dates_seq = seq.Date(from = date_start, to = date_end, by = "1 year"), range_scale = c(1.5,3)) # plot all in same figure ts_all <- rbind(data.frame(ts_pow_6m,periodicity="6m"), data.frame(ts_pow_1y,periodicity="1y"), data.frame(ts_pow_2y,periodicity="2y")) ggplot(data = ts_all, mapping = aes(x = date_start,y=power,col=periodicity)) + geom_line() + geom_point() + labs(y = "normalized power") + theme_bw()
Again, the power analysis correctly spotted that the strongest periodcity is the 1 year periodicity, followed by the 6 months periodity, and followed by the 2 year periodicity.
The problem with averaging the power over a scale band is that the result is highly sensitive to the width of the band. Therefore, I prefer to extract the time series of amplitudes from the reconstructed time series.
Second, extracting phases for a given periodicity consists in applying the calc_phase
function on selected coefficients, and calculating the mean for each time point over scales within the specified periodicity range.
phase_1y <- rowMeans(x = calc_phase_angle(wt_extract(mat_wt = mat_wt, range_scale = c(0.9,1.2), fill = NA)),na.rm = T) plot(x = 1:length(t),y = phase_1y,type='l')
Then, in order to identify the timing of the maximum amplitude, the function find_zeros_lin
can be applied in order to approximate time points when the phase angle is zero. The option bool_dec=F
specifies to not look for zeros at times when the function is decreasing (these correspond to minima of in the reconstructed time series).
# extract timings with phases 0 t_max <- find_zeros_lin(x = 1:length(t),f_x = phase_1y,bool_dec = F) dates_max <- as.Date(approx(x = 1:length(t), y = as.Date(rownames(mat_wt)), xout = t_max)$y, origin = as.Date("1970-01-01")) plot(x = as.Date(rownames(mat_wt)),y = phase_1y, type='l',xlab='date',ylab='phase') points(x = dates_max, y = rep(0,length(dates_max)), col = 'red') # check that they correspond to timings of maximum amplitudes plot(x = as.Date(rownames(mat_wt)), y = rec_wt_1y, type='l',xlab='date',ylab='1 year reconstruction') abline(v=dates_max,col="red")
Keep it simple : as a first approximation, extract maximum values around maximum timing
val_max <- rep(NA,length(dates_max)) for(i in 1:length(dates_max)){ #vector of times idx_change <- which( sign(t - dates_max[i]) == 1)[1] val_max[i] <- max(rec_wt_1y[idx_change-1],rec_wt_1y[idx_change]) } plot(x = as.Date(rownames(mat_wt)), y = rec_wt_1y, type='l',xlab='date',ylab='1 year reconstruction') points(x=dates_max,y=val_max,col="red")
The amplitude can then be calculated by substracting the mean of y
to val_max
(in this example the mean is zero).
This vignette shows how to extract time series of timing and amplitudes corresponding to specific periodicities of interest based on wavelet decomposition.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.