In rmaestre/variableStars: Efficient software to study patterns on variable stars.

knitr::opts_chunk$set(echo = TRUE)
library(variableStars)
library(data.table)
library(ggplot2)
library(microbenchmark)
library(plotly)
library(RColorBrewer)

Library Introduction

Talk about variable stars ... (tbd)

plot_spectrum_ggplot <- function(min, max, dt) {
  max_amplitude <- dt[which.max(dt$amplitude), ]
  # Plot
  p <- ggplot(aes(frequency, amplitude), data = dt) +
    geom_point(aes(alpha = 0.6)) +
    geom_point(aes(frequency, amplitude, colour = "#009988"),
               data = max_amplitude) +
    geom_line() +
    theme_bw() + ylab("Amplitude") + xlab("Frecuency") +
    ggtitle(paste(
      "First Frecuency (F:",
      round(max_amplitude$frequency, 4),
      ", A:",
      round(max_amplitude$amplitude, 4),
      ")",
      sep = ""
    )) +
    xlim(min, max) +
    theme(legend.position = "none")
  return(p)
}

Introduction

The Fourier transform (FT) decomposes a function of time (a signal) into the frequencies that make it up. In Astrophysics, specially in Pulsars study, this technique suposes the main tool to study it patterns, and therefore, clasiffy this kind of stars. The main formula of Discrete Fourier Transform is:

$$ F_n = \sum_{k=0}^{N-1} f_k \cdot e^{-2 \pi \cdot i \cdot n \cdot \frac{k}{N}}$$

Examples

1. Simple signal. Sin.

The first signal is the sin one. This signal is very clear and the period can be calculated visually.

x <- sin(seq(from = 0,
             to = 18.84,
             by = 0.01))
# Add secuential times
dt.test <-
  data.frame("time" = seq(from = 1, to = length(x)), "x" = x)
ggplot(aes(time, x), data = dt.test) +
  geom_point() +
  geom_line() +
  theme_bw()

We campute the FT, calculating the amplitudes.

# Compute DFT
dt.spectrum <- data.frame(calculateSpectrum(dt.test$time, dt.test$x))
# Get max amplitude
maxAmplitude <- dt.spectrum[which.max(dt.spectrum$amplitude),]
# Plot amplitudes
plot_spectrum_ggplot(0, 0.05, dt.spectrum)

Therefore, the period is $p=\frac{1}{F}$ = r 1/maxAmplitude$frequency

# Plot amplitudes
ggplot(aes(time, x), data = dt.test) +
  geom_point() +
  geom_line() +
  theme_bw() +
  geom_vline(xintercept = 1/maxAmplitude$frequency)

#inverseFt <- apodizationFt(frequences = dt.spectrum$frequency, filter = "uniform")

r <- process(dt.spectrum$frequency, dt.spectrum$amplitude, filter = "uniform", 
             gRegimen=0, numFrequencies=100,
             maxDnu=100, minDnu=0, dnuGuessError=T)

plot_periodicities(prepare_periodicities_dataset(r$fresAmps))

2. A noisy signal example

Another synthetic example, where a more noisy signal is provided is the next one:

## noisy signal with amplitude modulation
x <- seq(from = 0,
         to = 1,
         length.out = 200)
# original data
y_org <- (1 + sin(2 * 2 * pi * x)) * sin(20 * 2 * pi * x)
# overlay some noise
x <- y_org + rnorm(length(x), sd = 0.2)
# Add secuential times
dt.test <-
  data.frame("time" = seq(from = 1, to = length(x)), "x" = x)
ggplot(aes(time, x), data = dt.test) +
  geom_point() +
  geom_line() +
  theme_bw()

We use DFT to calculate the amplitude in each frecuency.

# Compute DFT
dt.spectrum <- data.frame(calculateSpectrum(dt.test$time, dt.test$x))
# Get max amplitude
maxAmplitude <- dt.spectrum[which.max(dt.spectrum$amplitude),]
# Plot amplitudes
plot_spectrum_ggplot(0, 0.5, dt.spectrum)

Therefore, the period is $p=\frac{1}{F}$ = r 1/maxAmplitude$frequency

3. Real data from a pulsar star

In this case, we use the photometry of a pulsar star, in which timestamp the magnitude of the pulsar is given. In this case, analize the patter in visually complex.

# Read pulsar data
dt.pulsar <- data.table(read.csv("../data/pulsar.tsv", sep = "\t"))
ggplot(aes(time, mmag), data = dt.pulsar[sample(nrow(dt.pulsar), 1000),]) +
  geom_point() +
  geom_line() +
  theme_bw()

We use DFT to calculate the amplitude in each frecuency.

# Calculate
dt.spectrum <-
  data.frame(calculateSpectrum(dt.pulsar$time, dt.pulsar$mmag))
# Get max amplitude
maxAmplitude <- dt.spectrum[which.max(dt.spectrum$amplitude),]
# Plot amplitudes
plot_spectrum_ggplot(20, 25, dt.spectrum)

Therefore, the period of this pulsar is $p=\frac{1}{F}$ = r 1/maxAmplitude$frequency.

r <- process(dt.spectrum$frequency, dt.spectrum$amplitude, filter = "uniform", 
             gRegimen=0, numFrequencies=30,
             maxDnu=100, minDnu=0, dnuGuessError=T)

plot_periodicities_ggplot(prepare_periodicities_dataset(r$fresAmps))

Benchmark

A benchmark is proposed to show the performance achieved by made all calculations witn C++ and Armadillo (RcppArmadillo). We achieve an average value of $310 \text{ ms}$ to compute a DFT on r nrow(dt.pulsar) time values.

m <-
  microbenchmark(dt <-
                   data.frame(calculateSpectrum(dt.pulsar$time, dt.pulsar$mmag)),
                 times = 50)
autoplot(m, log = F) +
  scale_x_discrete(labels = c("DFT on a Pulsar star data")) +
  xlab("")