invertSpectrogram: Invert spectrogram

View source: R/invertSpectrogram.R

invertSpectrogram    R Documentation



Description

Transforms a spectrogram into a time series with the inverse short-time Fourier transform (STFT). The difficulty is that an ordinary spectrogram preserves only the magnitude (modulus) of the complex STFT. The phase is lost, and without it the original audio cannot be reconstructed exactly. Several algorithms therefore "guess" a phase that yields audio whose magnitude spectrogram is very similar to the target spectrogram. This is useful for filtering operations that modify the magnitude spectrogram and then apply the inverse STFT, such as filtering in the spectrotemporal modulation domain.
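The loss of phase can be illustrated in a few lines of base R. This is a minimal sketch with a single FFT frame rather than a full STFT, and the test signal is an arbitrary choice made for illustration: two different waveforms share the same magnitude spectrum, so the magnitude alone cannot tell them apart.

```r
n = 512
time = 0:(n - 1)
x = sin(2 * pi * 20 * time / n) * exp(-time / 100)  # a decaying tone
mag = Mod(fft(x))                                    # keep the magnitude only

# A different signal with the same magnitude spectrum: all phases set to zero
y = Re(fft(mag + 0i, inverse = TRUE)) / n

same_magnitude = isTRUE(all.equal(Mod(fft(y)), mag))  # TRUE: spectra match
same_waveform = isTRUE(all.equal(x, y))               # FALSE: audio differs
```

Setting every phase to zero is the simplest possible guess, which is why better initial estimates and iterative refinement are needed in practice.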


Usage

invertSpectrogram(
  spec,
  samplingRate,
  windowLength,
  overlap,
  step = NULL,
  wn = "hanning",
  specType = c("abs", "log", "dB")[1],
  initialPhase = c("zero", "random", "spsi")[3],
  nIter = 50,
  normalize = TRUE,
  play = TRUE,
  verbose = FALSE,
  plotError = TRUE
)



Arguments

spec: the spectrogram to be transformed into a time series: a numeric matrix with frequency bins in rows and time frames in columns

samplingRate: sampling rate of the audio from which the spectrogram was computed, Hz

windowLength: length of the FFT window, ms

overlap: overlap between successive FFT frames, %

step: FFT step, ms; if specified, it overrides overlap (NB: because digital audio is sampled at discrete time intervals of 1/samplingRate, the actual step and thus the time stamps of STFT frames may be slightly different, e.g. 24.98866 instead of 25.0 ms)

wn: window type accepted by ftwindow, currently gaussian, hanning, hamming, bartlett, rectangular, blackman, flattop

specType: the scale of the target spectrogram: 'abs' = absolute, 'log' = log-transformed, 'dB' = in decibels

initialPhase: initial phase estimate: "zero" = set all phases to zero; "random" = Gaussian noise; "spsi" (default) = single-pass spectrogram inversion (Beauregard et al., 2015)

nIter: the number of iterations of the GL algorithm (Griffin & Lim, 1984); 0 = don't run

normalize: if TRUE, normalizes the output to range from -1 to +1

play: if TRUE, plays back the reconstructed audio

verbose: if TRUE, prints the estimated time left after every 10% of GL iterations

plotError: if TRUE, produces a scree plot of squared error over GL iterations (useful for choosing 'nIter')
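The note on the FFT step can be checked with simple arithmetic. The sketch below assumes a sampling rate of 44100 Hz, the usual relation between step and overlap, and rounding down to a whole number of samples. None of these is stated above; they were chosen because they reproduce the 24.98866 ms figure, and the package may round differently.

```r
samplingRate = 44100   # Hz (assumed for illustration)
windowLength = 50      # ms
overlap = 50           # %

# Requested step implied by the overlap: 25 ms
step = windowLength * (1 - overlap / 100)

# A step must span a whole number of samples: 1102.5 is rounded down to 1102
step_points = floor(step * samplingRate / 1000)

# The step actually used, converted back to ms
step_actual = step_points / samplingRate * 1000
print(round(step_actual, 5))   # 24.98866
```

The discrepancy disappears whenever the requested step is an exact multiple of the sampling interval, for example 25 ms at 16000 Hz (400 samples).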


Details

Algorithm: takes the spectrogram, makes an initial guess at the phase (zero, noise, or a more intelligent estimate from the SPSI algorithm), refines it over 'nIter' iterations of the GL algorithm, reconstructs the complex spectrogram using the best phase estimate, and performs the inverse STFT. The single-pass spectrogram inversion (SPSI) algorithm is implemented as described in Beauregard et al. (2015), following a Python reference implementation. The Griffin-Lim (GL) algorithm is based on Griffin & Lim (1984).
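The iterative part of this procedure can be sketched in base R. This is an illustration of the Griffin-Lim idea under simplifying assumptions, not the package's implementation: it starts from zero phase rather than SPSI, keeps the full two-sided FFT in each column instead of the half-spectrum, and the helper names (hann, stft, istft, griffinLim) are hypothetical.

```r
# Periodic Hann window of n points
hann = function(n) 0.5 - 0.5 * cos(2 * pi * (0:(n - 1)) / n)

# Forward STFT: complex matrix with FFT bins in rows and frames in columns
stft = function(x, win, step) {
  n = length(win)
  starts = seq(1, length(x) - n + 1, by = step)
  sapply(starts, function(i) fft(x[i:(i + n - 1)] * win))
}

# Inverse STFT: windowed overlap-add divided by the summed squared window
istft = function(S, win, step, len) {
  n = length(win)
  out = numeric(len)
  wsum = numeric(len)
  for (j in seq_len(ncol(S))) {
    idx = (j - 1) * step + seq_len(n)
    frame = Re(fft(S[, j], inverse = TRUE)) / n
    out[idx] = out[idx] + frame * win
    wsum[idx] = wsum[idx] + win^2
  }
  # The floor avoids dividing by ~0 at the edges, where window coverage is thin
  out / pmax(wsum, 1e-3)
}

griffinLim = function(mag, win, step, len, nIter = 50) {
  S = mag + 0i                          # initial guess: zero phase
  err = numeric(nIter)
  for (i in seq_len(nIter)) {
    x = istft(S, win, step, len)        # current audio estimate
    S_est = stft(x, win, step)          # STFT of that estimate
    err[i] = sum((Mod(S_est) - mag)^2)  # squared error vs the target magnitude
    S = mag * exp(1i * Arg(S_est))      # keep its phase, restore the magnitude
  }
  list(audio = istft(S, win, step, len), error = err)
}

# Toy example: a 440 Hz tone with rising amplitude, sampled at 8000 Hz
n = 256
step = 64
len = n + step * 40
time = (0:(len - 1)) / 8000
x = sin(2 * pi * 440 * time) * time / max(time)
win = hann(n)
mag = Mod(stft(x, win, step))           # the phase is discarded here
res = griffinLim(mag, win, step, len, nIter = 20)
```

Plotting res$error against the iteration number gives a scree plot of the kind that 'plotError' describes: the error typically drops quickly at first and then levels off, which helps in choosing the number of iterations.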


Value

Returns the reconstructed audio as a numeric vector.


References

  • Griffin, D., & Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2), 236-243.

  • Beauregard, G. T., Harish, M., & Wyse, L. (2015, July). Single pass spectrogram inversion. In 2015 IEEE International Conference on Digital Signal Processing (DSP) (pp. 427-431). IEEE.

See Also

spectrogram, filterSoundByMS


Examples

# Create a spectrogram
samplingRate = 16000
windowLength = 40
overlap = 75
wn = 'hanning'

s = soundgen(samplingRate = samplingRate, addSilence = 100)
spec = spectrogram(s, samplingRate = samplingRate,
  wn = wn, windowLength = windowLength, overlap = overlap,
  padWithSilence = FALSE, output = 'original')

# Invert the spectrogram, attempting to guess the phase
# Note that samplingRate, wn, windowLength, and overlap must be the same as
# in the original (ie you have to know how the spectrogram was created)
s_new = invertSpectrogram(spec, samplingRate = samplingRate,
  windowLength = windowLength, overlap = overlap, wn = wn,
  initialPhase = 'spsi', nIter = 10, specType = 'abs', play = FALSE)

## Not run: 
# Verify the quality of audio reconstruction
# playme(s, samplingRate); playme(s_new, samplingRate)
spectrogram(s, samplingRate, osc = TRUE)
spectrogram(s_new, samplingRate, osc = TRUE)

## End(Not run)

soundgen documentation built on Aug. 14, 2022, 5:05 p.m.