dot-spectrogram: Spectrogram per sound

.spectrogramR Documentation

Spectrogram per sound

Description

Internal soundgen function called by spectrogram and analyze.

Usage

.spectrogram(
  audio,
  dynamicRange = 80,
  windowLength = 50,
  step = windowLength/2,
  overlap = NULL,
  specType = c("spectrum", "reassigned", "spectralDerivative")[1],
  rasterize = FALSE,
  wn = "gaussian",
  zp = 0,
  normalize = TRUE,
  smoothFreq = 0,
  smoothTime = 0,
  qTime = 0,
  percentNoise = 10,
  noiseReduction = 0,
  output = c("original", "processed", "complex", "all")[1],
  plot = TRUE,
  osc = c("none", "linear", "dB")[2],
  heights = c(3, 1),
  ylim = NULL,
  yScale = "linear",
  contrast = 0.2,
  brightness = 0,
  blur = 0,
  maxPoints = c(1e+05, 5e+05),
  padWithSilence = TRUE,
  colorTheme = c("bw", "seewave", "heat.colors", "...")[1],
  col = NULL,
  extraContour = NULL,
  xlab = NULL,
  ylab = NULL,
  xaxp = NULL,
  mar = c(5.1, 4.1, 4.1, 2),
  main = NULL,
  grid = NULL,
  width = 900,
  height = 500,
  units = "px",
  res = NA,
  internal = NULL,
  ...
)

Arguments

audio

a list returned by readAudio

dynamicRange

dynamic range, dB. All values more than one dynamicRange under maximum are treated as zero

windowLength

length of FFT window, ms

step

you can override overlap by specifying FFT step, ms (NB: because digital audio is sampled at discrete time intervals of 1/samplingRate, the actual step and thus the time stamps of STFT frames may be slightly different, eg 24.98866 instead of 25.0 ms)

overlap

overlap between successive FFT frames, %

specType

plot the original FFT ('spectrum'), reassigned spectrogram ('reassigned'), or spectral derivative ('spectralDerivative')

rasterize

(only applies if specType = 'reassigned') if TRUE, the reassigned spectrogram is plotted after rasterizing it: that is, showing density per time-frequency bins with the same resolution as an ordinary spectrogram

wn

window type accepted by ftwindow, currently gaussian, hanning, hamming, bartlett, rectangular, blackman, flattop

zp

window length after zero padding, points

normalize

if TRUE, scales input prior to FFT

smoothFreq, smoothTime

length of the window for median smoothing in frequency and time domains, respectively, points

qTime

the quantile to be subtracted for each frequency bin. For ex., if qTime = 0.5, the median of each frequency bin (over the entire sound duration) will be calculated and subtracted from each frame (see examples)

percentNoise

percentage of frames (0 to 100%) used for calculating noise spectrum

noiseReduction

how much noise to remove (non-negative number, recommended 0 to 2). 0 = no noise reduction, 2 = strong noise reduction: spectrum - (noiseReduction * noiseSpectrum), where noiseSpectrum is the average spectrum of frames with entropy exceeding the quantile set by percentNoise

output

specifies what to return: nothing ('none'), unmodified spectrogram ('original'), denoised and/or smoothed spectrogram ('processed'), or unmodified spectrogram with the imaginary part giving phase ('complex')

plot

should a spectrogram be plotted? TRUE / FALSE

osc

"none" = no oscillogram; "linear" = on the original scale; "dB" = in decibels

heights

a vector of length two specifying the relative height of the spectrogram and the oscillogram (including time axes labels)

ylim

frequency range to plot, kHz (defaults to 0 to Nyquist frequency). NB: still in kHz, even if yScale = bark, mel, or ERB

yScale

scale of the frequency axis: 'linear' = linear, 'log' = logarithmic (musical), 'bark' = bark with hz2bark, 'mel' = mel with hz2mel, 'ERB' = Equivalent Rectangular Bandwidths with HzToERB

contrast

spectrum is exponentiated by contrast (any real number, recommended -1 to +1). Contrast >0 increases sharpness, <0 decreases sharpness

brightness

how much to "lighten" the image (>0 = lighter, <0 = darker)

blur

apply a Gaussian filter to blur or sharpen the image, two numbers: frequency (Hz), time (ms). A single number is interpreted as frequency, and a square filter is applied. NA / NULL / 0 means to blurring in that dimension. Negative numbers mean un-blurring (sharpening) the image by dividing instead of multiplying by the filter during convolution

maxPoints

the maximum number of "pixels" in the oscillogram (if any) and spectrogram; good for quickly plotting long audio files; defaults to c(1e5, 5e5)

padWithSilence

if TRUE, pads the sound with just enough silence to resolve the edges properly (only the original region is plotted, so the apparent duration doesn't change)

colorTheme

black and white ('bw'), as in seewave package ('seewave'), or any palette from palette such as 'heat.colors', 'cm.colors', etc

col

actual colors, eg rev(rainbow(100)) - see ?hcl.colors for colors in base R (overrides colorTheme)

extraContour

a vector of arbitrary length scaled in Hz (regardless of yScale!) that will be plotted over the spectrogram (eg pitch contour); can also be a list with extra graphical parameters such as lwd, col, etc. (see examples)

xlab, ylab, main, mar, xaxp

graphical parameters for plotting

grid

if numeric, adds n = grid dotted lines per kHz

width, height, units, res

graphical parameters for saving plots passed to png

internal

a long list of stuff for plotting pitch contours passed by analyze()

...

other graphical parameters


soundgen documentation built on Sept. 29, 2023, 5:09 p.m.