ssm | R Documentation |
Calculates the self-similarity matrix and novelty vector of a sound.
ssm(
x,
samplingRate = NULL,
from = NULL,
to = NULL,
windowLength = 25,
step = 5,
overlap = NULL,
ssmWin = NULL,
sparse = FALSE,
maxFreq = NULL,
nBands = NULL,
MFCC = 2:13,
input = c("mfcc", "melspec", "spectrum")[2],
norm = FALSE,
simil = c("cosine", "cor")[1],
kernelLen = 100,
kernelSD = 0.5,
padWith = 0,
summaryFun = c("mean", "sd"),
reportEvery = NULL,
cores = 1,
plot = TRUE,
savePlots = NULL,
main = NULL,
heights = c(2, 1),
width = 900,
height = 500,
units = "px",
res = NA,
specPars = list(levels = seq(0, 1, length = 30), colorTheme = c("bw", "seewave",
"heat.colors", "...")[2], xlab = "Time, s", ylab = "kHz"),
ssmPars = list(levels = seq(0, 1, length = 30), colorTheme = c("bw", "seewave",
"heat.colors", "...")[2], xlab = "Time, s", ylab = "Time, s"),
noveltyPars = list(type = "b", pch = 16, col = "black", lwd = 3)
)
x |
path to a folder, one or more wav or mp3 files c('file1.wav', 'file2.mp3'), Wave object, numeric vector, or a list of Wave objects or numeric vectors |
samplingRate |
sampling rate of x (only needed if x is a numeric vector) |
from , to |
if NULL (default), analyzes the whole sound, otherwise from...to (s) |
windowLength |
length of FFT window, ms |
step |
you can override overlap by specifying FFT step, ms |
overlap |
overlap between successive FFT frames, % |
ssmWin |
window for averaging SSM, ms (has a smoothing effect and speeds up the processing) |
sparse |
if TRUE, the entire SSM is not calculated, but only the central region needed to extract the novelty contour (speeds up the processing) |
maxFreq |
highest band edge of mel filters, Hz. Defaults to samplingRate / 2 |
nBands |
number of warped spectral bands to use. Defaults to 100 * windowLength / 20 |
MFCC |
which mel-frequency cepstral coefficients to use; defaults to 2:13 |
input |
the spectral representation used to calculate the SSM |
norm |
if TRUE, the spectrum of each STFT frame is normalized |
simil |
method for comparing frames: "cosine" = cosine similarity, "cor" = Pearson's correlation |
kernelLen |
length of checkerboard kernel for calculating novelty, ms (larger values favor global, slow vs. local, fast novelty) |
kernelSD |
SD of checkerboard kernel for calculating novelty |
padWith |
how to treat edges when calculating novelty: NA = treat sound before and after the recording as unknown, 0 = treat it as silence |
summaryFun |
functions used to summarize each acoustic characteristic, eg "c('mean', 'sd')"; user-defined functions are fine (see examples); NAs are omitted automatically for mean/median/sd/min/max/range/sum, otherwise take care of NAs yourself |
reportEvery |
when processing multiple inputs, report estimated time left every ... iterations (NULL = default, NA = don't report) |
cores |
number of cores for parallel processing |
plot |
if TRUE, plots the SSM |
savePlots |
full path to the folder in which to save the plots (NULL = don't save, '' = same folder as audio) |
main |
plot title |
heights |
relative sizes of the SSM and spectrogram/novelty plot |
width , height , units , res |
graphical parameters for saving plots passed to png |
specPars |
graphical parameters passed to |
ssmPars |
graphical parameters passed to |
noveltyPars |
graphical parameters passed to
|
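To illustrate what the simil argument chooses between, here is a minimal base-R sketch of a self-similarity matrix built from a toy feature matrix. It is not the internal code of ssm: the feature matrix feat and the helpers cosineSSM and corSSM are hypothetical names introduced only for this illustration, and frames are assumed to be stored as columns.

```r
# Toy feature matrix: rows = features (e.g. spectral bands), columns = frames.
# Hypothetical data, not produced by ssm().
set.seed(1)
feat = cbind(
  matrix(rnorm(5 * 10, mean = 1), nrow = 5),   # 10 frames of "sound A"
  matrix(rnorm(5 * 10, mean = -1), nrow = 5)   # 10 frames of "sound B"
)

# Cosine similarity between every pair of frames (hypothetical helper)
cosineSSM = function(m) {
  norms = sqrt(colSums(m^2))          # Euclidean norm of each frame
  crossprod(m) / outer(norms, norms)  # dot products scaled by the norms
}

# Pearson's correlation between every pair of frames (hypothetical helper)
corSSM = function(m) cor(m)

s_cos = cosineSSM(feat)  # 20 x 20, ones on the diagonal
s_cor = corSSM(feat)     # 20 x 20, ones on the diagonal
```

The two measures differ only in centering: Pearson's correlation is the cosine similarity of frames after subtracting each frame's mean, so cosineSSM(scale(feat, scale = FALSE)) reproduces corSSM(feat).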
Returns a list of two components: $ssm contains the self-similarity matrix, and $novelty contains the novelty vector.
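The novelty vector can be understood as the result of sliding a Gaussian-tapered checkerboard kernel along the main diagonal of the SSM, as described by Foote (2000). The base-R sketch below follows that idea under simplifying assumptions: checkerKernel and noveltySketch are hypothetical helpers, not functions from the package; the kernel size n is given in frames rather than ms; and NA padding is handled by simply skipping unknown cells, which may differ from what ssm does.

```r
# Gaussian-tapered checkerboard kernel of size n x n, n even (hypothetical helper)
checkerKernel = function(n, sd = 0.5) {
  half = n / 2
  signs = c(rep(-1, half), rep(1, half))
  checker = outer(signs, signs)   # +1 in same-side quadrants, -1 in cross quadrants
  pos = seq(-1, 1, length.out = n)
  taper = outer(dnorm(pos, sd = sd), dnorm(pos, sd = sd))
  checker * taper / sum(taper)
}

# Slide the kernel along the main diagonal of a similarity matrix
noveltySketch = function(simMat, n = 4, sd = 0.5, padWith = 0) {
  half = n / 2
  k = checkerKernel(n, sd)
  nFrames = nrow(simMat)
  # Pad the matrix so that the kernel also fits at the edges
  padded = matrix(padWith, nFrames + n, nFrames + n)
  idx = (half + 1):(half + nFrames)
  padded[idx, idx] = simMat
  sapply(seq_len(nFrames), function(i) {
    win = padded[i:(i + n - 1), i:(i + n - 1)]
    sum(win * k, na.rm = TRUE)    # assumption: NA cells are skipped
  })
}

# Toy SSM: two internally homogeneous blocks of 10 frames each
toy = matrix(0, 20, 20)
toy[1:10, 1:10] = 1
toy[11:20, 11:20] = 1
nov = noveltySketch(toy, n = 4, sd = 0.5, padWith = 0)
```

In this toy case the novelty is close to zero inside each block and reaches its maximum at frame 11, where the two blocks meet. With padWith = 0 the padding counts as silence, so the first frame also receives some novelty.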
El Badawy, D., Marmaroli, P., & Lissek, H. (2013). Audio Novelty-Based Segmentation of Music Concerts. In Acoustics 2013 (No. EPFL-CONF-190844)
Foote, J. (1999, October). Visualizing music and audio using self-similarity. In Proceedings of the seventh ACM international conference on Multimedia (Part 1) (pp. 77-80). ACM.
Foote, J. (2000). Automatic audio segmentation using a measure of audio novelty. In Multimedia and Expo, 2000. ICME 2000. 2000 IEEE International Conference on (Vol. 1, pp. 452-455). IEEE.
spectrogram
modulationSpectrum
segment
sound = c(soundgen(),
soundgen(nSyl = 4, sylLen = 50, pauseLen = 70,
formants = NA, pitch = c(500, 330)))
# playme(sound)
# detailed, local features (captures each syllable)
s1 = ssm(sound, samplingRate = 16000, kernelLen = 100,
sparse = TRUE) # much faster with 'sparse'
# more global features (captures the transition b/w the two sounds)
s2 = ssm(sound, samplingRate = 16000, kernelLen = 400, sparse = TRUE)
s2$summary
s2$novelty # novelty contour
## Not run:
ssm(sound, samplingRate = 16000,
input = 'mfcc', simil = 'cor', norm = TRUE,
ssmWin = 25, # speed up the processing
kernelLen = 300, # global features
specPars = list(colorTheme = 'seewave'),
ssmPars = list(col = rainbow(100)),
noveltyPars = list(type = 'l', lty = 3, lwd = 2))
## End(Not run)